# Findings & Research Documentation
**Last Updated:** 2026-02-15T00:00:00.000Z
**JIRA:** RECIPE-0001
**Status:** Initialized
---
## Purpose
This document tracks research findings, analysis results, and technical discoveries made during development. Each agent (Planner, Developer, Reviewer) appends findings as they work through the pipeline.
---
## Initial Codebase Analysis
### Language & Framework
- **Primary Language**: TypeScript 5.9.3
- **Framework**: SvelteKit 2.48.5 with Svelte 5.43.8
- **Runtime**: Node.js 22+
- **Package Manager**: npm
### Project Type
Progressive Web Application (PWA) for extracting recipes from Instagram posts and uploading them to Tandoor Recipe Manager.
### Architecture Style
**Hexagonal Architecture** (Ports and Adapters):
- Domain logic in `src/lib/server/`
- External system adapters: Instagram, Tandoor, LLM, Browser
- Clear separation between client and server code
### Key Technical Components
1. **Queue Management System**: In-memory FIFO queue with async processing
2. **Three-Phase Pipeline**: Extraction → Parsing → Uploading
3. **Real-Time Updates**: Server-Sent Events (SSE) for progress tracking
4. **Push Notifications**: Web Push API for background notifications
5. **PWA Features**: Service worker, manifest, install prompts
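The queue and its subscription system (items 1 and 3 above) can be pictured with a minimal sketch. The class name `InMemoryQueue`, the `QueueItem` shape, and the status values are hypothetical and chosen for illustration only; the project's actual `QueueManager` may be structured differently.

```typescript
// Hypothetical sketch: an in-memory FIFO queue that notifies subscribers.
// None of these names are taken from the project's real QueueManager.
type QueueStatus = "pending" | "processing" | "done";

interface QueueItem {
  id: string;
  url: string;
  status: QueueStatus;
}

type Listener = (item: QueueItem) => void;

class InMemoryQueue {
  private items: QueueItem[] = [];
  private listeners = new Set<Listener>();
  private nextId = 1;

  /** Register a listener; the returned function removes it again. */
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  /** Append to the tail of the queue and notify subscribers. */
  enqueue(url: string): QueueItem {
    const item: QueueItem = { id: String(this.nextId++), url, status: "pending" };
    this.items.push(item);
    this.notify(item);
    return item;
  }

  /** Hand out the oldest pending item (FIFO) and mark it as processing. */
  takeNext(): QueueItem | undefined {
    const item = this.items.find((candidate) => candidate.status === "pending");
    if (item) {
      item.status = "processing";
      this.notify(item);
    }
    return item;
  }

  /** Send each listener a copy so subscribers cannot mutate queue state. */
  private notify(item: QueueItem): void {
    this.listeners.forEach((listener) => listener({ ...item }));
  }
}
```

An SSE endpoint could subscribe in the same way and forward each notification to the client. Because all state lives in the `items` array, everything is lost when the process restarts.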
### Design Patterns Identified
- **Singleton**: QueueManager, QueueProcessor, PushNotificationService
- **Factory**: createLLM(), createBrowserContext(), initializeBrowser()
- **Observer**: Queue subscription system, SSE streaming
- **Adapter**: Instagram, Tandoor, LLM, Browser adapters
- **Strategy**: Multiple extraction methods with fallback
### Dependencies Overview
**Production** (6 dependencies):
- Browser automation: `playwright`
- LLM integration: `openai`
- Utilities: `uuid`, `date-fns`, `zod`
**Development** (26+ dependencies):
- Framework: `@sveltejs/kit`, `svelte`, `vite`
- Testing: `vitest`, `@vitest/browser-playwright`
- Styling: `tailwindcss`
- Tooling: `typescript`, `eslint`, `prettier`
### File Structure
```
52 total TypeScript/JavaScript files
├── 39 TypeScript files (.ts)
├── 10+ Svelte components (.svelte)
├── 3 JavaScript config files (.js)
└── Multiple test files (.spec.ts)
```
### Code Quality Indicators
- **Strict TypeScript**: `strict: true` enabled
- **Comprehensive Testing**: 138 tests across unit, integration, and browser tests
- **Linting**: ESLint with TypeScript and Svelte plugins
- **Formatting**: Prettier with Svelte and Tailwind plugins
- **Type Safety**: Zod schemas for runtime validation
### Environment Configuration
Environment variables (optional ones and defaults are noted):
- `OPENAI_API_KEY` - LLM access
- `TANDOOR_URL` - Recipe manager URL (optional)
- `TANDOOR_TOKEN` - API authentication (optional)
- `QUEUE_CONCURRENCY` - Processing limit (default: 2)
- `QUEUE_MAX_RETRIES` - Retry attempts (default: 3)
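As an illustration of how the defaults above could be applied, the following sketch parses the queue settings from a plain environment object. The function names and the `tandoorEnabled` flag are assumptions made for this example. In the app itself the values would come from `$env/dynamic/private` rather than from a passed-in object.

```typescript
// Hypothetical helper names; defaults (2 and 3) are the ones listed above.
interface QueueConfig {
  concurrency: number;
  maxRetries: number;
  tandoorEnabled: boolean;
}

type Env = Record<string, string | undefined>;

/** Parse a whole number >= min; use the fallback for missing or invalid input. */
function parseIntAtLeast(raw: string | undefined, min: number, fallback: number): number {
  if (raw === undefined || !/^\d+$/.test(raw.trim())) return fallback;
  const value = Number(raw.trim());
  return value >= min ? value : fallback;
}

function readQueueConfig(env: Env): QueueConfig {
  return {
    // At least one worker is needed, so 0 falls back to the default.
    concurrency: parseIntAtLeast(env.QUEUE_CONCURRENCY, 1, 2),
    // Zero retries is a valid choice.
    maxRetries: parseIntAtLeast(env.QUEUE_MAX_RETRIES, 0, 3),
    // Tandoor upload is optional and needs both URL and token.
    tandoorEnabled: Boolean(env.TANDOOR_URL && env.TANDOOR_TOKEN),
  };
}
```

Invalid values fall back to the defaults here instead of throwing. Whether the real configuration code is that lenient is not stated in this document.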
### Deployment Setup
- **Docker**: Dockerfile with Node.js 22 Alpine + Chromium
- **HTTPS**: Local SSL certificates for PWA features
- **Production**: Node.js adapter for SvelteKit
### Notable Features
1. **Multi-Method Extraction**: 4-strategy cascade with intelligent fallback
2. **Progress Tracking**: Real-time callbacks throughout extraction pipeline
3. **Thumbnail Validation**: HTTP status code checking for image URLs
4. **Retry Logic**: Configurable retry attempts for failed extractions
5. **Scheduler**: Background task execution with authentication
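The cascade with fallback (item 1) and the progress callbacks (item 2) could look roughly like the sketch below. The `Extractor` interface and `extractWithFallback` are hypothetical names. The four concrete strategies are not described in this document, so the sketch works with any ordered list of extractors.

```typescript
// Hypothetical sketch of a strategy cascade; not the project's actual extractor code.
interface Extractor {
  name: string;
  /** Resolves to the extracted text, or null when nothing was found. */
  extract: (url: string) => Promise<string | null>;
}

interface ExtractionResult {
  method: string;
  content: string;
}

async function extractWithFallback(
  url: string,
  extractors: Extractor[],
  onProgress?: (message: string) => void
): Promise<ExtractionResult> {
  const failures: string[] = [];
  for (const extractor of extractors) {
    onProgress?.(`Trying ${extractor.name}`);
    try {
      const content = await extractor.extract(url);
      if (content) return { method: extractor.name, content };
      failures.push(`${extractor.name}: empty result`);
    } catch (err) {
      // A failing strategy must not abort the cascade; record it and move on.
      failures.push(`${extractor.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All extraction methods failed (${failures.join("; ")})`);
}
```

Strategies are tried strictly in order, so cheaper or more reliable methods would go first in the list.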
---
## Technical Debt & Opportunities
### Identified Issues
1. **Deprecated Endpoints**: `/api/extract` returns 410 Gone (migration helper)
2. **In-Memory Queue**: No persistence - items lost on server restart
3. **Single Instance**: Queue state not shared across multiple server instances
### Potential Improvements
1. **Queue Persistence**: Redis or database-backed queue for durability
2. **Horizontal Scaling**: Shared queue state for multi-instance deployments
3. **Rate Limiting**: Instagram request throttling to avoid blocks
4. **Caching**: Extracted content caching to reduce redundant processing
---
## Research Findings
*This section will be populated by the Planner agent during task analysis.*
### [Planner] Research Notes - RECIPE-0001 (2026-02-15)
**Task:** Fix model loading issue and frontend error display
#### Issue 1: Model Loading - "400 No models loaded"
**Research Date:** 2026-02-15
**Source:** Stack trace analysis, OpenAI SDK documentation, LM Studio/LiteLLM API patterns
**Problem Analysis:**
- Error occurs at `detectRecipe()` in [src/lib/server/parser.ts](src/lib/server/parser.ts#L30)
- OpenAI-compatible APIs (LM Studio, LiteLLM, Ollama, etc.) often require models to be explicitly loaded
- Current implementation assumes model is already loaded
- Error message contains provider-specific instructions ("use the 'lms load' command")
**OpenAI-Compatible Model Loading Patterns:**
1. **LM Studio**: Uses `/v1/models` endpoint to list available models
- Loaded models appear in response with `"id": "model-name"`
- No programmatic loading endpoint (manual load in UI)
2. **LiteLLM**: Uses `/v1/models` to list loaded models
- Models must be configured in server startup
- No dynamic loading endpoint
3. **Ollama**: Uses `/api/tags` to list local models and `/api/pull` to download them
- Native API uses a different structure (`/api` prefix rather than `/v1`)
4. **Generic OpenAI-compatible**: Most follow OpenAI's `/v1/models` endpoint
- No standard for dynamic model loading
- Usually require pre-configuration
**Solution Approach:**
- Check if model exists via `client.models.list()`
- If model not found/loaded, provide clear user-facing error
- Remove provider-specific error messages
- Add notification when model check succeeds
- Consider future enhancement: detect provider type and attempt auto-load if supported
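A minimal sketch of the availability check described above, assuming a small `ModelLister` interface as a stand-in for the part of `client.models.list()` that is needed (a list response exposing `data` entries with an `id`). The function name, result type, and message wording are hypothetical.

```typescript
// Hypothetical stand-in for the subset of the SDK's models API used here.
interface ModelLister {
  list(): Promise<{ data: Array<{ id: string }> }>;
}

type ModelCheck =
  | { ok: true; model: string }
  | { ok: false; model: string; message: string };

async function checkModelAvailable(models: ModelLister, model: string): Promise<ModelCheck> {
  try {
    const page = await models.list();
    const found = page.data.some((entry) => entry.id === model);
    if (found) {
      console.log(`[LLM] Model "${model}" is available`);
      return { ok: true, model };
    }
    // Provider-neutral wording: no LM Studio specific instructions.
    return {
      ok: false,
      model,
      message: `The configured model "${model}" is not loaded on the LLM server.`,
    };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    console.error(`[LLM] Model check failed: ${reason}`);
    return {
      ok: false,
      model,
      message: "Could not reach the LLM server to check available models.",
    };
  }
}
```

Returning a result object instead of throwing lets the caller decide whether to fail the queue item or only show a notification.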
**Files Affected:**
- [src/lib/server/llm.ts](src/lib/server/llm.ts) - Add model availability check
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Handle model not loaded error
- [src/lib/server/queue/QueueProcessor.ts](src/lib/server/queue/QueueProcessor.ts) - User notification
---
#### Issue 2: Frontend Error Display - "[object Object]"
**Research Date:** 2026-02-15
**Source:** Code analysis of QueueItemCard.svelte, types.ts, QueueManager.ts
**Problem Analysis:**
- Error structure is an object: `{ phase, message, recoverable, timestamp }`
- Frontend displays `{item.error}` directly (line 205 of QueueItemCard.svelte)
- Svelte renders object.toString() → "[object Object]"
**Current Implementation:**
```typescript
// types.ts - Error is an object
error?: {
  phase: ProcessingPhase;
  message: string;
  recoverable: boolean;
  timestamp: string;
};

// QueueItemCard.svelte line 205 - Displays object directly
<div class="text-sm text-red-700 mt-1">{item.error}</div>
```
**Solution:**
Change to: `{item.error?.message || item.error}`
- Handles object error (gets .message)
- Handles legacy string errors (fallback)
- Null-safe thanks to optional chaining (no crash when `item.error` is undefined)
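The inline expression still falls back to the object itself when `message` is an empty string, which would render "[object Object]" again. A small helper avoids that edge case. This is only a sketch: `formatQueueError` is a hypothetical name, `phase` is typed as `string` here instead of `ProcessingPhase`, and the fallback wording is invented for the example.

```typescript
// Hypothetical helper; the error shape mirrors the one documented above.
interface QueueError {
  phase: string; // ProcessingPhase in the real types.ts
  message: string;
  recoverable: boolean;
  timestamp: string;
}

/** Always returns a displayable string, for object and legacy string errors. */
function formatQueueError(error: QueueError | string | null | undefined): string {
  if (!error) return "";
  if (typeof error === "string") return error; // legacy string errors
  return error.message || `Processing failed during ${error.phase}`;
}
```

In the component this would be used as `{formatQueueError(item.error)}`.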
**Files Affected:**
- [src/routes/components/QueueItemCard.svelte](src/routes/components/QueueItemCard.svelte#L205) - Display error.message
---
#### Dependencies & Constraints (from ARCHITECTURE.md)
- Using `openai@^4.20.0` SDK
- Environment: `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `LLM_MODEL`
- Current config example: `http://192.168.1.10:1234/v1` (LM Studio)
- Must maintain OpenAI-compatible API contract
- No assumption about specific provider implementation
#### Code Style Requirements (from CODE_STYLE.md)
- Use SvelteKit `$env/dynamic/private` for env vars (already correct)
- Error handling: try-catch with descriptive messages
- Console logging: `[Component] Message` format
- Type safety: TypeScript strict mode enabled
<!-- Planner appends findings here -->
---
### [Developer] Implementation Notes
<!-- Developer appends findings here -->
---
### [Reviewer] Review Notes
<!-- Reviewer appends findings here -->
---
## API Endpoint Catalog
### Active Endpoints
#### Queue Management
- `POST /api/queue` - Enqueue Instagram URL for processing
- `GET /api/queue` - List queue items (supports filtering, pagination)
- `GET /api/queue/stream` - SSE stream for real-time updates
- `GET /api/queue/{id}` - Get specific queue item details
- `DELETE /api/queue/{id}` - Remove item from queue
- `POST /api/queue/{id}/retry` - Retry failed extraction
#### Push Notifications
- `POST /api/notifications/subscribe` - Subscribe to push notifications
- `DELETE /api/notifications/subscribe` - Unsubscribe from notifications
- `GET /api/notifications/vapid-key` - Get VAPID public key
#### Health & Status
- `GET /api/health` - Application health check
- `GET /api/llm-health` - LLM service availability check
#### Tandoor Integration
- `POST /api/tandoor` - Upload recipe to Tandoor
- `GET /api/tandoor-config` - Get Tandoor configuration status
#### Legacy/Deprecated
- `POST /api/extract` - ⚠️ Deprecated (returns 410 Gone)
---
## Known Constraints
### Browser Automation
- Requires Chromium/Chrome installation
- Headless mode used in production
- Cookie handling for authenticated Instagram content
### LLM Integration
- Requires OpenAI-compatible API endpoint
- Configurable model selection
- Structured output using Zod schemas
### Tandoor Integration
- Optional feature (disabled without credentials)
- Requires Tandoor API token
- Supports ingredient partitioning across steps
### SSL Requirements
- HTTPS required for Service Worker registration
- Local development uses self-signed certificates
- Certificates managed via external Caddy CA
---
## Testing Coverage
### Test Distribution
- **Unit Tests**: Core logic validation
- **Integration Tests**: Multi-component workflows
- **API Tests**: Endpoint behavior verification
- **Browser Tests**: Svelte component rendering
### Test Files
- `queue-manager.spec.ts`
- `queue-processor.spec.ts`
- `queue-api.spec.ts`
- `queue-sse.spec.ts`
- `scheduler.spec.ts`
- `instagram-url-validation.spec.ts`
- `thumbnail-validation.spec.ts`
- `extraction-url-validation.integration.spec.ts`
- `page.svelte.spec.ts`
### Mock Strategy
- Environment variables mocked via `vi.mock('$env/dynamic/private')`
- External services mocked at module level
- Browser automation mocked for unit tests
---
## Documentation Inventory
### Existing Documentation
- `README.md` - Project overview and setup
- `docs/API.md` - API endpoint specifications
- `docs/MIGRATION.md` - Migration guides
- `docs/SVELTEKIT_SSR_GUIDE.md` - SSR implementation notes
- `docs/TESTING.md` - Testing guide and mocking patterns
- `docs/Tandoor (2.3.6).yaml` - OpenAPI spec for Tandoor
### Plan Documentation
`docs/plans/` contains 20+ implementation plans:
- Execution plans for completed features
- Technical specifications
- Story breakdowns with acceptance criteria
### Outcome Documentation
`docs/outcomes/` contains 20+ outcome reports:
- Implementation summaries
- Changes made
- Testing results
- Lessons learned
---
## Agent Pipeline Notes
### Build Commands
- **Build**: `npm run build`
- **Test**: `npm test` (alias for `npm run test:unit -- --run`)
- **Dev**: `npm run dev`
- **Lint**: `npm run lint`
- **Format**: `npm run format`
### Development Workflow
1. Make changes in `src/`
2. Run tests: `npm test`
3. Verify build: `npm run build`
4. Test locally: `npm run dev`
### Continuous Integration
- ESLint checks code quality
- Prettier enforces formatting
- TypeScript checks type safety
- Vitest runs test suite
---
## Next Steps
This document will be updated by subsequent agents:
1. **Planner**: Append research findings and analysis
2. **Developer**: Document implementation discoveries
3. **Reviewer**: Record review observations and recommendations
---
### [Planner] Research Notes - RECIPE-0002 (2026-02-16)
**Task:** Complete PWA implementation (installability, push notifications, share target)
#### PWA Documentation Research
**Research Date:** 2026-02-16
**Sources:** MDN Web Docs, web.dev, W3C specifications
**Progressive Web Apps (PWA) - Key Requirements:**
1. **Web App Manifest** (`manifest.json`)
- Required members: `name` or `short_name`, `icons` (192x192 PNG minimum), `start_url`, `display`
- Share target support via `share_target` member (method, action, params)
- Icons should include 192x192 and 512x512 sizes for optimal display
- Browser compatibility: Chrome/Edge (full), Firefox/Safari (limited for share_target)
2. **Service Worker**
- Must be registered to enable offline functionality
- Lifecycle: install → activate → fetch events
- Required for push notifications
- Must be served over HTTPS (or localhost)
3. **HTTPS Requirement**
- Mandatory for service worker registration
- Required for push notifications and other secure contexts
- Local development: `http://localhost` is treated as secure
4. **Installability Criteria** (from MDN/web.dev):
- Valid manifest with required members
- Service worker registered with fetch event handler
- Served over HTTPS
- At least one 192x192 PNG or SVG icon
- Display mode set (fullscreen, standalone, minimal-ui)
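The manifest-related criteria above can be checked mechanically. The sketch below covers only the manifest; it does not verify the service worker, HTTPS, or whether the referenced icon files actually exist. `checkInstallability` and the interface names are hypothetical.

```typescript
// Hypothetical manifest check covering the manifest-related criteria listed above.
interface ManifestIcon {
  src: string;
  sizes?: string;
  type?: string;
}

interface WebAppManifest {
  name?: string;
  short_name?: string;
  start_url?: string;
  display?: string;
  icons?: ManifestIcon[];
}

/** Returns a list of problems; an empty list means the manifest checks pass. */
function checkInstallability(manifest: WebAppManifest): string[] {
  const problems: string[] = [];
  if (!manifest.name && !manifest.short_name) problems.push("missing name or short_name");
  if (!manifest.start_url) problems.push("missing start_url");

  const displayModes = ["fullscreen", "standalone", "minimal-ui"];
  if (!manifest.display || displayModes.indexOf(manifest.display) === -1) {
    problems.push("display must be fullscreen, standalone, or minimal-ui");
  }

  // `sizes` is a space-separated list; "any" is accepted for scalable icons.
  const hasSuitableIcon = (manifest.icons ?? []).some((icon) => {
    const sizes = (icon.sizes ?? "").toLowerCase().split(/\s+/);
    return sizes.indexOf("192x192") !== -1 || sizes.indexOf("any") !== -1;
  });
  if (!hasSuitableIcon) problems.push("missing 192x192 (or scalable) icon");

  return problems;
}
```

A check like this could run in a unit test against the manifest file so that a missing member is caught before manual testing in the browser.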
**Push Notifications (Web Push API):**
- Requires service worker to receive push events
- VAPID authentication (application server keys) required for Chrome
- Subscription process: permission → subscribe → store subscription → send push
- Push service (browser vendor controlled) routes messages
- Notification permissions: default, granted, denied
- Best practice: request permission after user interaction
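In the subscribe step, the VAPID public key is commonly passed to `pushManager.subscribe()` as a `Uint8Array`, while servers usually hand it out as a base64url string. The widely used conversion helper is sketched below. Whether `PushNotificationManager.ts` uses this exact helper is an assumption.

```typescript
/**
 * Convert a base64url string (for example a VAPID public key) to bytes.
 * Sketch of the common helper; not copied from the project's code.
 */
function urlBase64ToUint8Array(base64Url: string): Uint8Array {
  // Restore the padding that base64url omits, then map to the standard alphabet.
  const padding = "=".repeat((4 - (base64Url.length % 4)) % 4);
  const base64 = (base64Url + padding).replace(/-/g, "+").replace(/_/g, "/");
  const raw = atob(base64);
  const bytes = new Uint8Array(raw.length);
  for (let i = 0; i < raw.length; i++) {
    bytes[i] = raw.charCodeAt(i);
  }
  return bytes;
}

// Browser-only usage, shown as a comment because it needs a service worker:
//   const registration = await navigator.serviceWorker.ready;
//   const subscription = await registration.pushManager.subscribe({
//     userVisibleOnly: true,
//     applicationServerKey: urlBase64ToUint8Array(vapidPublicKey),
//   });
```

The resulting subscription object is what the client would then send to the server for storage.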
**Web Share Target API:**
- Registers PWA as share destination
- Configuration via manifest `share_target` member
- Supports GET or POST methods
- `params` define query string mapping (title, text, url)
- Files can be shared via POST with `multipart/form-data`
- Currently Chrome/Edge only (experimental)
- App must be installed to appear in share sheet
#### Current Implementation Analysis
**Research Date:** 2026-02-16
**Files Analyzed:** manifest.json, service-worker.ts, app.html, svelte.config.js, PWAInstallManager.ts, PushNotificationManager.ts
**Manifest Analysis (`static/manifest.json`):**
- ✅ Has all required PWA members (name, short_name, start_url, display, scope, theme_color, background_color)
- ✅ Share target configured correctly (GET /share with title/text/url params)
- ⚠️ Icons reference `/favicon.png` but file does NOT exist in static folder
- ⚠️ Uses same icon path for both 192x192 and 512x512 sizes
- Missing optional but recommended members: `description`, `screenshots`, `categories`
**Service Worker Analysis (`src/service-worker.ts`):**
- ✅ Native SvelteKit service worker (migrated from vite-pwa plugin)
- ✅ Install event: caches all build assets and static files
- ✅ Activate event: cleans up old caches
- ✅ Fetch event: cache-first for assets, network-first with cache fallback for others
- ✅ Push event handler: processes push messages, shows notifications with actions
- ✅ Notification click handler: opens/focuses app, handles action buttons
- ✅ Notification close handler: tracks dismissals
- ✅ Background sync handler: supports retry operations
- ✅ Message handler: supports service worker communication
- ✅ Global error handlers present
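The fetch behaviour noted above (cache-first for assets, network-first for everything else) boils down to a routing decision per request. The sketch below isolates that decision as a pure function. `chooseStrategy` is a hypothetical name, and bypassing non-GET and cross-origin requests is an assumption, not something confirmed by the analysis.

```typescript
// Hypothetical routing helper; the real service worker may decide differently.
type CacheStrategy = "cache-first" | "network-first" | "bypass";

/**
 * @param precached pathnames cached at install time, e.g. the combined
 *                  `build` and `files` lists from `$service-worker`
 */
function chooseStrategy(
  method: string,
  url: URL,
  origin: string,
  precached: string[]
): CacheStrategy {
  // Assumption: only same-origin GET requests are served from the cache.
  if (method !== "GET" || url.origin !== origin) return "bypass";
  return precached.indexOf(url.pathname) !== -1 ? "cache-first" : "network-first";
}
```

Keeping the decision separate from the `fetch` event handler makes it testable without a service worker environment.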
**Service Worker Registration (`svelte.config.js`):**
- ✅ `serviceWorker.register: true` enabled
- ✅ SvelteKit handles registration automatically
**Manifest Link (`src/app.html`):**
- ✅ `<link rel="manifest" href="/manifest.json">` present in head
**Client-Side Managers:**
- ✅ `PushNotificationManager.ts`: Full implementation with permission, subscribe, unsubscribe
- ✅ `PWAInstallManager.ts`: beforeinstallprompt handling, install prompt triggering
- ✅ Both are SSR-safe with browser guards
**Share Target (`/share` route):**
- ✅ Route exists at `src/routes/share/+page.svelte`
- ✅ Parses query params (text, url) from share target
- ✅ Extracts Instagram URLs from shared text
- ✅ Auto-processes URLs on mount
- ✅ Enqueues items and redirects to dashboard
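The parsing and extraction steps above can be sketched as a pure function over the query string. The regular expression and the accepted path segments (`p`, `reel`, `reels`, `tv`) are assumptions for illustration; the route's real matching rules are not shown in this document.

```typescript
// Hypothetical pattern; the project's actual URL validation may be stricter.
const INSTAGRAM_URL = /https?:\/\/(?:www\.)?instagram\.com\/(?:p|reel|reels|tv)\/[A-Za-z0-9_-]+\/?/i;

/**
 * Find the first Instagram post URL in the share-target parameters.
 * Some apps put the link into `text` instead of `url`, so all fields are checked.
 */
function extractInstagramUrl(params: URLSearchParams): string | null {
  const candidates = [params.get("url"), params.get("text"), params.get("title")];
  for (const candidate of candidates) {
    const match = candidate ? candidate.match(INSTAGRAM_URL) : null;
    if (match) return match[0];
  }
  return null;
}
```

On the `/share` page this would be called with the search parameters of the current URL, and the returned link (with tracking parameters dropped) would be enqueued.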
**Icons/Assets Issue:**
- ⚠️ **CRITICAL**: `manifest.json` references `/favicon.png` but file doesn't exist
- ✅ `src/lib/assets/favicon.svg` exists (used in layout)
- ⚠️ No PNG icons in `static/` folder
- ⚠️ Service worker references `/favicon.png` for notifications
**Push Notifications Infrastructure:**
- ✅ VAPID keys configured in `queueConfig.push` (uses env vars or defaults)
- ✅ Server endpoint: `/api/notifications/vapid-key` (GET)
- ✅ Server endpoint: `/api/notifications/subscribe` (POST/DELETE)
- ✅ PushNotificationService stores subscriptions in-memory
- Note: Subscriptions are not persisted (lost on restart)
#### What Works Already:
1. **PWA Structure**: Complete native SvelteKit PWA implementation
2. **Service Worker**: Fully functional with caching, push, notifications
3. **Push Notifications**: Client and server infrastructure in place
4. **Share Target**: Configured in manifest and `/share` route working
5. **Install Prompts**: PWAInstallManager ready to trigger install
6. **HTTPS**: App served at https://localhost:5173/
#### What Needs Attention:
1. **Icons**: Create PNG icons (192x192, 512x512) from existing SVG
2. **Icon Verification**: Ensure icons are properly sized and optimized
3. **Installability Testing**: Verify all criteria met via chrome://pwa-internals
4. **Push Notification Testing**: Verify VAPID key generation and push flow
5. **Share Target Testing**: Test share from external apps (Instagram)
6. **Manifest Enhancement**: Add description, categories for better discoverability
#### Dependencies & Constraints (from ARCHITECTURE.md, CODE_STYLE.md):
- Using native SvelteKit PWA (no plugins needed)
- Service worker: `$service-worker` module provides build, files, version
- Environment: uses `$env/dynamic/private` for server configs
- HTTPS required (already configured at https://localhost:5173/)
- TypeScript strict mode enabled
- All file paths must use SvelteKit path aliases (`$lib`, `$service-worker`)
#### Code Style Requirements (from CODE_STYLE.md):
- File naming: manifest.json, service-worker.ts, lowercase for utilities
- Type annotations required for public APIs
- SSR-safe code: all browser API usage must be guarded with `browser` check
- Error handling: try-catch with descriptive messages
- Comments: JSDoc for public APIs, inline for complex logic
---
**Document Version:** 1.1
**Last Updated by:** Planner Agent
**Next Update:** Developer Agent