Findings & Research Documentation
Last Updated: 2026-02-15T00:00:00.000Z
JIRA: RECIPE-0001
Status: Initialized
Purpose
This document tracks research findings, analysis results, and technical discoveries made during development. Each agent (Planner, Developer, Reviewer) appends findings as they work through the pipeline.
Initial Codebase Analysis
Language & Framework
- Primary Language: TypeScript 5.9.3
- Framework: SvelteKit 2.48.5 with Svelte 5.43.8
- Runtime: Node.js 22+
- Package Manager: npm
Project Type
Progressive Web Application (PWA) for extracting recipes from Instagram posts and uploading them to Tandoor Recipe Manager.
Architecture Style
Hexagonal Architecture (Ports and Adapters):
- Domain logic in src/lib/server/
- External system adapters: Instagram, Tandoor, LLM, Browser
- Clear separation between client and server code
Key Technical Components
- Queue Management System: In-memory FIFO queue with async processing
- Three-Phase Pipeline: Extraction → Parsing → Uploading
- Real-Time Updates: Server-Sent Events (SSE) for progress tracking
- Push Notifications: Web Push API for background notifications
- PWA Features: Service worker, manifest, install prompts
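A minimal sketch of how an in-memory FIFO queue with subscribers could fit together. The names InMemoryQueue, QueueItem, and Phase are illustrative assumptions and do not mirror the actual QueueManager API; concurrency limits and retries are left out.

```typescript
// Hypothetical types for illustration; the real queue item carries more fields.
type Phase = 'queued' | 'extraction' | 'parsing' | 'uploading' | 'done';
type QueueItem = { id: number; url: string; phase: Phase };
type Listener = (item: QueueItem) => void;

class InMemoryQueue {
  private pending: QueueItem[] = [];
  private listeners: Listener[] = [];
  private nextId = 1;

  // Register an observer; the returned function removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Append a new item at the tail of the queue and notify observers.
  enqueue(url: string): QueueItem {
    const item: QueueItem = { id: this.nextId++, url, phase: 'queued' };
    this.pending.push(item);
    this.notify(item);
    return item;
  }

  // FIFO: the oldest pending item is taken first.
  next(): QueueItem | undefined {
    return this.pending.shift();
  }

  // Move an item to another pipeline phase and notify observers.
  setPhase(item: QueueItem, phase: Phase): void {
    item.phase = phase;
    this.notify(item);
  }

  // Observers receive a copy so they cannot mutate queue state.
  private notify(item: QueueItem): void {
    for (const listener of this.listeners) listener({ ...item });
  }
}
```

An SSE endpoint could call subscribe() once per connected client and forward each notification as an event. Because the state lives in process memory, a restart drops every pending item.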
Design Patterns Identified
- Singleton: QueueManager, QueueProcessor, PushNotificationService
- Factory: createLLM(), createBrowserContext(), initializeBrowser()
- Observer: Queue subscription system, SSE streaming
- Adapter: Instagram, Tandoor, LLM, Browser adapters
- Strategy: Multiple extraction methods with fallback
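The Strategy entry describes several extraction methods tried in order. A minimal sketch of such a fallback cascade follows; ExtractionStrategy and runCascade are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical shape of one extraction method; not taken from the codebase.
type ExtractionStrategy = {
  name: string;
  extract: (url: string) => Promise<string | null>;
};

type CascadeResult = { strategy: string; caption: string };

// Try each strategy in order and return the first non-empty result.
// A strategy that throws or returns null/empty text falls through to the next one.
async function runCascade(
  url: string,
  strategies: ExtractionStrategy[]
): Promise<CascadeResult> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    try {
      const caption = await strategy.extract(url);
      if (caption && caption.trim().length > 0) {
        return { strategy: strategy.name, caption };
      }
      failures.push(`${strategy.name}: empty result`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new Error(`All extraction strategies failed (${failures.join('; ')})`);
}
```

Collecting the per-strategy failure reasons keeps the final error message useful when every method is blocked.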
Dependencies Overview
Production (6 dependencies):
- Browser automation: playwright
- LLM integration: openai
- Utilities: uuid, date-fns, zod
Development (26+ dependencies):
- Framework: @sveltejs/kit, svelte, vite
- Testing: vitest, @vitest/browser-playwright
- Styling: tailwindcss
- Tooling: typescript, eslint, prettier
File Structure
52 total TypeScript/JavaScript files
├── 39 TypeScript files (.ts)
├── 10+ Svelte components (.svelte)
├── 3 JavaScript config files (.js)
└── Multiple test files (.spec.ts)
Code Quality Indicators
- Strict TypeScript: strict: true enabled
- Comprehensive Testing: 138 tests across unit, integration, and browser tests
- Linting: ESLint with TypeScript and Svelte plugins
- Formatting: Prettier with Svelte and Tailwind plugins
- Type Safety: Zod schemas for runtime validation
Environment Configuration
Variables (OPENAI_API_KEY is required; the others are optional or have defaults):
- OPENAI_API_KEY - LLM access
- TANDOOR_URL - Recipe manager URL (optional)
- TANDOOR_TOKEN - API authentication (optional)
- QUEUE_CONCURRENCY - Processing limit (default: 2)
- QUEUE_MAX_RETRIES - Retry attempts (default: 3)
Deployment Setup
- Docker: Dockerfile with Node.js 22 Alpine + Chromium
- HTTPS: Local SSL certificates for PWA features
- Production: Node.js adapter for SvelteKit
Notable Features
- Multi-Method Extraction: 4-strategy cascade with intelligent fallback
- Progress Tracking: Real-time callbacks throughout extraction pipeline
- Thumbnail Validation: HTTP status code checking for image URLs
- Retry Logic: Configurable retry attempts for failed extractions
- Scheduler: Background task execution with authentication
Technical Debt & Opportunities
Identified Issues
- Deprecated Endpoints: /api/extract returns 410 Gone (migration helper)
- In-Memory Queue: No persistence - items lost on server restart
- Single Instance: Queue state not shared across multiple server instances
Potential Improvements
- Queue Persistence: Redis or database-backed queue for durability
- Horizontal Scaling: Shared queue state for multi-instance deployments
- Rate Limiting: Instagram request throttling to avoid blocks
- Caching: Extracted content caching to reduce redundant processing
Research Findings
This section will be populated by the Planner agent during task analysis.
[Planner] Research Notes - RECIPE-0001 (2026-02-15)
Task: Fix model loading issue and frontend error display
Issue 1: Model Loading - "400 No models loaded"
Research Date: 2026-02-15
Source: Stack trace analysis, OpenAI SDK documentation, LM Studio/LiteLLM API patterns
Problem Analysis:
- Error occurs at detectRecipe() in src/lib/server/parser.ts
- OpenAI-compatible APIs (LM Studio, LiteLLM, Ollama, etc.) often require models to be explicitly loaded
- Current implementation assumes model is already loaded
- Error message contains provider-specific instructions ("use the 'lms load' command")
OpenAI-Compatible Model Loading Patterns:
- LM Studio: Uses /v1/models endpoint to list available models
  - Loaded models appear in response with "id": "model-name"
  - No programmatic loading endpoint (manual load in UI)
- LiteLLM: Uses /v1/models to list loaded models
  - Models must be configured in server startup
  - No dynamic loading endpoint
- Ollama: Uses /api/tags for model list and /api/pull for downloading models
  - Different API structure (not /v1 prefix)
- Generic OpenAI-compatible: Most follow OpenAI's /v1/models endpoint
  - No standard for dynamic model loading
  - Usually require pre-configuration
Solution Approach:
- Check if model exists via client.models.list()
- If model not found/loaded, provide clear user-facing error
- Remove provider-specific error messages
- Add notification when model check succeeds
- Consider future enhancement: detect provider type and attempt auto-load if supported
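The first three steps of this approach could look roughly like the sketch below. ModelLister and checkModelAvailable are hypothetical names; the interface stands in for whatever wraps client.models.list(), so the sketch does not depend on the SDK's exact response shape.

```typescript
// Minimal structural view of a client that can list model ids.
// Assumption: the real code would adapt client.models.list() to this shape.
interface ModelLister {
  listModelIds(): Promise<string[]>;
}

type ModelCheck = { ok: true } | { ok: false; message: string };

// Check whether the configured model is available before parsing starts,
// and return a provider-neutral message when it is not.
async function checkModelAvailable(
  client: ModelLister,
  model: string
): Promise<ModelCheck> {
  let ids: string[];
  try {
    ids = await client.listModelIds();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, message: `Could not reach the LLM server: ${reason}` };
  }
  if (ids.some((id) => id === model)) {
    return { ok: true };
  }
  return {
    ok: false,
    message: `Model "${model}" is not loaded on the LLM server. Load it and retry.`
  };
}
```

Returning a result object instead of throwing lets the queue processor decide whether to notify the user or mark the item as recoverable.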
Files Affected:
- src/lib/server/llm.ts - Add model availability check
- src/lib/server/parser.ts - Handle model not loaded error
- src/lib/server/queue/QueueProcessor.ts - User notification
Issue 2: Frontend Error Display - "[object Object]"
Research Date: 2026-02-15
Source: Code analysis of QueueItemCard.svelte, types.ts, QueueManager.ts
Problem Analysis:
- Error structure is an object: { phase, message, recoverable, timestamp }
- Frontend displays {item.error} directly (line 205 of QueueItemCard.svelte)
- Svelte renders object.toString() → "[object Object]"
Current Implementation:
// types.ts - Error is an object
error?: {
  phase: ProcessingPhase;
  message: string;
  recoverable: boolean;
  timestamp: string;
}
// QueueItemCard.svelte line 205 - Displays object directly
<div class="text-sm text-red-700 mt-1">{item.error}</div>
Solution:
Change to: {item.error?.message || item.error}
- Handles object error (gets .message)
- Handles legacy string errors (fallback)
- Type-safe with optional chaining
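The same fallback can be written as a small helper, which is easier to unit test than an inline template expression. This is a sketch: formatQueueError is a hypothetical name, and ProcessingPhase is simplified to string.

```typescript
// Mirrors the error shape quoted from types.ts; phase is simplified to string here.
type QueueError = {
  phase: string;
  message: string;
  recoverable: boolean;
  timestamp: string;
};

// Return a display string for a structured error, a legacy string error, or no error.
function formatQueueError(error: QueueError | string | undefined): string {
  if (error === undefined) return '';
  if (typeof error === 'string') return error;
  return error.message;
}
```

In the component this would be used as {formatQueueError(item.error)}.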
Files Affected:
- src/routes/components/QueueItemCard.svelte - Display error.message
Dependencies & Constraints (from ARCHITECTURE.md)
- Using openai@^4.20.0 SDK
- Environment: OPENAI_BASE_URL, OPENAI_API_KEY, LLM_MODEL
- Current config example: http://192.168.1.10:1234/v1 (LM Studio)
- Must maintain OpenAI-compatible API contract
- No assumption about specific provider implementation
Code Style Requirements (from CODE_STYLE.md)
- Use SvelteKit $env/dynamic/private for env vars (already correct)
- Error handling: try-catch with descriptive messages
- Console logging: [Component] Message format
- Type safety: TypeScript strict mode enabled
[Developer] Implementation Notes
[Reviewer] Review Notes
API Endpoint Catalog
Active Endpoints
Queue Management
- POST /api/queue - Enqueue Instagram URL for processing
- GET /api/queue - List queue items (supports filtering, pagination)
- GET /api/queue/stream - SSE stream for real-time updates
- GET /api/queue/{id} - Get specific queue item details
- DELETE /api/queue/{id} - Remove item from queue
- POST /api/queue/{id}/retry - Retry failed extraction
Push Notifications
- POST /api/notifications/subscribe - Subscribe to push notifications
- DELETE /api/notifications/subscribe - Unsubscribe from notifications
- GET /api/notifications/vapid-key - Get VAPID public key
Health & Status
- GET /api/health - Application health check
- GET /api/llm-health - LLM service availability check
Tandoor Integration
- POST /api/tandoor - Upload recipe to Tandoor
- GET /api/tandoor-config - Get Tandoor configuration status
Legacy/Deprecated
- POST /api/extract - ⚠️ Deprecated (returns 410 Gone)
Known Constraints
Browser Automation
- Requires Chromium/Chrome installation
- Headless mode used in production
- Cookie handling for authenticated Instagram content
LLM Integration
- Requires OpenAI-compatible API endpoint
- Configurable model selection
- Structured output using Zod schemas
Tandoor Integration
- Optional feature (disabled without credentials)
- Requires Tandoor API token
- Supports ingredient partitioning across steps
SSL Requirements
- HTTPS required for Service Worker registration
- Local development uses self-signed certificates
- Certificates managed via external Caddy CA
Testing Coverage
Test Distribution
- Unit Tests: Core logic validation
- Integration Tests: Multi-component workflows
- API Tests: Endpoint behavior verification
- Browser Tests: Svelte component rendering
Test Files
- queue-manager.spec.ts
- queue-processor.spec.ts
- queue-api.spec.ts
- queue-sse.spec.ts
- scheduler.spec.ts
- instagram-url-validation.spec.ts
- thumbnail-validation.spec.ts
- extraction-url-validation.integration.spec.ts
- page.svelte.spec.ts
Mock Strategy
- Environment variables mocked via vi.mock('$env/dynamic/private')
- External services mocked at module level
- Browser automation mocked for unit tests
Documentation Inventory
Existing Documentation
- README.md - Project overview and setup
- docs/API.md - API endpoint specifications
- docs/MIGRATION.md - Migration guides
- docs/SVELTEKIT_SSR_GUIDE.md - SSR implementation notes
- docs/TESTING.md - Testing guide and mocking patterns
- docs/Tandoor (2.3.6).yaml - OpenAPI spec for Tandoor
Plan Documentation
docs/plans/ contains 20+ implementation plans:
- Execution plans for completed features
- Technical specifications
- Story breakdowns with acceptance criteria
Outcome Documentation
docs/outcomes/ contains 20+ outcome reports:
- Implementation summaries
- Changes made
- Testing results
- Lessons learned
Agent Pipeline Notes
Build Commands
- Build: npm run build
- Test: npm test (alias for npm run test:unit -- --run)
- Dev: npm run dev
- Lint: npm run lint
- Format: npm run format
Development Workflow
- Make changes in src/
- Run tests: npm test
- Verify build: npm run build
- Test locally: npm run dev
Continuous Integration
- ESLint checks code quality
- Prettier enforces formatting
- TypeScript checks type safety
- Vitest runs test suite
Next Steps
This document will be updated by subsequent agents:
- Planner: Append research findings and analysis
- Developer: Document implementation discoveries
- Reviewer: Record review observations and recommendations
[Planner] Research Notes - RECIPE-0002 (2026-02-16)
Task: Complete PWA implementation (installability, push notifications, share target)
PWA Documentation Research
Research Date: 2026-02-16
Sources: MDN Web Docs, web.dev, W3C specifications
Progressive Web Apps (PWA) - Key Requirements:
- Web App Manifest (manifest.json)
  - Required members: name or short_name, icons (192x192 PNG minimum), start_url, display
  - Share target support via share_target member (method, action, params)
  - Icons should include 192x192 and 512x512 sizes for optimal display
  - Browser compatibility: Chrome/Edge (full), Firefox/Safari (limited for share_target)
- Service Worker
  - Must be registered to enable offline functionality
  - Lifecycle: install → activate → fetch events
  - Required for push notifications
  - Must be served over HTTPS (or localhost)
- HTTPS Requirement
  - Mandatory for service worker registration
  - Required for push notifications and other secure contexts
  - Local development: http://localhost is treated as secure
- Installability Criteria (from MDN/web.dev):
  - Valid manifest with required members
  - Service worker registered with fetch event handler
  - Served over HTTPS
  - At least one 192x192 PNG or SVG icon
  - Display mode set (fullscreen, standalone, minimal-ui)
Push Notifications (Web Push API):
- Requires service worker to receive push events
- VAPID authentication (application server keys) required for Chrome
- Subscription process: permission → subscribe → store subscription → send push
- Push service (browser vendor controlled) routes messages
- Notification permissions: default, granted, denied
- Best practice: request permission after user interaction
Web Share Target API:
- Registers PWA as share destination
- Configuration via manifest share_target member
- Supports GET or POST methods
- params define query string mapping (title, text, url)
- Files can be shared via POST with multipart/form-data
- Currently Chrome/Edge only (experimental)
- App must be installed to appear in share sheet
Current Implementation Analysis
Research Date: 2026-02-16
Files Analyzed: manifest.json, service-worker.ts, app.html, svelte.config.js, PWAInstallManager.ts, PushNotificationManager.ts
Manifest Analysis (static/manifest.json):
- ✅ Has all required PWA members (name, short_name, start_url, display, scope, theme_color, background_color)
- ✅ Share target configured correctly (GET /share with title/text/url params)
- ⚠️ Icons reference /favicon.png but file does NOT exist in static folder
- ⚠️ Uses same icon path for both 192x192 and 512x512 sizes
- ℹ️ Missing optional but recommended members: description, screenshots, categories
Service Worker Analysis (src/service-worker.ts):
- ✅ Native SvelteKit service worker (migrated from vite-pwa plugin)
- ✅ Install event: caches all build assets and static files
- ✅ Activate event: cleans up old caches
- ✅ Fetch event: cache-first for assets, network-first with cache fallback for others
- ✅ Push event handler: processes push messages, shows notifications with actions
- ✅ Notification click handler: opens/focuses app, handles action buttons
- ✅ Notification close handler: tracks dismissals
- ✅ Background sync handler: supports retry operations
- ✅ Message handler: supports service worker communication
- ✅ Global error handlers present
Service Worker Registration (svelte.config.js):
- ✅ serviceWorker.register: true enabled
- ✅ SvelteKit handles registration automatically
Manifest Link (src/app.html):
- ✅ <link rel="manifest" href="/manifest.json"> present in head
Client-Side Managers:
- ✅ PushNotificationManager.ts: Full implementation with permission, subscribe, unsubscribe
- ✅ PWAInstallManager.ts: beforeinstallprompt handling, install prompt triggering
- ✅ Both are SSR-safe with browser guards
Share Target (/share route):
- ✅ Route exists at src/routes/share/+page.svelte
- ✅ Parses query params (text, url) from share target
- ✅ Extracts Instagram URLs from shared text
- ✅ Auto-processes URLs on mount
- ✅ Enqueues items and redirects to dashboard
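A sketch of the URL-extraction step, assuming that post, reel, and TV links are the URL kinds of interest. The pattern and the function name extractInstagramUrls are illustrative and may differ from what the /share route actually uses.

```typescript
// Assumption: /p/, /reel/, /reels/ and /tv/ paths identify shareable Instagram content.
const INSTAGRAM_URL =
  /https?:\/\/(?:www\.)?instagram\.com\/(?:p|reels?|tv)\/[A-Za-z0-9_-]+/g;

// Collect unique Instagram URLs from the share-target fields (text, url, title).
// Query strings and trailing slashes are dropped because the pattern stops at the shortcode.
function extractInstagramUrls(
  ...parts: Array<string | null | undefined>
): string[] {
  const found: string[] = [];
  for (const part of parts) {
    if (!part) continue;
    const matches = part.match(INSTAGRAM_URL) ?? [];
    for (const match of matches) {
      if (found.indexOf(match) === -1) found.push(match);
    }
  }
  return found;
}
```

Some share sheets put the link in the text field rather than the url field, so scanning every field and de-duplicating avoids enqueueing the same post twice.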
Icons/Assets Issue:
- ⚠️ CRITICAL: manifest.json references /favicon.png but file doesn't exist
- ✅ src/lib/assets/favicon.svg exists (used in layout)
- ⚠️ No PNG icons in static/ folder
- ⚠️ Service worker references /favicon.png for notifications
Push Notifications Infrastructure:
- ✅ VAPID keys configured in queueConfig.push (uses env vars or defaults)
- ✅ Server endpoint: /api/notifications/vapid-key (GET)
- ✅ Server endpoint: /api/notifications/subscribe (POST/DELETE)
- ✅ PushNotificationService stores subscriptions in-memory
- ℹ️ Note: Subscriptions are not persisted (lost on restart)
What Works Already:
- PWA Structure: Complete Native SvelteKit PWA implementation
- Service Worker: Fully functional with caching, push, notifications
- Push Notifications: Client and server infrastructure in place
- Share Target: Configured in manifest and /share route working
- Install Prompts: PWAInstallManager ready to trigger install
- HTTPS: App served at https://localhost:5173/
What Needs Attention:
- Icons: Create PNG icons (192x192, 512x512) from existing SVG
- Icon Verification: Ensure icons are properly sized and optimized
- Installability Testing: Verify all criteria met via chrome://pwa-internals
- Push Notification Testing: Verify VAPID key generation and push flow
- Share Target Testing: Test share from external apps (Instagram)
- Manifest Enhancement: Add description, categories for better discoverability
Dependencies & Constraints (from ARCHITECTURE.md, CODE_STYLE.md):
- Using native SvelteKit PWA (no plugins needed)
- Service worker: $service-worker module provides build, files, version
- Environment: uses $env/dynamic/private for server configs
$env/dynamic/privatefor server configs - HTTPS required (already configured at https://localhost:5173/)
- TypeScript strict mode enabled
- All file paths must use SvelteKit path aliases ($lib, $service-worker)
Code Style Requirements (from CODE_STYLE.md):
- File naming: manifest.json, service-worker.ts, lowercase for utilities
- Type annotations required for public APIs
- SSR-safe code: all browser API usage must be guarded with browser check
- Error handling: try-catch with descriptive messages
- Comments: JSDoc for public APIs, inline for complex logic
[Planner] Research Notes - RECIPE-0003 (2026-02-16)
Task: Update application icon and configure Docker deployment
PWA Icon Generation - icon-source.png
Research Date: 2026-02-16
Source: Project analysis, PWA best practices, sharp documentation
Icon Source File:
- Location: static/icon-source.png
- Size: 672KB PNG file
- Format: PNG with transparency (confirmed via file analysis)
- Destination sizes: 192x192 (favicon.png), 512x512 (icon-512.png)
PWA Icon Requirements: From RECIPE-0002 research and W3C Web App Manifest specification:
- Minimum Size: 192x192 pixels (required for PWA installability)
- Recommended Size: 512x512 pixels (for splash screens, high-DPI displays)
- Format: PNG with transparency support
- Purpose: "any maskable" for optimal Android compatibility
- Location: static/ directory (served at root path)
Sharp Library Configuration:
- Version: 0.34.5 (already in dependencies)
- Method: resize() with fit: 'contain' to preserve aspect ratio
- Background: transparent (rgba 0,0,0,0)
- Format: PNG with optimization
- Quality: Default compression for web delivery
Implementation Pattern:
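import sharp from 'sharp';

// Resize the source icon to 192x192 without cropping, padding with transparency
await sharp('static/icon-source.png')
  .resize(192, 192, {
    fit: 'contain',
    background: { r: 0, g: 0, b: 0, alpha: 0 }
  })
  .png()
  .toFile('static/favicon.png');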
Rationale:
- fit: 'contain' preserves aspect ratio without cropping
- Transparent background maintains icon transparency
- PNG format required by Web App Manifest spec
- Same approach for both 192x192 and 512x512 variants
Docker Volume Configuration
Research Date: 2026-02-16
Source: Codebase analysis, Dockerfile, scheduler.ts, extraction.ts
Volume Requirements Analysis: From code analysis, only one persistent volume is required:
1. /app/secrets - Instagram Authentication Storage
- Purpose: Persist Instagram session cookies across container restarts
- File: auth.json (Playwright storage state)
- Usage:
  - scheduler.ts: Checks /app/secrets/auth.json for Docker deployments
  - extraction.ts: Loads authentication from /app/secrets/auth.json
  - gen-auth.js: Browser automation saves session to secrets/auth.json
- Rationale: Prevents re-login on every container restart
- Docker Path: /app/secrets
- Host Path: ./secrets (relative to docker-compose.yml)
Volumes NOT Required:
- Database: Queue uses in-memory storage (QueueManager.ts)
- Cache: Service worker cache is ephemeral
- Uploads: No file upload functionality
- Logs: Console logs to stdout/stderr (Docker logging)
- Build artifacts: Built into image at build time
VOLUME Directive:
VOLUME ["/app/secrets"]
docker-compose.yml Volume Mount:
volumes:
  - ./secrets:/app/secrets
Environment Variable Inventory
Research Date: 2026-02-16
Source: queue/config.ts, llm.ts, tandoor-config.ts, scheduler.ts
Comprehensive Variable List:
LLM Configuration (REQUIRED):
- OPENAI_BASE_URL - OpenAI-compatible API endpoint
- OPENAI_API_KEY - API authentication key
- LLM_MODEL - Model identifier (default: gpt-4o)
Queue Configuration (OPTIONAL):
- QUEUE_CONCURRENCY - Parallel processing limit (default: 2)
- QUEUE_MAX_RETRIES - Retry attempts (default: 3)
Tandoor Integration (OPTIONAL):
- TANDOOR_ENABLED - Enable Tandoor upload (default: false)
- TANDOOR_SERVER_URL - Tandoor base URL
- TANDOOR_SPACE - Space ID (default: 1)
- TANDOOR_TOKEN - API token
Push Notifications (OPTIONAL):
- VAPID_PUBLIC_KEY - Web Push public key (has default)
- VAPID_PRIVATE_KEY - Web Push private key (has default)
Authentication Scheduler (OPTIONAL):
- AUTH_SCHEDULER_ENABLED - Enable auto-renewal (default: false)
- AUTH_SCHEDULER_INTERVAL_MINUTES - Renewal interval (default: 720)
Runtime Configuration:
- NODE_ENV - Environment mode (production/development)
- PORT - SvelteKit port (default: 3000)
- DISPLAY - X11 display for Playwright (set to :99 in docker-compose.yml)
Default Values: All variables have sensible defaults except:
- OPENAI_BASE_URL (required)
- OPENAI_API_KEY (required)
VAPID Keys: Current defaults in queue/config.ts:
- Public: BNextdcB_fQ0BVvyGioM5L8Tf9vKQjs-WnF-rUbnU8MdWIZQYfggIHxBnW21I-lq_0HykLCdMpYj8d5joavWdxQ
- Private: JwxI_KcsBcehYcTOufMcbVWJjCq1QbH5FJmSyQuG680
- Note: These should be regenerated for production deployments
Variable Access Pattern:
- Server-side only: Uses $env/dynamic/private from SvelteKit
- No client-side environment variable exposure
- Runtime configuration (no build-time substitution)
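A sketch of how the required variables and documented defaults could be resolved at runtime. loadConfig, intOrDefault, and the AppConfig shape are hypothetical names for illustration; only the variable names and default values come from the inventory above.

```typescript
// Plain key/value view of the environment, e.g. what $env/dynamic/private exposes.
type Env = Record<string, string | undefined>;

type AppConfig = {
  llm: { baseUrl: string; apiKey: string; model: string };
  queue: { concurrency: number; maxRetries: number };
};

// Parse an integer of at least `min`, falling back when the value is missing or invalid.
function intOrDefault(
  value: string | undefined,
  fallback: number,
  min: number
): number {
  if (value === undefined || value.trim() === '') return fallback;
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed >= min ? parsed : fallback;
}

// Fail fast on the two required variables and apply the documented defaults to the rest.
function loadConfig(env: Env): AppConfig {
  const baseUrl = env.OPENAI_BASE_URL;
  const apiKey = env.OPENAI_API_KEY;
  if (!baseUrl || !apiKey) {
    throw new Error('OPENAI_BASE_URL and OPENAI_API_KEY must be set');
  }
  return {
    llm: { baseUrl, apiKey, model: env.LLM_MODEL || 'gpt-4o' },
    queue: {
      concurrency: intOrDefault(env.QUEUE_CONCURRENCY, 2, 1),
      maxRetries: intOrDefault(env.QUEUE_MAX_RETRIES, 3, 0)
    }
  };
}
```

Passing the environment in as a parameter keeps the loader testable without mocking the SvelteKit module.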
Docker Health Check Configuration
Research Date: 2026-02-16
Source: routes/api/health/+server.ts analysis
Health Check Endpoint:
- Path: /api/health
- Method: GET
- Response: 200 OK with JSON body
- Implementation: src/routes/api/health/+server.ts
Health Check Response:
{
  "status": "ok",
  "timestamp": "2026-02-16T..."
}
Docker Health Check Configuration:
healthcheck:
  test:
    [
      'CMD',
      'node',
      '-e',
      "fetch('http://localhost:3000/api/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
    ]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 40s
Rationale:
- interval: 30s - Balance between responsiveness and overhead
- timeout: 10s - Sufficient for app initialization
- retries: 3 - Allow transient failures
- start_period: 40s - Accounts for Playwright browser initialization
- Uses internal fetch to avoid curl dependency
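The inline probe maps the HTTP result to an exit code. The same logic can be sketched as a function with an injected fetch, which makes the three outcomes explicit. probeHealth and FetchLike are hypothetical names; the container itself still runs the one-liner.

```typescript
// Minimal view of fetch: only the `ok` flag of the response matters here.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Exit code 0 for a 2xx response, 1 for any other status or a network error.
async function probeHealth(fetchImpl: FetchLike, url: string): Promise<0 | 1> {
  try {
    const response = await fetchImpl(url);
    return response.ok ? 0 : 1;
  } catch {
    return 1;
  }
}
```

Docker treats exit code 0 as healthy and 1 as unhealthy, so a refused connection during startup counts as a failed probe until start_period has elapsed.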
Docker Deployment Constraints
Research Date: 2026-02-16
Source: Dockerfile, app.server.ts, browser.ts
Current Dockerfile Analysis:
- Base: node:22-alpine (minimal, production-ready)
- Chromium: Installed via apk (headless browser for Instagram extraction)
- Fonts: liberation-fonts, noto, noto-cjk (text rendering)
- Build: npm ci + npm run build
- Runtime: Node.js ESM import
- Port: 3000 (EXPOSE)
- Environment: NODE_ENV=production
Browser Initialization: From app.server.ts:
- initializeBrowser() called on server start
- Graceful shutdown handlers (SIGTERM, SIGINT)
- Critical for extraction.ts Playwright usage
Security Options:
- seccomp=unconfined - Required for Chromium sandbox
- --no-sandbox in browser.ts launch args
- Necessary for containerized Chromium
No Changes Required: Current Dockerfile is production-ready, only needs VOLUME addition.
[Planner] Research Notes - RECIPE-0003 Iteration 1 (2026-02-16)
Task: Fix Docker deployment issues (Alpine packages, Playwright installation)
Alpine Linux Font Packages
Research Date: 2026-02-16
Source: https://wiki.alpinelinux.org/wiki/Fonts, Alpine package database
Incorrect Package Names in Current Dockerfile:
- liberation-fonts → No such package (ERROR)
- noto → No such package (ERROR)
- noto-cjk → No such package (ERROR)
Correct Alpine Font Package Names:
- font-liberation → Correct (already in Dockerfile)
- font-noto → Correct name for Noto fonts
- font-noto-cjk → Correct name for Noto CJK (Chinese, Japanese, Korean) fonts
Rationale:
- Alpine Linux uses font-* prefix for all font packages
- Common mistake: using Debian/Ubuntu package names which differ from Alpine
- These fonts are essential for rendering text in Instagram content extraction
Recommended Font Installation:
RUN apk add --no-cache \
    chromium \
    font-liberation \
    font-noto \
    font-noto-cjk
Playwright on Alpine Linux
Research Date: 2026-02-16
Source: https://playwright.dev/docs/docker, Playwright GitHub issues
Official Playwright + Alpine Status:
- Not officially supported: Browser builds require glibc, Alpine uses musl
- Firefox/WebKit: Cannot run on Alpine (glibc dependency)
- Chromium: Can work using system chromium package
Problem Analysis:
- Current Dockerfile installs system chromium via apk add chromium
- Playwright's chromium.launch() expects Playwright's own Chromium binary
- Playwright's Chromium is built for glibc environments (Ubuntu/Debian)
- npx playwright install chromium will download glibc binary that won't run on Alpine
Solution: Configure Playwright to Use System Chromium
Approach A - Use System Chromium (Recommended):
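// src/lib/server/browser.ts
import { chromium } from 'playwright';

// Launch the Alpine system Chromium instead of Playwright's bundled (glibc) build
browser = await chromium.launch({
  executablePath: '/usr/bin/chromium-browser',
  headless: true,
  args: [...]
});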
Environment Variable Approach:
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser
Approach B - Switch to Debian Base:
FROM node:22-bookworm
RUN npx -y playwright@1.56.1 install --with-deps chromium
Recommendation:
- Use Approach A (system chromium with executablePath)
- Minimal changes to existing Alpine setup
- System chromium is already installed and working
- Avoids full base image migration
Chromium System Dependencies: When using system chromium on Alpine, these packages are auto-installed as dependencies:
- ca-certificates, mesa-gbm, wayland-libs-server, libxkbcommon
- ffmpeg-libs, gtk+3.0, libexif, libevent, nss, etc. (64 total dependencies)
Playwright Version Compatibility
Research Date: 2026-02-16
Source: package.json analysis
Current Version: playwright@1.56.1 (production dependency)
Chromium Version: Bundled with Playwright 1.56.1
System Chromium Compatibility:
- Alpine edge: chromium 145.0.7632.75 (as of 2026-02-15)
- Playwright 1.56.1 expects: Chromium ~133.x
- Version mismatch OK: Playwright API is compatible across minor Chromium versions
- System chromium is newer, should work without issues
executablePath Configuration:
- Path on Alpine: /usr/bin/chromium-browser
- Must be set in browser.ts or via environment variable
- No additional Playwright installation needed when using system browser
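One way to keep the path out of the source is to derive the launch options from the environment. This is a sketch under assumptions: buildLaunchOptions is a hypothetical helper, the option names match those already passed to chromium.launch() in this document, and the --no-sandbox argument is the one noted for browser.ts.

```typescript
// Subset of the launch options used in this document.
type LaunchOptions = {
  headless: boolean;
  executablePath?: string;
  args: string[];
};

// Use the system Chromium when PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH is set,
// otherwise leave executablePath unset so the bundled browser is used.
function buildLaunchOptions(
  env: Record<string, string | undefined>
): LaunchOptions {
  const options: LaunchOptions = { headless: true, args: ['--no-sandbox'] };
  const executablePath = env.PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH;
  if (executablePath) {
    options.executablePath = executablePath;
  }
  return options;
}
```

The result would be passed straight to chromium.launch(), so the Alpine image and a local glibc machine can share the same browser.ts.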
Docker Compose Configuration for Playwright
Research Date: 2026-02-16
Source: resolution_context.yaml, docker-compose.yml analysis
Current Configuration Analysis:
environment:
  - DISPLAY=:99 # X11 display (not needed for headless)
security_opt:
  - seccomp=unconfined # Required for Chromium sandbox
Issues:
- DISPLAY=:99 set but no X11 server (Xvfb) running
- Headless mode doesn't need DISPLAY
- docker-compose.yml has DISPLAY but it's unused
Recommendation:
- Keep DISPLAY=:99 as harmless fallback (no changes needed)
- seccomp=unconfined is necessary for Chromium sandbox (keep as-is)
- No additional configuration needed for Playwright
[Planner] Node.js Versions and npm Lockfile Compatibility - RECIPE-0003 Iteration 2 (2026-02-16)
Research Date: 2026-02-16T17:00:00.000Z
Source: Node.js Release Schedule, npm documentation (v10 & v11), Docker Hub
Problem Analysis
Docker build fails at npm ci with error: "package-lock.json and package.json are out of sync"
- Root Cause: package.json updated to Tailwind v4, but package-lock.json still contains Tailwind v3 dependencies (@csstools/*)
- Secondary Issue: npm version mismatch - local (npm 11.6.2) vs Docker (npm 10.9.4)
Node.js LTS Status Research
Source: https://github.com/nodejs/release, https://nodejs.org/en/about/previous-releases
Currently Supported Versions:
- Node.js 20 (Iron): Maintenance LTS - EOL 2026-04-30
- Node.js 22 (Jod): Maintenance LTS - EOL 2027-04-30 ← Current Dockerfile
- Node.js 24 (Krypton): Active LTS - EOL 2028-04-30 ← Best choice
- Node.js 25: Current (not LTS) - EOL 2026-06-01
LTS Phase Definitions:
- Current: Latest features, 6-month cycle for odd versions
- Active LTS: Audited features and updates (12 months for even versions since v12)
- Maintenance: Critical fixes only (18 months)
Conclusion: Node.js 24 is Active LTS (until Oct 2026) providing better support than Node.js 22 (already in Maintenance).
npm Lockfile Version Compatibility
Source: https://docs.npmjs.com/cli/v10/configuring-npm/package-lock-json, https://docs.npmjs.com/cli/v11/configuring-npm/package-lock-json
Lockfile Version History:
- lockfileVersion: 1 - npm v5-v6
- lockfileVersion: 2 - npm v7-v8 (backwards compatible with v1)
- lockfileVersion: 3 - npm v9+ (backwards compatible with v7)
npm Version Bundled with Node.js:
- node:22-alpine → npm 10.9.4 (uses lockfileVersion: 3)
- node:24-alpine → npm 11.x (uses lockfileVersion: 3)
- Local environment → npm 11.6.2 (uses lockfileVersion: 3)
Compatibility Analysis:
- Current package-lock.json has "lockfileVersion": 3 ✓
- npm 10 and npm 11 both support lockfileVersion: 3 ✓
- The issue is NOT version incompatibility but stale dependency data
npm ci Strict Behavior:
npm ci performs strict validation:
- Requires exact match between package.json and package-lock.json
- Does not update lockfile automatically (unlike npm install)
- Fails if dependencies are missing or mismatched
- This is intentional for reproducible builds in CI/CD
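A rough pre-flight check can surface a stale lockfile before the Docker build does. The sketch below only compares the ranges declared in package.json with the root entry (the "" key under packages) of a v2/v3 lockfile. It is a simplification of what npm ci validates, and findStaleRanges is a hypothetical name.

```typescript
type DepMap = Record<string, string>;
type Manifest = { dependencies?: DepMap; devDependencies?: DepMap };
type Lockfile = { lockfileVersion: number; packages?: Record<string, Manifest> };

// List direct dependencies whose declared range differs between
// package.json and the lockfile's root package entry.
function findStaleRanges(manifest: Manifest, lock: Lockfile): string[] {
  const root: Manifest = lock.packages?.[''] ?? {};
  const problems: string[] = [];
  for (const field of ['dependencies', 'devDependencies'] as const) {
    const declared: DepMap = manifest[field] ?? {};
    const locked: DepMap = root[field] ?? {};
    for (const name of Object.keys(declared)) {
      const lockedRange = name in locked ? locked[name] : 'nothing';
      if (lockedRange !== declared[name]) {
        problems.push(`${name}: package.json has ${declared[name]}, lockfile has ${lockedRange}`);
      }
    }
    for (const name of Object.keys(locked)) {
      if (!(name in declared)) {
        problems.push(`${name}: in lockfile but not in package.json`);
      }
    }
  }
  return problems;
}
```

An empty result does not guarantee that npm ci will pass, since transitive entries can also be out of date, but a non-empty one is a strong hint that the lockfile needs regenerating.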
Tailwind CSS v3 → v4 Migration Impact
Source: package.json analysis, package-lock.json inspection
Current State:
// package.json (Tailwind v4)
"@tailwindcss/vite": "^4.1.17",
"tailwindcss": "^4.1.17"
// package-lock.json (still has Tailwind v3 transitive deps)
"@csstools/css-parser-algorithms": "3.0.5",
"@csstools/css-tokenizer": "3.0.4"
Why This Happened:
- package.json was updated to Tailwind v4
- package-lock.json was NOT regenerated afterward
- Tailwind v4 has different dependency tree than v3 (no @csstools/*)
- npm ci detects mismatch and fails
Solution Options Analysis
Option A: Regenerate with Docker node:22-alpine (Review's RECOMMENDED)
docker run --rm -v "$PWD":/app -w /app node:22-alpine sh -c "rm package-lock.json && npm install"
- ✓ Ensures exact npm version match with deployment
- ✗ Stays on Maintenance LTS (Node 22)
- ✗ Doesn't align with local development (node 24)
Option B: Update to node:24-alpine
FROM node:24-alpine
rm package-lock.json && npm install
- ✓ Uses Active LTS (better support)
- ✓ Aligns Docker with local development
- ✗ Changes base image (minimal risk)
Option C: Hybrid (BEST SOLUTION)
- Update Dockerfile to node:24-alpine
- Regenerate package-lock.json locally (npm 11.x matches node:24)
- ✓ Active LTS with longer support window
- ✓ Perfect alignment between local dev and Docker
- ✓ Single lockfile regeneration
- ✓ Future-proof (Active LTS until Oct 2026)
Chosen Approach: Option C
Implementation Details
Files to Modify:
- Dockerfile - Change FROM node:22-alpine → node:24-alpine
- package-lock.json - Regenerate to sync with package.json
Verification Steps:
- npm install - Regenerate lockfile
- npm run build - Verify local build
- npm test - Verify all tests pass
- docker build - Verify Docker build succeeds
- docker compose up - Verify runtime
No Code Changes Needed:
- All application code remains unchanged
- .env.example already complete (no new variables)
- docker-compose.yml does not need changes (node version transparent)
[Planner] Research Notes - RECIPE-0004 (2026-02-16)
Task: Fix .dockerignore, favicon.ico, push notifications, e2e tests, and logging serialization
.dockerignore Research
Research Date: 2026-02-16
Source: Project analysis, .gitignore comparison, Docker best practices
Current State:
- No .dockerignore file exists in project root
- .gitignore exists and excludes: node_modules, build outputs, env files, SSL certs, symlinks, prompts/
Docker Build Context Issues:
Without .dockerignore, Docker sends entire workspace to build context including:
- `node_modules/` (if it exists locally) - causes conflicts with `npm ci` in Dockerfile
- `build/` outputs - unnecessary
- `.git/` directory - large, unused in container
- `prompts/` directory - development artifacts
- `.env` files - should use environment variables instead
Recommended .dockerignore Content:
Based on .gitignore and Docker best practices:
node_modules
.git
build
.output
.vercel
.netlify
.wrangler
.svelte-kit
.DS_Store
Thumbs.db
.env
.env.*
!.env.example
.ssl/
vite.config.*.timestamp-*
debug_page.txt
prompts/
*.md
!README.md
.github/
.vscode/
*.log
coverage/
.vitest/
Rationale:
- Exclude development dependencies and build artifacts
- Keep README.md for documentation
- Exclude version control metadata
- Reduce build context size significantly
- Prevent conflicts with Dockerfile's npm ci
Favicon 404 Error Research
Research Date: 2026-02-16
Source: Static folder analysis, browser behavior, PWA specifications
Files Present:
- `static/favicon.png` (192x192 PNG) ✓ exists
- `static/icon-512.png` (512x512 PNG) ✓ exists
- `static/icon-source.png` (source file) ✓ exists
- `static/manifest.json` references both PNG files ✓
404 Source:
- Browsers automatically request `/favicon.ico` (legacy format)
- SvelteKit serves from `static/` folder
- No `favicon.ico` file exists → 404 error
Solution Options:
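Option A - Create favicon.ico (Recommended): Use Sharp to generate a 32x32 favicon from the PNG source. Sharp has no ICO encoder, so the script writes PNG data under the .ico name, which modern browsers accept:
// New script: scripts/gen-favicon-ico.js
import sharp from 'sharp';
// Writes PNG-encoded data to favicon.ico (not a true ICO container)
await sharp('static/icon-source.png').resize(32, 32).png().toFile('static/favicon.ico');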
Option B - SvelteKit Hook Redirect: Add server hook to redirect /favicon.ico → /favicon.png
- More complex
- Adds runtime overhead
- Not recommended
Chosen Approach: Option A (generate favicon.ico during build)
Push Notifications Implementation Research
Research Date: 2026-02-16
Source: PushNotificationService.ts, web-push library docs, Web Push Protocol RFC 8030
Current Implementation Analysis:
Client-Side (Complete):
- `PushNotificationManager.ts` - Full implementation ✓
  - Permission request ✓
  - VAPID key fetch ✓
  - pushManager.subscribe() ✓
  - Server subscription registration ✓
- `service-worker.ts` - Push event handler ✓
- `NotificationSettings.svelte` - UI toggle ✓
Server-Side (Mock Only):
// Current PushNotificationService.ts line 106-125
private async sendToSubscription(subscription: PushSubscription, data: any): Promise<void> {
// In production, use web-push library:
// [COMMENTED OUT CODE]
// For development, we'll log the notification
console.log(`[PushService] Would send push notification:`, {
endpoint: subscription.endpoint,
data: data
});
await new Promise(resolve => setTimeout(resolve, 100)); // Simulate
}
Problem: Push notifications are logged but never actually sent to browser.
Web Push Library Integration:
1. Install Dependency:
// package.json
{
"dependencies": {
"web-push": "^3.6.7"
}
}
2. Implementation Pattern:
import webpush from 'web-push';
// On init
webpush.setVapidDetails('mailto:your-email@example.com', vapidPublicKey, vapidPrivateKey);
// In sendToSubscription
await webpush.sendNotification(subscription, JSON.stringify(payload), {
TTL: 60 * 60 * 24 // 24 hours
});
3. Configuration Requirements:
- VAPID keys already configured in `queueConfig.push`
- Default keys present (should regenerate for production)
- Email contact required by spec (add env var)
Files to Modify:
- `package.json` - add web-push dependency
- `src/lib/server/notifications/PushNotificationService.ts` - implement actual sending
- `src/lib/server/queue/config.ts` - add VAPID_EMAIL env var
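The email contact noted above ends up as the `mailto:` subject passed as the first argument to `setVapidDetails`. A minimal sketch of deriving it, assuming the address arrives in a `VAPID_EMAIL` environment variable; the helper name `resolveVapidSubject` and the fallback address are hypothetical and not part of the existing config:

```typescript
// Hypothetical helper: builds the VAPID subject from an env-style record.
// Assumes VAPID_EMAIL holds either a bare address or an already-prefixed mailto: URI.
function resolveVapidSubject(
  env: Record<string, string | undefined>,
  fallback = 'mailto:admin@example.com' // placeholder, not a real contact
): string {
  const raw = env.VAPID_EMAIL?.trim();
  if (!raw) return fallback;
  // Avoid producing "mailto:mailto:..." when the prefix is already present
  return raw.startsWith('mailto:') ? raw : `mailto:${raw}`;
}
```

In the service this value would replace the hard-coded `'mailto:your-email@example.com'` string, for example `resolveVapidSubject(process.env)`.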
Manual Push Notification Test Button Research
Research Date: 2026-02-16
Source: NotificationSettings.svelte, PushNotificationService API
Current UI:
- Only has enable/disable toggle
- No manual trigger for testing different notification types
Test Button Requirements:
- Trigger different notification types:
- Success notification (recipe completed)
- Error notification (parsing failed)
- Progress notification (extraction in progress)
- Send to own subscription only
- Debug output showing notification payload
Implementation Approach:
Frontend Component:
Add to NotificationSettings.svelte:
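<!-- Wrap each call in an arrow function; onclick={testNotification('success')} would invoke it during render -->
<button onclick={() => testNotification('success')}>Test Success</button>
<button onclick={() => testNotification('error')}>Test Error</button>
<button onclick={() => testNotification('progress')}>Test Progress</button>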
async function testNotification(type: 'success' | 'error' | 'progress') {
await fetch('/api/notifications/test', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ type })
});
}
Backend Endpoint:
New file: src/routes/api/notifications/test/+server.ts
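import { json } from '@sveltejs/kit';
import type { RequestHandler } from './$types';
// pushNotificationService must also be imported; its export name and path are not shown here
export const POST: RequestHandler = async ({ request }) => {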
const { type } = await request.json();
const payload = {
success: {
/* ... */
},
error: {
/* ... */
},
progress: {
/* ... */
}
}[type];
await pushNotificationService.sendNotification(payload);
return json({ success: true });
};
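Because `type` comes straight from the request body, the lookup above yields `undefined` for any unexpected value. A minimal sketch of a guard that could run before the lookup; `NOTIFICATION_TYPES`, `isNotificationType`, and `parseTestRequest` are hypothetical names, not part of the existing code:

```typescript
// The three test notification types listed in the requirements above
const NOTIFICATION_TYPES = ['success', 'error', 'progress'] as const;
type NotificationType = (typeof NOTIFICATION_TYPES)[number];

// Type guard: narrows an unknown value to one of the allowed literals
function isNotificationType(value: unknown): value is NotificationType {
  return typeof value === 'string' && (NOTIFICATION_TYPES as readonly string[]).includes(value);
}

// Returns the validated type, or null when the body is malformed
function parseTestRequest(body: unknown): NotificationType | null {
  if (typeof body !== 'object' || body === null) return null;
  const type = (body as { type?: unknown }).type;
  return isNotificationType(type) ? type : null;
}
```

The endpoint could then answer with a 400 status when `parseTestRequest` returns `null` instead of sending an undefined payload.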
Playwright E2E Push Notification Testing Research
Research Date: 2026-02-16
Source: Playwright API docs (BrowserContext.grantPermissions), existing test patterns
Playwright Push Notification Testing Pattern:
Key Methods:
- `context.grantPermissions(['notifications'])` - Grant permission without prompt
- `page.evaluate()` - Access PushManager in browser context
- `page.waitForEvent()` - Wait for service worker events
Test Structure:
// New file: src/tests/push-notifications.e2e.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Push Notifications E2E', () => {
test('should subscribe to push notifications', async ({ browser }) => {
const context = await browser.newContext();
await context.grantPermissions(['notifications']);
const page = await context.newPage();
await page.goto('http://localhost:5173');
// Click notification toggle
await page.getByRole('button', { name: /enable notifications/i }).click();
// Verify subscription created
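// PushSubscription is not a plain object, so return its JSON form across the evaluate() boundary
const subscription = await page.evaluate(async () => {
const reg = await navigator.serviceWorker.ready;
const sub = await reg.pushManager.getSubscription();
return sub ? sub.toJSON() : null;
});
expect(subscription).toBeTruthy();
expect(subscription?.endpoint).toBeDefined();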
await context.close();
});
});
Test Coverage:
- Permission grant flow
- Subscription creation via PushManager
- Server registration (POST /api/notifications/subscribe)
- Manual test notification trigger
- Subscription persistence in localStorage
- Unsubscribe flow
Vitest Configuration: Current project uses Vitest with @vitest/browser-playwright:
- Already configured for browser tests
- Playwright already installed (playwright@^1.56.1)
- Pattern: `*.e2e.spec.ts` for e2e tests vs `*.spec.ts` for unit tests
Logging Serialization Research
Research Date: 2026-02-16
Source: Codebase grep analysis, Node.js console behavior, error object structure
Problem Analysis:
Root Cause:
Objects end up in logs as [object Object] when they are coerced to strings (string concatenation or template literals) rather than serialized:
// Current pattern (WRONG)
console.error('[Label] ' + error); // Output for a plain-object error: [Label] [object Object]
console.log(`[Label] ${data}`); // Output: [Label] [object Object]
Affected Files (25 matches found):
- `src/lib/server/extraction.ts` - 12 occurrences
- `src/lib/server/parser.ts` - 4 occurrences
- `src/lib/server/queue/QueueProcessor.ts` - 3 occurrences
- `src/lib/server/notifications/PushNotificationService.ts` - 1 occurrence
- `src/lib/server/api/errorHandler.ts` - 1 occurrence
- `src/lib/server/llm.ts` - 2 occurrences
- `src/lib/server/scheduler.ts` - 1 occurrence
- Others: QueueManager.ts, tandoor.ts
Solution Patterns:
1. Error Objects:
// GOOD - Extract relevant properties
console.error('[Label]', error.message, error.stack);
console.error('[Label] Error:', {
message: error.message,
stack: error.stack,
name: error.name
});
2. Complex Objects:
// GOOD - JSON.stringify with formatting
console.log('[Label] Data:', JSON.stringify(data, null, 2));
// GOOD - Specific properties
console.log('[Label] Response:', {
status: response.status,
statusText: response.statusText,
body: responseBody
});
3. Utility Function:
Create src/lib/server/utils/logger.ts:
export function serializeError(error: unknown): string {
if (error instanceof Error) {
return JSON.stringify(
{
name: error.name,
message: error.message,
stack: error.stack,
...error
},
null,
2
);
}
return JSON.stringify(error, null, 2);
}
console.error('[Label]', serializeError(error));
Testing Impact:
- Logs are visible in Docker deployments (stdout/stderr)
- JSON format easier for log aggregation tools
- Stack traces preserved for debugging
- Human-readable in console
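One detail behind the utility function: an Error's `message` and `stack` are non-enumerable and `name` lives on the prototype, so passing an Error straight to `JSON.stringify` drops all three. A small sketch of the difference; `describeError` is a hypothetical helper used only for illustration, not the project's logger:

```typescript
// Hypothetical helper: copies the fields JSON.stringify would otherwise skip
function describeError(error: unknown): Record<string, unknown> {
  if (error instanceof Error) {
    return { name: error.name, message: error.message, stack: error.stack };
  }
  return { value: error };
}

const failure = new Error('boom');

// Naive serialization loses everything useful
const naive = JSON.stringify(failure);

// String coercion of a plain object is where "[object Object]" comes from
const coerced = String({ code: 42 });

// Explicit extraction keeps name, message, and stack
const useful = JSON.stringify(describeError(failure));
```

Here `naive` is `'{}'` and `coerced` is `'[object Object]'`, while `useful` carries the message and stack trace.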
[Planner] Research Notes - RECIPE-0004 Iteration 1 (2026-02-17)
Task: Fix TypeScript type error - NodeJS.Timer should be NodeJS.Timeout in scheduler.ts
Node.js Timer Types Research
Research Date: 2026-02-17
Source: Node.js v25.6.1 Official Documentation (https://nodejs.org/docs/latest/api/timers.html)
Problem Analysis:
TypeScript compile error in src/lib/server/scheduler.ts:180:
Argument of type 'Timer' is not assignable to parameter of type 'Timeout'
Type 'Timer' is missing the following properties from type 'Timeout':
close, _onTimeout, [Symbol.dispose]
Root Cause:
The SchedulerState interface incorrectly uses NodeJS.Timer type for intervalId, but setInterval() returns NodeJS.Timeout and clearInterval() expects NodeJS.Timeout parameter.
Official Node.js API Documentation:
Class: Timeout
- Returned by `setInterval()` and `setTimeout()`
- Can be passed to `clearInterval()` or `clearTimeout()`
- Has methods: `ref()`, `unref()`, `hasRef()`, `close()`, `refresh()`, `[Symbol.toPrimitive]()`, `[Symbol.dispose]()`
- TypeScript type: `NodeJS.Timeout`
API Signatures:
// setInterval returns Timeout
function setInterval(callback: Function, delay?: number, ...args: any[]): NodeJS.Timeout;
// clearInterval expects Timeout
function clearInterval(timeout: NodeJS.Timeout | string | number): void;
NodeJS.Timer Type:
- Deprecated/incorrect type for timer return values
- Missing required properties: `close`, `_onTimeout`, `[Symbol.dispose]`
- Should NOT be used for `setInterval()`/`setTimeout()` return types
- Causes TypeScript strict mode errors when passed to `clearInterval()`
Codebase Analysis:
grep -r "NodeJS.Timer" src/
src/lib/server/scheduler.ts:13 intervalId: NodeJS.Timer | null;
src/tests/fixtures.ts:151 let timers: NodeJS.Timer[] = [];
grep -r "NodeJS.Timeout" src/
src/routes/api/queue/stream/+server.ts:54 let keepAliveInterval: NodeJS.Timeout | null = null;
Findings:
- Incorrect usage (2 occurrences):
  - `src/lib/server/scheduler.ts:13` - SchedulerState interface
  - `src/tests/fixtures.ts:151` - Timer array in test helper
- Correct usage (1 occurrence):
  - `src/routes/api/queue/stream/+server.ts:54` - keepAliveInterval type
Solution:
Change all NodeJS.Timer to NodeJS.Timeout to align with Node.js official API contracts and TypeScript type definitions.
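A minimal sketch of the corrected typing. The `SchedulerState` shape is reduced to the one field discussed here, and `startScheduler`, `stopScheduler`, and `isRunning` are hypothetical helpers. `ReturnType<typeof setInterval>` is used as an equivalent spelling that resolves to `NodeJS.Timeout` under the Node type definitions:

```typescript
interface SchedulerState {
  // Same type that setInterval() returns and clearInterval() accepts
  intervalId: ReturnType<typeof setInterval> | null;
}

function isRunning(state: SchedulerState): boolean {
  return state.intervalId !== null;
}

function startScheduler(state: SchedulerState, tick: () => void, everyMs: number): void {
  if (state.intervalId !== null) return; // already running, keep the existing timer
  state.intervalId = setInterval(tick, everyMs);
}

function stopScheduler(state: SchedulerState): void {
  if (state.intervalId === null) return;
  clearInterval(state.intervalId); // type-checks because the field is a Timeout
  state.intervalId = null;
}
```

Deriving the type from `setInterval` itself keeps the field in step with whatever the installed type definitions declare.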
Files to Modify:
- `src/lib/server/scheduler.ts:13` - Type in SchedulerState interface
- `src/tests/fixtures.ts:151` - Type in createTimerSpy helper
Impact:
- Type-only change, no runtime behavior modification
- Fixes TypeScript strict mode compile error
- Aligns codebase with Node.js standard types
- Existing tests (260 total) already provide 100% coverage
References:
- Node.js Timers Documentation: https://nodejs.org/docs/latest/api/timers.html#class-timeout
- TypeScript @types/node package: Official Node.js type definitions
- Related Error: RECIPE-0004 iteration 0 review_report.yaml
Document Version: 1.7
Last Updated by: Planner Agent (RECIPE-0005 Iteration 0)
Next Update: Developer Agent
[Planner] Research Notes - RECIPE-0005 (2026-02-17)
Task: Fix Playwright Docker dependencies and create LMStudio integration for E2E testing
Playwright Alpine Linux Docker Integration - RECIPE-0005
Research Date: 2026-02-17
Source: FINDINGS.md (RECIPE-0003), Dockerfile analysis, browser.ts, Playwright documentation
Problem Analysis:
- Container fails with: "Executable doesn't exist at /root/.cache/ms-playwright/chromium_headless_shell-1208/"
- Alpine Linux uses musl libc, Playwright's bundled browsers require glibc
- Current Dockerfile installs system chromium via `apk add chromium` but browser.ts doesn't specify executable path
- Playwright API defaults to searching for its own bundled browser binary (not present)
Solution (Already Researched in RECIPE-0003): Configure Playwright to use system chromium installed by Alpine APK:
// src/lib/server/browser.ts - initializeBrowser()
browser = await chromium.launch({
executablePath: '/usr/bin/chromium-browser', // System chromium path
headless: true,
args: [
'--disable-blink-features=AutomationControlled',
'--disable-dev-shm-usage',
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-gpu'
]
});
Files to Modify:
- `src/lib/server/browser.ts` - Add `executablePath: '/usr/bin/chromium-browser'` to launch options
No Changes Needed:
- Dockerfile already has `chromium` and fonts installed correctly
- No need for `npx playwright install` (would fail on Alpine anyway)
LMStudio Docker Networking - RECIPE-0005
Research Date: 2026-02-17
Source: Docker networking documentation, LMStudio API patterns, OpenAI-compatible endpoints
Problem:
- LMStudio runs on host at `http://localhost:1234`
- Docker containers have isolated networking - `localhost` inside container != host `localhost`
- Container needs to access host services
Docker Networking Solutions:
Option A - network_mode: host (Recommended for LMStudio):
services:
app:
network_mode: host
- Container shares host network stack
- `localhost:1234` inside container = host's `localhost:1234`
- Trade-off: Loses container network isolation, port mapping ignored
- Best for: Local development/testing with host services
Option B - extra_hosts (Alternative):
services:
app:
extra_hosts:
- 'host.docker.internal:host-gateway'
environment:
- OPENAI_BASE_URL=http://host.docker.internal:1234/v1
- Works on Docker Desktop (Mac/Windows) and Linux with Docker 20.10+
- Maintains container network isolation
- Trade-off: Requires changing OPENAI_BASE_URL from localhost
Chosen Approach: network_mode: host
- Rationale: Simplest for local LMStudio integration, no URL changes needed
- Tool mandate specifies "http://localhost:1234" must work
- Matches requirement for local development/testing setup
LMStudio + Gemma 3 Configuration - RECIPE-0005
Research Date: 2026-02-17
Source: .env.example, llm.ts, prompt.yaml tool mandates
Current Configuration:
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_API_KEY=your-api-key-here
LLM_MODEL=google/gemma-3-4b
LMStudio API Compatibility:
- LMStudio provides OpenAI-compatible endpoint at `/v1`
- Uses same API client: openai@^4.20.0
- Model identifiers match LMStudio's loaded model names
- API key can be any non-empty value (LMStudio doesn't validate in local mode)
Model Availability Check:
From prior research (RECIPE-0001), llm.ts already implements:
- `checkModelAvailability(model: string)` - verifies model loaded via `client.models.list()`
- Returns available models if specified model not found
- User must manually load model in LMStudio UI before running container
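The availability check described above could look roughly like the sketch below. It is not the code in llm.ts: the function takes the already-fetched model ids as a plain array (standing in for the result of `client.models.list()`), and the `ModelCheck` result shape and the name `checkLoadedModel` are assumptions:

```typescript
// Assumed result shape: either the model is loaded, or we report what is
type ModelCheck =
  | { available: true }
  | { available: false; availableModels: string[] };

// Hypothetical pure helper: compares the configured model against loaded ids
function checkLoadedModel(model: string, loadedModelIds: readonly string[]): ModelCheck {
  if (loadedModelIds.includes(model)) {
    return { available: true };
  }
  // Hand back the loaded ids so the caller can show them in an error message
  return { available: false, availableModels: [...loadedModelIds] };
}
```

For example, `checkLoadedModel('google/gemma-3-4b', [])` reports the model as unavailable with an empty list, which corresponds to LMStudio running with nothing loaded.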
No Code Changes Needed:
- LLM integration already OpenAI-compatible
- Model check already implemented
- Only need environment variable configuration
Docker Compose Complete Configuration - RECIPE-0005
Research Date: 2026-02-17
Source: docker-compose.yml, .env.example, queueConfig, tandoorConfig
Required Changes:
- Add `network_mode: host` for LMStudio access
- Update LLM_MODEL default to `google/gemma-3-4b`
- Update .env.example defaults to match tool mandates
Current docker-compose.yml:
- Already has all environment variables configured
- Already has `./secrets:/app/secrets` volume mount
- Already has healthcheck configured
- Already has `seccomp=unconfined` for Chromium
Port Mapping with network_mode: host:
- `ports:` section ignored when using `network_mode: host`
- App will bind directly to host port 3000
- No conflicts expected (LMStudio uses 1234, app uses 3000, Tandoor external)
End-to-End Testing Strategy - RECIPE-0005
Research Date: 2026-02-17
Source: Test URL from prompt, queue system architecture
Test URL: https://www.instagram.com/reel/DP6oN7JCEo8/?utm_source=ig_web_button_share_sheet
Testing Workflow:
- Build Docker image: `docker-compose build`
- Start container: `docker-compose up`
- Verify LMStudio loaded Gemma 3 model: `http://localhost:1234/v1/models`
- Verify app health: `http://localhost:3000/api/health`
- Verify LLM health: `http://localhost:3000/api/llm-health`
- Enqueue test URL: `POST http://localhost:3000/api/queue`
- Monitor progress: `GET http://localhost:3000/api/queue/stream`
- Verify extraction succeeds with Gemma 3
- Check Tandoor upload (if configured)
Success Indicators:
- Chromium launches without "Executable doesn't exist" error
- LLM health check passes
- Extraction phase completes successfully
- Recipe parsing succeeds with Gemma 3
- All existing tests pass (`npm test`)
Files Summary - RECIPE-0005
Modified Files:
- `src/lib/server/browser.ts` - Add executablePath for Alpine chromium
- `docker-compose.yml` - Add network_mode: host, update LLM_MODEL default
- `.env.example` - Update LLM_MODEL default to google/gemma-3-4b
No Changes:
- `Dockerfile` - Already correct (chromium + fonts installed)
- `src/lib/server/llm.ts` - Already OpenAI-compatible
- `src/lib/server/queue/config.ts` - Already reads env vars correctly
- Test files - All existing tests should pass
Testing:
- Manual E2E test with provided Instagram URL
- Verify in Docker container with LMStudio
- All unit tests must pass
Dependencies:
- User must have LMStudio running on host at localhost:1234
- User must manually load google/gemma-3-4b model in LMStudio
- Secrets volume must exist for Instagram auth (optional)
[Planner] Research Notes - RECIPE-0006 Iteration 1 (2026-02-17)
Task: Transform E2E test to unit test with mocked fixtures and fix extraction logic iteratively
Problem Analysis
Research Date: 2026-02-17T10:00:00.000Z
Source: review_report.yaml, extraction.ts analysis, test fixtures
Iteration 0 Failure:
- E2E test created but never executed during development
- User manually ran test and it FAILED
- Current output: "16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe..."
- Expected output: Full recipe starting with "La cacio e pepe infallibile di Luciano Monosilio 🍝"
Root Cause Analysis:
- DOM selectors failing: Lines 331-341 of extraction.ts try selectors but none match Instagram's current structure
- Fallback to og:description: Lines 348-357 extract from `<meta property="og:description">`, which contains a metadata prefix
- Regex cleanup insufficient: Line 356 tries to clean the metadata with the regex `^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+` but does not remove the prefix cleanly
Current extractFromDOM() Flow:
1. Try selectors: article h1, article span[dir="auto"], article div[role="button"] + span, article span:not([aria-label])
→ All fail (return null or < 100 chars)
2. Fallback to og:description meta tag
→ Returns: "16K likes, 325 comments - username on date: caption..."
3. Apply metadata cleanup regex
→ Regex doesn't match properly (or matches but leaves quotes)
4. Pass to cleanText()
→ cleanText() removes hashtags but metadata prefix remains
Vitest Unit Testing for Playwright Mocking
Research Date: 2026-02-17T10:00:00.000Z
Source: TESTING.md, existing tests (queue-processor.spec.ts, scheduler.spec.ts)
Mocking Strategy: From TESTING.md and existing test patterns, Vitest provides module-level mocking:
// Mock entire module BEFORE imports
vi.mock('$lib/server/extraction', () => ({
extractTextAndThumbnail: vi.fn().mockResolvedValue({
bodyText: 'Mocked text',
thumbnail: 'https://example.com/thumb.jpg'
})
}));
For Unit Testing extractFromDOM():
- Cannot mock the entire `extraction.ts` module (we're testing functions inside it)
- Need to test internal functions directly (extractFromDOM, cleanText are not exported)
- Options:
  - Export functions for testing (add `export` to extractFromDOM and cleanText)
  - Mock Playwright Page.evaluate() (mock the browser automation layer)
  - Integration test with mocked browser context
Chosen Approach: Export Internal Functions
- Cleanest separation of concerns
- Allows direct unit testing without browser overhead
- Follows existing pattern (extractTextAndThumbnail is already exported)
- Test Runtime: < 10ms (vs 30s for E2E test)
Test Structure:
// Unit test with fixtures
import { describe, it, expect, vi } from 'vitest';
import { extractFromDOM, cleanText } from '$lib/server/extraction';
describe('Instagram Caption Extraction Unit Tests', () => {
it('should clean metadata prefix from og:description', async () => {
const input =
'16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe...';
const expected = 'La cacio e pepe infallibile di Luciano Monosilio...';
// Create mock page that returns problematic og:description
const mockPage = {
evaluate: vi.fn().mockResolvedValue(input)
};
const result = await extractFromDOM(mockPage as any);
expect(result.bodyText).toBe(expected);
});
});
Metadata Prefix Regex Analysis
Research Date: 2026-02-17T10:00:00.000Z
Source: extraction.ts line 356, test fixtures
Current Regex (Line 356):
const cleanedContent = content.replace(
/^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+/,
''
);
Test Against Actual Input:
Input: '16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe...'
Pattern: '^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+'
^----- Should match "16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "
Issue: Pattern matches but leaves opening quote " after the colon.
Problems Identified:
- Pattern doesn't account for quotes after colon
- Date pattern `[^:]+` is too greedy (matches "October 17, 2025")
- Pattern assumes single space after colon, but actual format may have `: "` (colon-space-quote)
Improved Regex:
// Match: "X likes, Y comments - username on date: " (with optional quote)
/^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s*["']?/;
Breakdown:
- `^\d+K?` - Matches "16K" or "16" (K is optional)
- `\s+likes,\s+\d+\s+comments` - Matches " likes, 325 comments"
- `\s+-\s+[\w.]+` - Matches " - chef.antonio.la.cava" (letters, digits, underscores, dots)
- `\s+on\s+[^:]+:` - Matches " on October 17, 2025:" (anything before colon)
- `\s*` - Optional whitespace after colon
- `["']?` - Optional quote character (single or double)
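The improved pattern can be exercised against the sample og:description string from the failing test. `METADATA_PREFIX` and `stripMetadataPrefix` are hypothetical names for this sketch; in extraction.ts the replacement happens inline in extractFromDOM():

```typescript
// Improved pattern: metadata prefix plus optional whitespace and one opening quote
const METADATA_PREFIX =
  /^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s*["']?/;

// Hypothetical wrapper: removes the prefix when present, otherwise returns the input as-is
function stripMetadataPrefix(ogDescription: string): string {
  return ogDescription.replace(METADATA_PREFIX, '');
}

const sampleDescription =
  '16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe...';
const strippedCaption = stripMetadataPrefix(sampleDescription);
```

With this input `strippedCaption` is `La cacio e pepe...`, with no leading quote. Any closing quote at the end of a full caption is not touched by this pattern and would need separate handling.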
This should properly strip:
"16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "→ (empty)
Files to Modify - RECIPE-0006 Iteration 1
Primary Changes:
- src/lib/server/extraction.ts
  - Export `extractFromDOM` for unit testing
  - Export `cleanText` for unit testing
  - Fix metadata prefix regex in extractFromDOM() (line 356)
- src/tests/instagram-caption-extraction.unit.spec.ts (NEW)
  - Replace E2E test with unit test
  - Mock page.evaluate() to return test fixtures
  - Test both problematic and expected outputs
  - Runtime < 100ms
- src/tests/instagram-caption-extraction.e2e.spec.ts (MODIFY)
  - Mark as `.skip` or remove (replaced by unit test)
  - Keep file for future real-world validation (optional)
Dependencies:
- Vitest mocking (vi.fn(), mockResolvedValue)
- Test fixtures from context_compact.yaml
- No external libraries needed
Parallelization:
- All changes are independent
- Unit test can be written in parallel with extraction.ts fix
- Test validates fix iteratively
Document Version: 1.9
Last Updated by: Planner Agent (RECIPE-0008 Iteration 0)
Next Update: Developer Agent
[Planner] Research Notes - RECIPE-0008 (2026-02-17)
Task: Resolve npm package vulnerabilities and fix TypeScript strict mode errors
TypeScript Strict Mode Status Analysis
Research Date: 2026-02-17T22:15:00.000Z
Source: tsconfig.json, get_errors output, extraction.ts analysis
Current Configuration:
// tsconfig.json line 11
"strict": true
Status: ✅ TypeScript strict mode is ALREADY ENABLED
The task description says "Enable TypeScript strict mode (if not already enabled)" - it is already enabled. The real issue is fixing the compilation errors that exist.
Current TypeScript Errors: 7 errors in src/lib/server/extraction.ts
Error 1-5: bestCandidate Type Narrowing (Lines 632, 636, 641, 643)
Property 'score' does not exist on type 'never'.
Property 'text' does not exist on type 'never'.
Property 'innerHTML' does not exist on type 'never'.
Root Cause Analysis:
// Line 552-558: Type definition
let bestCandidate: {
element: Element;
text: string;
score: number;
innerHTML: string;
brCount: number;
} | null = null;
// Line 624-630: Null guard
if (!bestCandidate) {
return {
success: false,
error: 'No suitable caption span found',
text: ''
};
}
// Line 632: TypeScript cannot infer bestCandidate is non-null after guard
console.log(`[Extractor] Final caption candidate: score=${bestCandidate.score}, ...`);
// Error: Property 'score' does not exist on type 'never'
Why TypeScript Infers 'never':
- The `= null` initializer narrows `bestCandidate` to type `null` at the declaration
- Control flow analysis does not track assignments made inside callbacks or nested functions, so if the candidate is assigned there (the likely case here), the variable is still seen as `null` when the guard runs
- After `if (!bestCandidate) return`, the `null` case is removed and nothing is left, so the remaining type is `never`
Previous Attempt (RECIPE-0007 Iteration 1): Attempted fix using type assertion:
const candidate = bestCandidate as NonNullable<typeof bestCandidate>;
Result: FAILED - TypeScript still inferred 'candidate' as type 'never'
Correct Solution: Extract the inline type to a named type and use explicit type assertion after the guard:
// Define type at module level
type CaptionCandidate = {
element: Element;
text: string;
score: number;
innerHTML: string;
brCount: number;
};
// In function
let bestCandidate: CaptionCandidate | null = null;
// After null guard
if (!bestCandidate) {
return { success: false, error: 'No suitable caption span found', text: '' };
}
// Explicit assertion (TypeScript now knows it's safe)
const candidate: CaptionCandidate = bestCandidate;
// Use 'candidate' instead of 'bestCandidate' for remaining code
Alternative Solution (simpler): Use non-null assertion operator since we know it's safe after the guard:
console.log(`[Extractor] Final caption candidate: score=${bestCandidate!.score}, ...`);
Recommended: Use explicit typing to avoid ! operator proliferation (better code clarity).
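One pattern that produces this `never` is a variable assigned only inside a callback. The sketch below assumes that is what happens in extraction.ts (the excerpt does not show the loop); `pickBestCaption` and its length-based scoring are hypothetical stand-ins, and the candidate type is cut down to two fields:

```typescript
type CaptionCandidate = { text: string; score: number };

function pickBestCaption(texts: string[]): string {
  let best: CaptionCandidate | null = null;

  // The assignment happens inside a callback, which flow analysis does not follow
  texts.forEach((text) => {
    const score = text.length; // stand-in scoring, not the real heuristic
    if (best === null || score > best.score) {
      best = { text, score };
    }
  });

  if (!best) return '';

  // The compiler now sees `best` as `never`; since `never` is assignable to
  // any type, re-declaring it with an explicit annotation restores the fields.
  const candidate: CaptionCandidate = best;
  return candidate.text;
}
```

Calling `pickBestCaption(['a', 'abc', 'ab'])` returns the longest entry, and an empty array takes the early return.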
Error 6: extractCaptionFromGraphQL Parameter Type Mismatch (Line 1224)
Argument of type 'string | null' is not assignable to parameter of type 'string | undefined'.
Type 'null' is not assignable to type 'string | undefined'.
Context:
// Line 1209: extractShortcode returns string | null
const expectedShortcode = extractShortcode(url);
// Line 1224: Pass to function expecting string | undefined
const captionData = extractCaptionFromGraphQL(json, expectedShortcode);
// Line 1084: Function signature
function extractCaptionFromGraphQL(data: any, expectedShortcode?: string): string | null;
Solution:
Convert null to undefined using nullish coalescing:
const captionData = extractCaptionFromGraphQL(json, expectedShortcode ?? undefined);
Why null vs undefined Matters:
- Optional parameters in TypeScript are `T | undefined`, not `T | null`
- Function signature uses `expectedShortcode?: string` which expands to `expectedShortcode: string | undefined`
- `extractShortcode()` returns `string | null`, creating a type mismatch
- Converting `null → undefined` aligns with TypeScript's optional parameter convention
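A self-contained sketch of the same conversion. Both functions are simplified stand-ins that only reproduce the relevant signatures; the shortcode regex and `describeExpected` are assumptions for illustration, not the code in extraction.ts:

```typescript
// Stand-in with the same return type as the real extractShortcode: string | null
function extractShortcode(url: string): string | null {
  const match = url.match(/\/(?:p|reel)\/([\w-]+)/);
  return match?.[1] ?? null;
}

// Stand-in for a function with an optional (string | undefined) parameter
function describeExpected(expectedShortcode?: string): string {
  return expectedShortcode === undefined ? 'any post' : `post ${expectedShortcode}`;
}

// `?? undefined` maps null onto the optional-parameter convention
const fromReel = describeExpected(
  extractShortcode('https://www.instagram.com/reel/DP6oN7JCEo8/') ?? undefined
);
const fromProfile = describeExpected(
  extractShortcode('https://www.instagram.com/') ?? undefined
);
```

Without the `?? undefined`, both calls fail to compile under strict mode with the error quoted above.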
Error 7: Invalid ExtractionMethod Literal 'graphql-intercept' (Line 1273)
Type '"graphql-intercept"' is not assignable to type 'ExtractionMethod | undefined'.
Context:
// Line 12: ExtractionMethod union type
export type ExtractionMethod =
| 'embedded-json'
| 'internal-state'
| 'html-section'
| 'dom-selector'
| 'graphql-api'
| 'legacy';
// Line 1273: Uses undeclared literal
onProgress?.({
type: 'complete',
message: 'Extraction completed via GraphQL interception',
method: 'graphql-intercept', // ❌ Not in union type
timestamp: new Date().toISOString()
});
Solution:
Add 'graphql-intercept' to ExtractionMethod union and getMethodDisplayName mapping:
// Line 12: Add to union
export type ExtractionMethod =
| 'embedded-json'
| 'internal-state'
| 'html-section'
| 'dom-selector'
| 'graphql-api'
| 'graphql-intercept'
| 'legacy';
// Line 117-125: Add to display name mapping
function getMethodDisplayName(method: ExtractionMethod): string {
const names: Record<ExtractionMethod, string> = {
'embedded-json': 'Embedded JSON',
'internal-state': 'Internal State',
'html-section': 'HTML Section',
'dom-selector': 'DOM Selector',
'graphql-api': 'GraphQL API',
'graphql-intercept': 'GraphQL Intercept', // Add this line
legacy: 'Legacy Parser'
};
return names[method];
}
Why This Method Exists:
- Line 1217-1233: Sets up GraphQL response interception
- Line 1268-1276: Uses intercepted caption if available
- This is a legitimate extraction strategy separate from 'graphql-api'
- Should be properly typed in the union
npm Package Vulnerabilities Analysis
Research Date: 2026-02-17T22:15:00.000Z
Source: package.json dependencies analysis
Current Dependencies:
Production (10 dependencies):
- `@types/uuid@^10.0.0` - Type definitions (no vulnerabilities expected)
- `date-fns@^4.1.0` - Date utilities (latest major version)
- `openai@^4.20.0` - OpenAI SDK (recent version)
- `playwright@^1.56.1` - Browser automation (recent version)
- `playwright-extra@^4.3.6` - Playwright extensions
- `puppeteer-extra-plugin-stealth@^2.11.2` - Stealth plugin
- `sharp@^0.34.5` - Image processing (latest)
- `uuid@^13.0.0` - UUID generation (latest major)
- `web-push@^3.6.7` - Push notifications (latest)
- `zod@^3.23.0` - Schema validation (latest)
Development (24+ dependencies):
- All framework and tooling dependencies are recent versions
- SvelteKit 2.x, Svelte 5.x, Vite 6.x, Vitest 4.x - all latest major versions
- TypeScript 5.9.3, ESLint 9.x, Prettier 3.x - all current
Vulnerability Research Strategy:
- Run `npm audit` to identify current vulnerabilities
- Analyze severity levels (critical, high, moderate, low)
- Check for automated fixes: `npm audit fix`
- For breaking changes: `npm audit fix --force` (requires testing)
- Manual updates for unfixable vulnerabilities
- Verify all tests pass after fixes
Expected Vulnerabilities: Based on dependency age analysis:
- `playwright-extra@^4.3.6` - Last updated 2024, may have known issues
- `puppeteer-extra-plugin-stealth@^2.11.2` - Depends on older puppeteer versions
- Most other dependencies are recent and actively maintained
No Direct Audit Results Available:
- Cannot run `npm audit` during planning phase (tool restrictions)
- Developer agent must run audit as first step
- Plan assumes vulnerabilities exist and need fixing
Verification Steps:
1. `npm audit` - Identify vulnerabilities
2. `npm audit fix` - Apply automatic fixes
3. `npm test` - Verify tests pass
4. `npm run build` - Verify build succeeds
5. `npx tsc --noEmit` - Verify TypeScript compilation with no errors
No Manual Package Updates Needed:
- Wait for `npm audit` results to guide specific version updates
- Avoid premature optimization by upgrading packages unnecessarily
- Follow semantic versioning rules (^ allows minor/patch updates)
[Planner] Research Notes - RECIPE-0008 Iteration 1 (2026-02-18)
Task: Fix 9 remaining TypeScript strict mode errors after iteration 0 completion
TypeScript Strict Mode Analysis
Research Date: 2026-02-18
Source: Review report analysis, type definition inspection, codebase pattern comparison
Context: Iteration 0 fixed 3 errors in extraction.ts. TASK-5 verification revealed 9 additional errors.
Error Distribution:
- src/routes/api/tandoor/+server.ts — 1 error
- src/lib/server/queue/QueueProcessor.ts — 1 error
- src/lib/server/notifications/PushNotificationService.ts — 1 error
- src/lib/client/PushNotificationManager.ts — 1 error
- src/tests/queue-processor.spec.ts — 5 errors
Research Findings:
1. SvelteKit API Route Type Pattern
File: src/routes/api/tandoor/+server.ts
Issue: Missing RequestHandler type annotation on POST function
Pattern Analysis:
- Searched all API routes in src/routes/api/
- Found 10+ routes using pattern: `export const POST: RequestHandler = async ({ request }) => {...}`
- Type import: `import type { RequestHandler } from './$types'`
- src/routes/api/tandoor/+server.ts is the ONLY route missing this pattern
- Using function export `export async function POST({ request })` causes implicit any in strict mode
Solution: Convert to const export with RequestHandler type annotation
References:
- src/routes/api/queue/+server.ts — Reference implementation
- src/routes/api/notifications/subscribe/+server.ts — Another example
2. QueueItem Error Object Structure
File: src/lib/server/queue/QueueProcessor.ts
Issue: Treating error object as string
Type Definition: src/lib/server/queue/types.ts
error?: {
phase: ProcessingPhase;
message: string;
recoverable: boolean;
timestamp: string;
}
Current Code (incorrect):
// Line 425 in sendPushNotification method
const errorMessage = item.error || 'Processing failed';
Problem: item.error is an object, not a string. The code should access item.error.message.
Correct Implementation:
const errorMessage = item.error?.message || 'Processing failed';
Context Analysis:
- src/lib/server/queue/QueueManager.ts correctly sets error object with all 4 properties
- Error structure used in 3 places: QueueManager.updateStatus, QueueProcessor error handler, frontend display
- Frontend (src/routes/components/QueueItemCard.svelte) uses item.error?.message correctly (fixed in RECIPE-0001)
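The optional-chaining fix can be wrapped in a small helper so that every call site formats errors the same way. The following is a minimal sketch: the error shape mirrors the type definition above, while the ProcessingPhase values, the QueueItemLike interface and the getErrorMessage name are assumptions for illustration, not code from the repository.

```typescript
// Hypothetical phase union; the real ProcessingPhase lives in types.ts.
type ProcessingPhase = 'extraction' | 'parsing' | 'uploading';

interface QueueItemError {
  phase: ProcessingPhase;
  message: string;
  recoverable: boolean;
  timestamp: string;
}

// Hypothetical minimal view of a queue item (only the fields used here).
interface QueueItemLike {
  id: string;
  error?: QueueItemError;
}

// Returns the human-readable message, never the error object itself.
function getErrorMessage(item: QueueItemLike, fallback = 'Processing failed'): string {
  // `||` rather than `??` so that an empty message string also falls back.
  return item.error?.message || fallback;
}
```

Keeping the fallback text in one place also avoids the notification and the UI drifting apart if the wording changes.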
3. web-push Package Type Definitions
File: src/lib/server/notifications/PushNotificationService.ts
Issue: import webpush from 'web-push' causes TypeScript error in strict mode
Research:
- Package: web-push@3.6.7 (current in package.json)
- npm search: No @types/web-push package exists
- DefinitelyTyped: No type definitions available
- Library actively maintained but lacks TypeScript support
Community Pattern:
- src/tests/push-notification-service.spec.ts already uses:
  // @ts-expect-error - web-push doesn't have TypeScript types, but we mock it anyway
  import webpush from 'web-push';
- Pattern accepted: Use @ts-expect-error comment to suppress import error
- Justification: Package is stable, widely used, tested in production
Alternative Considered: Custom type definitions
Rejected: Out of scope for this JIRA. Would require:
- Defining interfaces for webpush.setVapidDetails, webpush.sendNotification
- PushSubscription structure mapping
- Error types (410 Gone, etc.)
- Estimated 50+ lines of type definitions
Solution: Add // @ts-expect-error comment above import, matching test file pattern
4. Mock Type Safety in Vitest Strict Mode
File: src/tests/queue-processor.spec.ts
Issue: Mock return values use as any or incorrect types
Specific Errors:
Error 1 (line 15): web-push sendNotification return type
// Current (incorrect)
sendNotification: vi.fn().mockResolvedValue({} as any);
// Actual signature: webpush.sendNotification returns Promise<void>
// Solution
sendNotification: vi.fn().mockResolvedValue(undefined);
Error 2 (line 209): extractRecipe null return violation
// Current (incorrect)
vi.mocked(extractRecipe).mockResolvedValue(null);
// Actual signature: extractRecipe(text: string): Promise<Recipe>
// Does not explicitly allow null return
// Solution: Reject promise instead of returning null
vi.mocked(extractRecipe).mockRejectedValue(new Error('Failed to parse recipe from extracted text'));
Remaining 3 errors: Similar pattern (mock return types not matching function signatures)
- Lines to be identified: Likely other .mockResolvedValue calls with type mismatches
- Pattern: Replace as any with proper types, ensure mocks match actual signatures
5. Parallelization Analysis
All 5 files are independent:
- Different modules: API routes, queue processor, notifications, client, tests
- No shared compilation state
- No cross-file type dependencies for these specific changes
- Safe for parallel implementation
Verification Commands:
npx tsc --noEmit # Must show 0 errors
npm run build # Must succeed
npm test # 267/279 pass (10 pre-existing failures in extractFromDOM)
npm audit # Must show 0 vulnerabilities (preserved from iteration 0)
Files to Modify - RECIPE-0008 Iteration 0
Primary Changes:
- src/lib/server/extraction.ts: Fix TypeScript strict mode errors
  - Add CaptionCandidate type definition (module-level)
  - Fix bestCandidate type narrowing with explicit assertion
  - Fix extractCaptionFromGraphQL parameter type (null → undefined)
  - Add 'graphql-intercept' to ExtractionMethod union
  - Add 'graphql-intercept' mapping to getMethodDisplayName()
- package-lock.json (if needed): Update after npm audit fix
  - Depends on npm audit results
  - May require manual version updates
  - Regenerate lockfile if breaking changes needed
No Changes Needed:
- tsconfig.json - strict mode already enabled
- package.json - dependencies are recent, await audit results
- Test files - existing tests should validate fixes
Dependencies:
- extraction.ts TypeScript fixes are independent
- npm audit fixes depend on audit output (sequential)
- Build/test must run after all fixes
Parallelization:
- TypeScript error fixes: All 3 changes in extraction.ts are independent
- npm audit: Sequential (must run audit first, then apply fixes)
- Verification: Sequential (after all fixes applied)
[Planner] Research Notes - RECIPE-0009 (2026-02-18)
Task: Implement URL deduplication, automatic notification subscription, UI improvements, and notification redirect fix
Web Push API Permission Requirements - RECIPE-0009
Research Date: 2026-02-18
Source: W3C Push API Specification, MDN Web Docs, browser security policies, existing PushNotificationManager.ts implementation
Security Requirement:
Per W3C Push API specification, Notification.requestPermission() requires user gesture - cannot be called programmatically without user interaction.
Browser Behavior:
- Permission States: "default" (not requested), "granted" (allowed), "denied" (blocked)
- User Gesture Required: Click, tap, keypress triggers permission prompt
- No Automatic Subscription: Calling requestPermission() on page load, without a user gesture, is silently ignored or blocked by browsers that enforce the gesture requirement
- Best Practice: Attach to meaningful user action (button click preferred)
Implementation Pattern for "Automatic" Subscription:
Since true automatic subscription violates browser security policy, the approach is:
- Listen for first user interaction (click/touch) anywhere on page
- Check notification state: supported, not denied, not subscribed
- Call pushNotificationManager.subscribe() on first interaction
- Remove listener after first attempt (one-shot behavior)
Code Pattern:
function setupAutoSubscribe() {
const attemptSubscribe = async () => {
const state = pushNotificationManager.getState();
if (state.supported && state.permission !== 'denied' && !state.subscribed) {
await pushNotificationManager.subscribe();
}
};
// Listen for first user interaction
document.addEventListener('click', attemptSubscribe, { once: true });
document.addEventListener('touchstart', attemptSubscribe, { once: true });
}
Why This is "Best Practice" Automatic:
- Requires minimal user action (any click/touch, not explicit "Enable" button)
- Non-intrusive (happens in background after natural interaction)
- Complies with W3C security requirements
- Avoids annoying permission prompts on page load
- Mobile-friendly (touchstart event)
Alternative Approaches Considered:
- Prompt on page load — REJECTED: Violates security policy, creates poor UX
- Delay with setTimeout — REJECTED: Still violates user gesture requirement
- IntersectionObserver trick — REJECTED: Does not satisfy user gesture requirement
- Explicit "Enable Notifications" button — VALID but less automatic than requested
Conclusion: First-interaction subscription is the most automatic approach allowed by browser standards while maintaining user control.
References:
- W3C Push API: https://www.w3.org/TR/push-api/
- MDN Notification.requestPermission: https://developer.mozilla.org/en-US/docs/Web/API/Notification/requestPermission
- Existing implementation: src/lib/client/PushNotificationManager.ts
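One detail the code pattern above leaves open is that a single tap on a touch device can fire both touchstart and click, so two separately registered one-shot listeners may each attempt a subscription. The sketch below shows one way to guard against that with a shared flag. The GestureTarget and SubscribeManager interfaces and the setupAutoSubscribeOnce name are hypothetical stand-ins (so the sketch runs outside a browser). Which events a browser accepts as a user gesture varies, so the event names are a parameter rather than a fixed list.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface GestureTarget {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

interface SubscribeManager {
  getState(): { supported: boolean; permission: string; subscribed: boolean };
  subscribe(): Promise<void>;
}

// Hypothetical variant of setupAutoSubscribe: one attempt, whichever event fires first.
function setupAutoSubscribeOnce(
  target: GestureTarget,
  manager: SubscribeManager,
  events: string[] = ['click', 'touchstart']
): void {
  let attempted = false;

  const attemptSubscribe = (): void => {
    if (attempted) return; // a tap can fire several events; act only on the first
    attempted = true;
    events.forEach((type) => target.removeEventListener(type, attemptSubscribe));

    const state = manager.getState();
    if (state.supported && state.permission !== 'denied' && !state.subscribed) {
      // Errors are swallowed in this sketch; a real caller would likely log them.
      manager.subscribe().catch(() => undefined);
    }
  };

  events.forEach((type) => target.addEventListener(type, attemptSubscribe));
}
```

In the page this would presumably be called as setupAutoSubscribeOnce(document, pushNotificationManager), assuming the manager exposes getState() and subscribe() as described above.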
Queue URL Deduplication Strategy - RECIPE-0009
Research Date: 2026-02-18
Source: QueueManager.ts architecture analysis, types.ts interface definitions, existing queue operations
Current Queue Structure:
// QueueManager.ts line 44-45
private items: Map<string, QueueItem> = new Map();
- Storage: Map<string, QueueItem> with UUID keys
- No secondary index: URL lookups require linear search through values
- In-memory only: No persistence across server restarts
- Typical size: < 100 items (based on usage patterns)
Deduplication Requirements:
- Check if URL already exists in queue before creating new item
- If duplicate found: Return existing item, do NOT create new entry
- API layer: Respond with duplicate: true and existing item details
- Message level: Info (not error) - duplicate is expected behavior
Implementation Approach:
Option A - Linear Search (Chosen):
findByUrl(url: string): QueueItem | undefined {
for (const item of this.items.values()) {
if (item.url === url) {
return item;
}
}
return undefined;
}
- Complexity: O(n) where n = queue size
- Performance: Acceptable for n < 100 (a scan of about 100 string comparisons completes in well under a millisecond)
- Simplicity: No additional data structures, no risk of index desync
- Consistency: Single source of truth (items Map)
Option B - Secondary URL Index (Rejected):
private items: Map<string, QueueItem> = new Map();
private urlIndex: Map<string, string> = new Map(); // url -> id
- Complexity: O(1) lookup, but requires maintaining two structures
- Risk: Index desync if remove() doesn't clean both Maps
- Overhead: 2x memory for keys, more complex implementation
- Benefit: Marginal for queue size < 1000
Design Decision: Option A (linear search) chosen for simplicity and reliability at current scale.
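To make the chosen option concrete, here is a minimal sketch of an enqueue path built on the linear findByUrl lookup. DedupingQueue, EnqueueResult, the status names and the counter-based ids are assumptions for illustration. The real QueueManager uses UUID keys and carries more responsibilities.

```typescript
// Hypothetical status union; only 'success' and 'error' are named in these notes.
type QueueStatus = 'pending' | 'processing' | 'success' | 'error';

interface QueueItem {
  id: string;
  url: string;
  status: QueueStatus;
  enqueuedAt: string;
}

interface EnqueueResult {
  duplicate: boolean;
  item: QueueItem;
  message?: string;
}

// Stripped-down queue that only demonstrates the deduplication check.
class DedupingQueue {
  private items: Map<string, QueueItem> = new Map();
  private nextId = 1;

  // Option A: O(n) scan over the single source of truth.
  findByUrl(url: string): QueueItem | undefined {
    return Array.from(this.items.values()).find((item) => item.url === url);
  }

  enqueue(url: string): EnqueueResult {
    const existing = this.findByUrl(url);
    if (existing) {
      // Duplicate: return the existing item, do not create a new entry.
      return { duplicate: true, message: 'This recipe is already in the queue', item: existing };
    }
    const item: QueueItem = {
      id: String(this.nextId++), // placeholder id; the real queue uses UUIDs
      url,
      status: 'pending',
      enqueuedAt: new Date().toISOString()
    };
    this.items.set(item.id, item);
    return { duplicate: false, item };
  }
}
```

Because the lookup and the insert happen in the same synchronous call, two rapid requests for the same URL cannot both pass the check, which matches the "first wins" behavior expected of the queue.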
API Response Format:
// Duplicate detected
{
duplicate: true,
message: "This recipe is already in the queue",
item: { id, url, status, enqueuedAt }
}
// New item
{
duplicate: false,
item: { id, url, status, enqueuedAt }
}
User Experience:
- Frontend checks response.duplicate === true
- Shows info toast: "This recipe is already in queue [View]"
- No error state, no failed request
- Links to existing queue item
Edge Cases Handled:
- Multiple rapid requests: First wins, rest return duplicate
- URL normalization: URLs compared as-is (no normalization in v1)
- Completed items: Duplicates found even if status is success/error
- Retry scenario: Retry uses existing queue item ID, not new URL submission
Future Considerations:
- URL normalization (trailing slash, query params, fragments)
- Time-based deduplication window (only check items from last N hours)
- Content-based deduplication (recipe fingerprint from parsed data)
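As an illustration of the first future consideration, URL normalization could look roughly like the sketch below. It is not part of v1, which compares URLs as-is. The normalizeQueueUrl name is hypothetical, and dropping the whole query string assumes that query parameters on shared post links are tracking-only, which would need to be verified before adoption.

```typescript
// Hypothetical normalizer: keeps scheme, host and path; drops query string,
// fragment and trailing slashes so equivalent links compare equal.
function normalizeQueueUrl(raw: string): string {
  const url = new URL(raw.trim());
  const path = url.pathname.replace(/\/+$/, '');
  // url.host is already lowercased by the URL parser for http(s) URLs.
  return `${url.protocol}//${url.host}${path}`;
}
```

With such a helper, findByUrl would compare normalizeQueueUrl(item.url) against normalizeQueueUrl(url) instead of the raw strings.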
References:
- QueueManager implementation: src/lib/server/queue/QueueManager.ts
- QueueItem type definition: src/lib/server/queue/types.ts
Service Worker Notification Data Flow - RECIPE-0009
Research Date: 2026-02-18
Source: Code analysis of notification pipeline from QueueProcessor → PushNotificationService → Service Worker
Notification Payload Journey:
Step 1: QueueProcessor sends notification (Line 418-420)
await pushNotificationService.notifySuccess(
item.id,
item.results?.recipe?.name,
item.results?.tandoorUrl // ← tandoorUrl passed here
);
Step 2: PushNotificationService creates payload (Lines 162-181)
const payload: NotificationPayload = {
type: 'success',
itemId,
recipeName,
body: recipeName ? `Recipe "${recipeName}" has been extracted...` : ...,
tag: `recipe-success-${itemId}`,
requireInteraction: true,
analytics: { ... }
};
if (tandoorUrl) {
payload.body += ' View it in Tandoor.';
// Note: tandoorUrl NOT explicitly added to payload object
}
Issue Found: tandoorUrl parameter received but not stored in payload object!
Step 3: Service Worker receives push event (Line 123)
data = event.data.json(); // ← Payload becomes data object
Step 4: Notification created with data (Lines 130-136)
const options: NotificationOptions = {
body: data.body,
data: data, // ← Full payload stored in data field
// ...
};
Step 5: Click handler accesses data (Line 183-191)
const data = event.notification.data;
const action = event.action;
if (action === 'view' && data?.itemId) {
url = `/?highlight=${data.itemId}`;
}
Current Bug: data.tandoorUrl is undefined because PushNotificationService.notifySuccess() doesn't add it to payload.
Fix Required in PushNotificationService.ts (Line 162-181):
const payload: NotificationPayload = {
type: 'success',
itemId,
recipeName,
tandoorUrl, // ← Add this line
body: recipeName ? `Recipe "${recipeName}" has been extracted...` : ...,
tag: `recipe-success-${itemId}`,
requireInteraction: true,
analytics: { ... }
};
Then Service Worker Can Use It:
if (action === 'view' && data?.tandoorUrl) {
url = data.tandoorUrl; // Redirect to Tandoor
} else if (action === 'view' && data?.itemId) {
url = `/?highlight=${data.itemId}`; // Fallback to dashboard
}
NotificationPayload Interface Update Required:
// Line 20-28 in PushNotificationService.ts
interface NotificationPayload {
title?: string;
body: string;
type: 'success' | 'error' | 'progress';
itemId: string;
recipeName?: string;
tandoorUrl?: string; // ← Add this line
tag?: string;
requireInteraction?: boolean;
analytics?: any;
}
Verification:
- QueueProcessor already passes item.results?.tandoorUrl correctly
- item.results.tandoorUrl is set by QueueProcessor line 329-331 when Tandoor upload succeeds
- Format: ${TANDOOR_BASE_URL}/view/recipe/${recipeId}
- Example: https://tandoor.example.com/view/recipe/123
References:
- QueueProcessor notification call: src/lib/server/queue/QueueProcessor.ts
- PushNotificationService: src/lib/server/notifications/PushNotificationService.ts
- Service Worker push handler: src/service-worker.ts
- Service Worker click handler: src/service-worker.ts
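The redirect decision in the click handler is easier to unit-test when pulled into a pure function. The sketch below assumes a hypothetical resolveClickUrl helper and a '/' default for actions other than 'view'; the source does not show the real default, so that part is an assumption.

```typescript
// Subset of the notification payload that the click handler reads.
interface NotificationData {
  itemId?: string;
  tandoorUrl?: string;
}

// Hypothetical pure helper; the real handler lives in src/service-worker.ts.
function resolveClickUrl(action: string, data: NotificationData | null | undefined): string {
  if (action === 'view' && data?.tandoorUrl) {
    return data.tandoorUrl; // open the recipe in Tandoor
  }
  if (action === 'view' && data?.itemId) {
    return `/?highlight=${encodeURIComponent(data.itemId)}`; // fall back to the dashboard
  }
  return '/'; // assumed default, not taken from the source
}
```

A test for the tandoorUrl bug then only needs to assert that a payload containing tandoorUrl resolves to that URL rather than to the dashboard highlight link.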
Homepage UI Component Visibility Analysis - RECIPE-0009
Research Date: 2026-02-18
Source: +page.svelte component structure analysis
Current Behavior:
Add Recipe Component Locations:
- Empty State (Lines 280-302): Shows when !loading && filteredItems.length === 0
{#if !loading && filteredItems.length === 0}
<div class="text-center py-12">
<!-- ... -->
<a href="/share" class="...">
Add Recipe URL
</a>
</div>
{/if}
- No Persistent Component: When queue has items, no "Add Recipe" button visible
User Complaint: "Do not hide the add recipe component when there are items in the queue"
Issue: Add recipe link only appears in empty state conditional block.
Solution: Add persistent "Add Recipe" button to action bar (always visible)
Implementation Location: Lines 224-254 (Action Bar section)
Before:
<div class="mb-6 flex flex-col sm:flex-row gap-4 justify-between items-start sm:items-center">
<div class="flex flex-wrap gap-2">
<!-- Filter Tabs -->
{#each filters as filterOption}
<button>...</button>
{/each}
</div>
<!-- Refresh Button -->
<button>...</button>
</div>
After:
<div class="mb-6 flex flex-col sm:flex-row gap-4 justify-between items-start sm:items-center">
<div class="flex items-center gap-4">
<!-- Filter Dropdown -->
<select>...</select>
<!-- Refresh Button -->
<button>...</button>
</div>
<!-- Add Recipe Button (ALWAYS VISIBLE) -->
<a href="/share" class="...">
Add Recipe URL
</a>
</div>
Benefits:
- Always accessible regardless of queue state
- Consistent UI (no disappearing elements)
- Better UX for power users (add multiple recipes quickly)
- Maintains empty state link for discoverability
Filter Consolidation Rationale:
Current filter tabs take significant horizontal space (5 buttons). Consolidating to dropdown:
- Frees space for persistent "Add Recipe" button
- Keeps filter + refresh on same row (per requirement)
- Mobile-friendly (dropdown vs. wrapping buttons)
- Still shows item counts in dropdown options
References:
- Homepage component: src/routes/+page.svelte
- Empty state section: src/routes/+page.svelte
[Planner] Research Notes - RECIPE-0009 Iteration 1 (2026-02-18)
Task: UI enhancements - footer status bar, icon-only buttons, toggle Add Recipe visibility
Current Homepage UI Structure Analysis
Research Date: 2026-02-18
Source: Analysis of src/routes/+page.svelte, iteration 0 implementation
Current Implementation (Iteration 0):
- Connection Status Widget (lines 369-383):
  - Fixed position: bottom-right (fixed bottom-4 right-4)
  - Shows connection status with colored dot + text label
  - Shows last ping timestamp
  - Will be REMOVED and replaced with footer bar
- Action Bar (lines 263-297):
  - Filter dropdown (lines 266-276)
  - Refresh button with icon + text (lines 277-285)
  - Add Recipe button with icon + text (lines 288-297)
  - Currently: Add Recipe button ALWAYS visible (iteration 0 requirement)
- Empty State (lines 310-342):
  - Shows when !loading && filteredItems.length === 0
  - Contains "Add Recipe URL" link
Changes Required for Iteration 1:
- Remove floating connection status widget
- Add footer status bar (icons only)
- Convert refresh button to icon-only
- Convert Add Recipe button to icon-only
- Toggle Add Recipe button visibility (hide when empty, show when has items)
Footer Status Bar Design - RECIPE-0009 Iteration 1
Research Date: 2026-02-18
Source: Web PWA patterns, existing codebase styling patterns
Design Requirements:
- Position: Fixed at bottom (fixed bottom-0 left-0 right-0)
- Layout: Full width with max-width container matching page layout (max-w-6xl)
- Content: Two sections (notification status left, live updates right)
- Display: Icons only, no text labels
- Accessibility: title and aria-label attributes on interactive elements
- Z-index: z-50 to ensure visibility above all content
- Visual: White background, top border, shadow for lift effect
State Integration:
Footer needs access to two state sources:
- Notification Status: Via pushNotificationManager.getState()
  - Need to add notificationViewModel state variable in +page.svelte
  - Subscribe to state changes in onMount
  - Cleanup subscription in onDestroy
- Connection Status: Already exists as connectionStatus state
  - Reuse existing variable
  - States: 'connecting' | 'connected' | 'disconnected'
Notification Icon Logic:
if (!supported || permission === 'denied') {
// Show bell with slash (not supported/denied)
icon = 'bell-slash';
color = 'text-gray-400';
} else if (subscribed) {
// Show bell icon (enabled)
icon = 'bell';
color = 'text-green-600';
} else {
// Show bell icon (available but not enabled)
icon = 'bell';
color = 'text-gray-400';
}
Live Update Indicator Logic:
if (connectionStatus === 'connected') {
dotColor = 'bg-green-400';
title = 'Live updates active';
} else if (connectionStatus === 'connecting') {
dotColor = 'bg-yellow-400';
title = 'Connecting to live updates...';
} else {
dotColor = 'bg-red-400';
title = 'Live updates disconnected';
}
Click Behavior:
Clicking notification icon scrolls to NotificationSettings component:
onclick={() => {
document.querySelector('[data-notification-settings]')?.scrollIntoView({ behavior: 'smooth' });
}}
Requires adding data-notification-settings attribute to NotificationSettings wrapper.
Icon-Only Button Patterns - RECIPE-0009 Iteration 1
Research Date: 2026-02-18
Source: Existing codebase button styles, Tailwind CSS documentation, WCAG 2.1 guidelines
Current Button Pattern (with text):
<button class="flex items-center space-x-2 px-4 py-2 ...">
<svg class="w-4 h-4" ... />
<span>Button Text</span>
</button>
- Padding: px-4 py-2 (horizontal + vertical)
- Icon size: w-4 h-4 (16x16px)
- Spacing: space-x-2 (gap between icon and text)
Icon-Only Button Pattern:
<button
title="Button description"
aria-label="Button description"
class="p-2 ..."
>
<svg class="w-5 h-5" ... />
</button>
Changes:
- Padding: p-2 (square/circular button)
- Icon size: w-5 h-5 (20x20px - slightly larger for better visibility)
- Remove: space-x-2 class (no text to space from)
- Add: title attribute (tooltip on hover)
- Add: aria-label attribute (screen reader accessibility)
Accessibility Requirements (WCAG 2.1):
- Title Attribute: Provides tooltip text for sighted users on hover
- Aria-label Attribute: Provides accessible name for screen readers
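- Minimum Touch Target: 24x24px is the WCAG 2.2 minimum (SC 2.5.8); WCAG 2.1 SC 2.5.5 sets 44x44px at level AAA. The 20x20px icon + 8px padding on each side = 36x36px total, which clears the 24px minimum ✓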
- Color Contrast: Must meet 3:1 ratio for non-text (icons)
Examples:
Refresh button:
<button
title="Refresh queue"
aria-label="Refresh queue"
class="p-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 ..."
>
<svg class="w-5 h-5" ... />
</button>
Add Recipe button:
<a
href="/share"
title="Add recipe URL"
aria-label="Add recipe URL"
class="p-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 ..."
>
<svg class="w-5 h-5" ... />
</a>
Add Recipe Button Visibility Logic - RECIPE-0009 Iteration 1
Research Date: 2026-02-18
Source: context_compact.yaml requirement analysis, UX patterns
Iteration 0 Implementation:
- Add Recipe button ALWAYS visible in controls bar
- Rationale: User complained "do not hide the add recipe component when there are items in the queue"
Iteration 1 Requirement:
"Toggle "Add Recipe" button visibility in controls bar (hide when queue empty, show when items exist - opposite of placeholder rule)"
Interpretation:
"Opposite of placeholder rule":
- Placeholder (empty state) shows when: items.length === 0
- Add Recipe button in controls shows when: items.length > 0 (opposite condition)
Logic:
{#if items.length > 0}
<a href="/share" title="Add recipe URL" aria-label="Add recipe URL" ...>
<svg ... />
</a>
{/if}
Rationale:
- Empty State: When queue is empty, user sees empty state with centered "Add Recipe URL" link
- Non-Empty State: When queue has items, controls bar shows Add Recipe button (icon-only)
- No Redundancy: Button doesn't appear when empty state link is already visible
- Consistent Access: User always has access to "Add Recipe" via either empty state link OR controls bar button
UX Benefits:
- Cleaner UI when queue is empty (no redundant button)
- Convenient access when queue has items (quick add more recipes)
- Fulfills opposite condition of empty state placeholder
Svelte 5 Notification State Management
Research Date: 2026-02-18
Source: Existing iteration 0 implementation, PushNotificationManager.ts
NotificationState Type:
interface NotificationState {
supported: boolean;
permission: NotificationPermission; // 'default' | 'granted' | 'denied'
subscribed: boolean;
loading: boolean;
error: string | null;
}
State Subscription Pattern:
// Import type
import type { NotificationState } from '$lib/client/PushNotificationManager';
// Declare state
let notificationViewModel = $state<NotificationState | null>(null);
// Subscribe in onMount
onMount(() => {
// ... existing code ...
const unsubscribeNotifications = pushNotificationManager.onStateChange((newState) => {
notificationViewModel = newState;
});
return () => {
unsubscribeNotifications?.();
};
});
Cleanup in onDestroy:
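The current onDestroy only cleans up eventSource. The notification subscription needs no extra code there, because the function returned from onMount already unsubscribes when the component unmounts: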
onDestroy(() => {
if (eventSource) {
console.log('[SSE] Closing connection on component destroy');
eventSource.close();
connectionStatus = 'disconnected';
}
// No cleanup needed - handled by onMount return callback
});
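Note: Svelte calls the function returned from onMount when the component unmounts, provided the onMount callback returns it synchronously. A cleanup function returned from an async callback is ignored.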
State Access in Footer:
Footer component needs null-safe access since initial state is null:
{#if notificationViewModel}
{#if !notificationViewModel.supported || notificationViewModel.permission === 'denied'}
<!-- Show disabled icon -->
{:else if notificationViewModel.subscribed}
<!-- Show enabled icon -->
{:else}
<!-- Show available icon -->
{/if}
{:else}
<!-- Loading state - show gray icon -->
<svg class="w-5 h-5 text-gray-400" ... />
{/if}
Initial State Handling:
pushNotificationManager.onStateChange() sends initial state immediately on subscription, so notificationViewModel will be populated almost instantly after component mount.
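The "initial state on subscription" contract described above can be sketched with a tiny emitter. StateEmitter below is a hypothetical stand-in, not the actual PushNotificationManager. It only illustrates why the view model is populated as soon as onStateChange is called and why the returned function is sufficient for cleanup.

```typescript
type Listener<T> = (state: T) => void;

// Hypothetical emitter that replays the current state to every new subscriber.
class StateEmitter<T extends object> {
  private listeners = new Set<Listener<T>>();
  private state: T;

  constructor(initial: T) {
    this.state = initial;
  }

  onStateChange(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    listener(this.state); // initial state is delivered synchronously
    return () => {
      this.listeners.delete(listener); // unsubscribe function for onMount cleanup
    };
  }

  setState(patch: Partial<T>): void {
    this.state = { ...this.state, ...patch };
    this.listeners.forEach((listener) => listener(this.state));
  }
}
```

If the real manager delivered the initial state asynchronously instead, the null branch of the footer template would stay visible slightly longer, which is why the null-safe rendering is still worth keeping.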
Document Version: 3.0
Last Updated by: Planner Agent (RECIPE-0009 Iteration 1)
Next Update: Developer Agent
Session Findings: Instagram Extraction & Production Lessons
Recorded during active development sessions (2025–2026). These are hard-won discoveries from real debugging — not theoretical analysis.
Instagram: Caption Truncation in Web GraphQL API
Symptom: LLM says "no recipe found" even though the full recipe IS in the Instagram caption.
Root cause: Instagram's web GraphQL API (doc_id=8845758582119845) silently truncates captions in edge_media_to_caption.edges[0].node.text. Truncation is inconsistent:
- Sometimes ends with …. (Unicode U+2026 + period)
- Sometimes cuts off mid-sentence with no marker at all
Known examples:
- DWWxiymssxE: GraphQL returns 327 chars, full caption is 393 chars (no truncation marker)
- DXT73izCBoH: GraphQL returns 744 chars, cuts off mid-sentence "Versa nella tortiera co'"
Fix: Never trust the GraphQL-intercepted caption. Always use DOM extraction (extractWithStrategies → extractFromHTMLSection → tryExpandCaptionInHTMLSection clicks "… more" button). Keep the intercepted GraphQL caption only as an emergency fallback when DOM extraction fails entirely.
Key lesson: The …. suffix check is not sufficient to detect truncation. The only reliable approach is to always go through the DOM.
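The selection rule that follows from this lesson (DOM text first, intercepted GraphQL text only as a last resort, and no reliance on the …. suffix) can be written as a small pure function. pickCaption and the CaptionChoice shape are hypothetical names used for illustration. The real logic is spread across extractWithStrategies and its helpers.

```typescript
type CaptionSource = 'dom' | 'graphql-fallback' | 'none';

interface CaptionChoice {
  text: string;
  source: CaptionSource;
}

// Hypothetical selector: DOM text wins whenever it is non-empty.
function pickCaption(domCaption: string | null, graphqlCaption: string | null): CaptionChoice {
  const dom = domCaption?.trim() ?? '';
  if (dom.length > 0) {
    return { text: dom, source: 'dom' };
  }
  const intercepted = graphqlCaption?.trim() ?? '';
  if (intercepted.length > 0) {
    // Possibly truncated: callers should treat this as low confidence.
    return { text: intercepted, source: 'graphql-fallback' };
  }
  return { text: '', source: 'none' };
}
```

Recording the source alongside the text lets later stages log when a recipe was parsed from a possibly truncated caption.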
Instagram: Mobile API vs GraphQL API (yt-dlp behavior)
How yt-dlp selects which API to call:
- If sessionid cookie present → calls https://i.instagram.com/api/v1/media/{PK}/info/ (mobile API)
- If mobile API fails (or no sessionid) → falls back to GraphQL doc_id=8845758582119845
Mobile API User-Agent:
- Desktop UA → HTTP 404
- Instagram Android UA → HTTP 200 with full response
- The --user-agent CLI flag only affects video download requests, not API calls; yt-dlp uses its own hardcoded headers for API calls
Mobile API also truncates: Even with a valid sessionid and HTTP 200, caption.text in the mobile API response can still be truncated. DOM extraction is the only fully reliable source.
Shortcode → PK conversion:
def shortcode_to_pk(sc):
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'
n = 0
for c in sc: n = n * 64 + alphabet.index(c)
return n
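For use inside the TypeScript codebase, the same conversion can be ported as sketched below. The one substantive difference from the Python version is the use of BigInt: an 11-character shortcode encodes up to 66 bits, which exceeds the 53-bit safe integer range of a JavaScript number. The shortcodeToPk name and the explicit error on invalid characters are choices made for this sketch.

```typescript
const SHORTCODE_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Base-64 positional decode of an Instagram shortcode into a media PK.
function shortcodeToPk(shortcode: string): bigint {
  let pk = BigInt(0);
  for (const char of shortcode) {
    const index = SHORTCODE_ALPHABET.indexOf(char);
    if (index === -1) {
      throw new Error(`Invalid shortcode character: ${char}`);
    }
    // Shift the accumulated value by one base-64 digit and add the new digit.
    pk = pk * BigInt(64) + BigInt(index);
  }
  return pk;
}
```

The result can be interpolated into the mobile API path with pk.toString(), which avoids any lossy conversion back to number.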
Instagram: Creator-Written …. vs API Truncation
Gotcha: Some creators intentionally end their captions with …. or #seriesname…. as a signature or series marker. This is NOT API truncation.
Example: Reel DW5zH3xjY-_ ("5030 LOW CAL 💪") — the …. is written by the creator as a series signature. The reel has only 213 chars of real content and no recipe.
Implication: Never use …. suffix as the primary signal to fetch more content — always use DOM extraction regardless.
Instagram: cookies.txt vs auth.json — Session Management
Two auth formats coexist:
- secrets/auth.json: Playwright storageState format (JSON, cookies + origins)
- secrets/cookies.txt: Netscape format for yt-dlp
yt-dlp overwrites cookies.txt after each extraction, removing sessionid. The next run regenerates it from auth.json via maybeConvertAuthJson() before each call. This is safe in normal operation — but inspecting cookies.txt directly between runs will show a reduced file.
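For reference, converting the cookies array of a storageState file into Netscape format is a small mapping. The sketch below is not the project's maybeConvertAuthJson(); toNetscapeCookies is a hypothetical helper. The #HttpOnly_ domain prefix follows the curl convention for HttpOnly cookies, and whether a given yt-dlp version honors it should be checked rather than assumed.

```typescript
// Subset of a Playwright storageState cookie entry.
interface StorageStateCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // seconds since epoch, -1 for session cookies
  httpOnly: boolean;
  secure: boolean;
}

// Hypothetical converter: one tab-separated line per cookie, seven fields each.
function toNetscapeCookies(cookies: StorageStateCookie[]): string {
  const lines = cookies.map((cookie) => {
    const includeSubdomains = cookie.domain.startsWith('.') ? 'TRUE' : 'FALSE';
    const secure = cookie.secure ? 'TRUE' : 'FALSE';
    const expiry = cookie.expires > 0 ? Math.floor(cookie.expires) : 0; // 0 marks a session cookie
    // curl-style convention: HttpOnly cookies carry a "#HttpOnly_" domain prefix.
    const domain = cookie.httpOnly ? `#HttpOnly_${cookie.domain}` : cookie.domain;
    return [domain, includeSubdomains, cookie.path, secure, expiry, cookie.name, cookie.value].join('\t');
  });
  return ['# Netscape HTTP Cookie File', ...lines].join('\n') + '\n';
}
```

Writing the output to a scratch copy rather than to the file yt-dlp mutates would make it easier to inspect whether sessionid was present before a run.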
sessionid is critical. Without it:
- yt-dlp mobile API returns HTTP 404 (empty response)
- Falls back to GraphQL → truncated caption
Auth scheduler: scheduler.ts runs every 15 minutes to renew the session by navigating to Instagram. Verify with logs: [Scheduler] Instagram authentication renewed successfully.
Instagram: Playwright Browser Session Expiry (independent of cookies)
Symptom: Playwright navigates to Instagram, sees a profile selector ("Continue as …"), clicks Continue, gets redirected to /accounts/login/.
Root cause: The sessionid cookie is valid for API calls but the browser-level session can expire independently. Instagram shows the profile selector as a soft prompt which, when clicked, triggers a re-auth that fails with a stale session.
Diagnosis:
- svg[aria-label="Home"] found → session valid ✅
- (N) Instagram in title with notifications count → logged in ✅
- Profile selector visible → session expired, need to re-authenticate
Fix: Re-authenticate by updating auth.json with a fresh login from a real browser session and copying to the volume at /home/moze/Server/stacks/insta-recipe/data/secrets/auth.json.
Instagram: DOM Extraction Strategy Order (2025/2026)
extractWithStrategies tries 6 approaches in order. Only one reliably works now:
| Strategy | Status | Reason |
|---|---|---|
| embedded-json | ❌ Fails | Instagram removed window.__additionalDataLoaded |
| internal-state | ❌ Fails | Instagram removed window._sharedData |
| html-section | ✅ Works | DOM extraction + "… more" button click |
| dom-selector | ⚠️ Partial | Simpler DOM query, may miss truncated captions |
| graphql-api | ⚠️ Truncated | Live interception but caption is still truncated |
| legacy | ❌ Fails | Old format gone |
Note: Clicking "… more" triggers feed-loading GraphQL calls (xdt_api__v1__clips__home__connection_v2) as a side effect. The full text comes purely from the expanded DOM, not a network response.
LLM: phi4-mini Recipe Detection Too Strict
Problem: phi4-mini rejected valid Italian Instagram recipe posts as "no recipe found" during detection.
Root cause: Detection prompt required quantities + at least 2 steps. Italian Instagram posts often:
- Omit explicit quantities (just list ingredients by name)
- Say "full recipe at link in bio" with no steps at all
Detection prompt evolution:
- v1: title + 3 ingredients with quantities + 2 steps
- v2: title + 3 ingredients (no quantities) + 1 step
- v3 (current): title + 2 ingredients, NO step requirement
Lesson: If it reads like food content with at least 2 named ingredients, say yes.
LLM: gemma4 Thinking Models Behavior
gemma4 models on llama-swap (http://192.168.1.50:8080):
- gemma4-e2b-q8_0: smaller/faster
- gemma4-e4b-q6k: better quality (production model)
- gemma4-26b-moe-iq4xs, granite-3.3-8b-q6k, deepseek-r1-8b-q6k also available
gemma4 is a "thinking" model: Outputs internal reasoning before the actual answer.
With max_tokens: 1024: Model skips most reasoning and puts the answer directly in content. The reasoning_content fallback in parser.ts covers edge cases where content is empty.
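The fallback mentioned here amounts to preferring content and reading reasoning_content only when content is empty. The sketch below illustrates that rule. The ChatMessage shape is a minimal assumption about the OpenAI-compatible response, and extractAnswerText is a hypothetical name rather than the function in parser.ts.

```typescript
// Subset of an OpenAI-compatible chat message, plus the reasoning_content
// field that some servers add for thinking models.
interface ChatMessage {
  content?: string | null;
  reasoning_content?: string | null;
}

// Hypothetical helper: returns the answer text, falling back to the reasoning text.
function extractAnswerText(message: ChatMessage): string {
  const content = message.content?.trim() ?? '';
  if (content.length > 0) {
    return content;
  }
  // The token budget may have been spent on reasoning, leaving content empty.
  return message.reasoning_content?.trim() ?? '';
}
```

Text recovered from reasoning_content may contain deliberation rather than a clean answer, so downstream JSON parsing should tolerate failure in that branch.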
vs phi4-mini: phi4-mini is more literal and strict. For permissive recipe detection of Italian informal posts, gemma4 is significantly better.
Tandoor: Steps Required to Save Ingredients
Symptom: Recipe saved to Tandoor has no ingredients even though parsing succeeded.
Root cause: Tandoor requires at least one Step for ingredients to be associated. When recipe.steps is null/empty:
// Old code — creates stepCount=1 but no actual step:
const stepCount = recipe.steps?.length || 1;
(recipe.steps || []).map(...) // returns [] → all ingredients lost
Fix in tandoor.ts buildTandoorRecipeDTO(): When recipe.steps is null or empty, create a placeholder:
const steps = (recipe.steps?.length ? recipe.steps : ['Vedi la ricetta completa al link in bio.']);
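To show how the placeholder keeps ingredients from being dropped, here is a simplified sketch of the step-building logic. ParsedRecipe, StepDraft and buildStepDrafts are hypothetical names and shapes. In particular, attaching all ingredients to the first step is an assumption made for this example, and the real buildTandoorRecipeDTO() may distribute them differently.

```typescript
interface ParsedRecipe {
  name: string;
  ingredients: string[];
  steps?: string[] | null;
}

// Simplified, hypothetical step shape; the real Tandoor DTO has more fields.
interface StepDraft {
  instruction: string;
  ingredients: string[];
}

const PLACEHOLDER_STEP = 'Vedi la ricetta completa al link in bio.';

function buildStepDrafts(recipe: ParsedRecipe): StepDraft[] {
  // Guarantee at least one step so there is always something to attach ingredients to.
  const instructions = recipe.steps?.length ? recipe.steps : [PLACEHOLDER_STEP];
  return instructions.map((instruction, index) => ({
    instruction,
    // Assumption: all ingredients ride on the first step so none are lost.
    ingredients: index === 0 ? recipe.ingredients : []
  }));
}
```

The invariant worth testing is that the total number of ingredients across the returned steps equals recipe.ingredients.length, whether or not the parser produced any steps.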
SvelteKit SSE: Phase Updates Never Reaching UI
Symptom: Processing animation showed "Prepping" throughout, then jumped straight to done.
Three root causes found:
- updateQueueItem never set currentPhase: it spread ...items[idx] but never applied update.phase. Fix: currentPhase: update.phase ?? prev.currentPhase
- Progress events silently discarded: SSE type: 'progress' messages were received but the progressEvents array was never updated, so live messages (e.g. "Parsing with LLM…") were dropped. Fix: append data.event to progressEvents.
- Initial SSE snapshot missing phase: The initial broadcast of queued items omitted phase: item.currentPhase, so items already in progress on page load showed the wrong phase. Fix: include phase in the initial snapshot.
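The three fixes can be summarized in a small pure reducer plus a snapshot mapper. This is a sketch under assumed names: UiQueueItem, SseUpdate, applySseUpdate and toSnapshot are illustrative, and the phase values are guessed from the three-phase pipeline. The real client code updates Svelte state rather than returning new arrays from a standalone function.

```typescript
type Phase = 'extraction' | 'parsing' | 'uploading';

interface UiQueueItem {
  id: string;
  currentPhase: Phase | null;
  progressEvents: string[];
}

interface SseUpdate {
  id: string;
  phase?: Phase;
  event?: string; // live progress message
}

// Fixes 1 and 2: apply the phase when present and append progress messages.
function applySseUpdate(items: UiQueueItem[], update: SseUpdate): UiQueueItem[] {
  return items.map((prev) => {
    if (prev.id !== update.id) return prev;
    return {
      ...prev,
      currentPhase: update.phase ?? prev.currentPhase,
      progressEvents: update.event ? [...prev.progressEvents, update.event] : prev.progressEvents
    };
  });
}

// Fix 3: the initial snapshot carries the phase of items already in progress.
function toSnapshot(item: UiQueueItem): { id: string; phase: Phase | null } {
  return { id: item.id, phase: item.currentPhase };
}
```

Keeping the merge logic pure makes each of the three regressions straightforward to cover with a unit test that needs no SSE connection.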
Gitea CI: Common Failure Modes
Chromium not available in Alpine Docker:
vite.config.ts defines two vitest projects: client (browser, needs Chromium) and server (Node.js). Alpine CI has no Chromium. Always specify:
npm run test:unit -- --run --project=server
$env/dynamic/private throws in Docker build (no .env):
Any code reading SvelteKit env vars at module import time will throw during Docker RUN npm test because there's no .env file in the build. Fix: mock the module in affected tests:
vi.mock('$env/dynamic/private', () => ({
env: { OPENAI_BASE_URL: 'http://localhost:11434', OPENAI_MODEL: 'test-model' }
}));
Registry secrets must be set manually in Gitea:
REGISTRY_USERNAME and REGISTRY_TOKEN must be created in repo Settings → Actions → Secrets. They are not automatically available.
TypeScript Quirk: Async Callback Closure Narrowing
let interceptedCaption: string | null = null;
page.on('response', async () => { interceptedCaption = 'value'; }); // assigned in async callback
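// TypeScript's control-flow analysis does not see assignments made inside callbacks,
// so it still treats `interceptedCaption` as `null` here (and as `never` inside a
// truthiness check) unless another assignment exists in the outer scope.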
const capturedCaption = interceptedCaption as string | null; // explicit cast required
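An alternative to the explicit cast is to store the value on an object property, since the initializer does not narrow a property reference the way it narrows a let variable. The sketch below uses a plain handler list as a stand-in for page.on('response', ...) so that it runs without Playwright. onResponse, captured and captionLength are hypothetical names.

```typescript
// Stand-in for an event emitter such as Playwright's page.on('response', ...).
type ResponseHandler = (body: string) => void;
const handlers: ResponseHandler[] = [];

function onResponse(handler: ResponseHandler): void {
  handlers.push(handler);
}

// The holder's declared property type stays `string | null` for later reads.
const captured: { caption: string | null } = { caption: null };

onResponse((body) => {
  captured.caption = body; // assigned inside the callback, as in the original snippet
});

// Simulate a response arriving.
handlers.forEach((handler) => handler('Full caption text'));

// No cast needed: the null check narrows normally inside the function.
function captionLength(holder: { caption: string | null }): number {
  return holder.caption !== null ? holder.caption.length : 0;
}
```

Whether this reads better than a one-line cast is a matter of taste. The holder object mainly helps when several values are captured from the same callback.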
Production Architecture: yt-dlp + Playwright Split
Current split (as of commit c9f5300+):
- Playwright → caption extraction (DOM, always full text)
- yt-dlp → thumbnail URL only (fast, no browser overhead)
- Both run in parallel in QueueProcessor.ts
Why not yt-dlp for caption? Both mobile API and GraphQL responses can be truncated even with a valid session. DOM is the only reliable source.
Why not Playwright for thumbnail? yt-dlp extracts thumbnail cleanly and quickly. Playwright-based thumbnail extraction was fragile.
Infrastructure Reference
| Resource | Value |
|---|---|
| App URL | https://insta-recipe.sal.giize.com |
| SSH | ssh -o IdentitiesOnly=yes -i ~/.ssh/id_rsa_ideapad moze@192.168.1.50 |
| Compose file | /home/moze/Server/stacks/insta-recipe/compose.yaml |
| Env file | /home/moze/Server/stacks/insta-recipe/.env |
| Docker registry | git.sal.giize.com/mozempk/insta-recipe:latest |
| Build | docker buildx build --platform linux/amd64 -t git.sal.giize.com/mozempk/insta-recipe:latest --push . |
| Deploy | docker compose pull && docker compose up -d |
| LLM (internal) | http://chat_llama-cpp:8080/v1 |
| LLM (external) | http://192.168.1.50:8080 |
| Current LLM model | gemma4-e4b-q6k (via LLM_MODEL in .env) |
| Auth file (host) | /home/moze/Server/stacks/insta-recipe/data/secrets/auth.json |
| Auth file (container) | /app/secrets/auth.json |