# Findings & Research Documentation

**Last Updated:** 2026-02-15T00:00:00.000Z
**JIRA:** RECIPE-0001
**Status:** Initialized

---

## Purpose

This document tracks research findings, analysis results, and technical discoveries made during development. Each agent (Planner, Developer, Reviewer) appends findings as they work through the pipeline.

---

## Initial Codebase Analysis

### Language & Framework

- **Primary Language**: TypeScript 5.9.3
- **Framework**: SvelteKit 2.48.5 with Svelte 5.43.8
- **Runtime**: Node.js 22+
- **Package Manager**: npm

### Project Type

Progressive Web Application (PWA) for extracting recipes from Instagram posts and uploading them to Tandoor Recipe Manager.

### Architecture Style

**Hexagonal Architecture** (Ports and Adapters):

- Domain logic in `src/lib/server/`
- External system adapters: Instagram, Tandoor, LLM, Browser
- Clear separation between client and server code

### Key Technical Components

1. **Queue Management System**: In-memory FIFO queue with async processing
2. **Three-Phase Pipeline**: Extraction → Parsing → Uploading
3. **Real-Time Updates**: Server-Sent Events (SSE) for progress tracking
4. **Push Notifications**: Web Push API for background notifications
5. **PWA Features**: Service worker, manifest, install prompts

### Design Patterns Identified

- **Singleton**: QueueManager, QueueProcessor, PushNotificationService
- **Factory**: createLLM(), createBrowserContext(), initializeBrowser()
- **Observer**: Queue subscription system, SSE streaming
- **Adapter**: Instagram, Tandoor, LLM, Browser adapters
- **Strategy**: Multiple extraction methods with fallback

### Dependencies Overview

**Production** (6 dependencies):

- Browser automation: `playwright`
- LLM integration: `openai`
- Utilities: `uuid`, `date-fns`, `zod`

**Development** (26+ dependencies):

- Framework: `@sveltejs/kit`, `svelte`, `vite`
- Testing: `vitest`, `@vitest/browser-playwright`
- Styling: `tailwindcss`
- Tooling: `typescript`, `eslint`, `prettier`

### File Structure

```
52 total TypeScript/JavaScript files
├── 39 TypeScript files (.ts)
├── 10+ Svelte components (.svelte)
├── 3 JavaScript config files (.js)
└── Multiple test files (.spec.ts)
```

### Code Quality Indicators

- **Strict TypeScript**: `strict: true` enabled
- **Comprehensive Testing**: 138 tests across unit, integration, and browser tests
- **Linting**: ESLint with TypeScript and Svelte plugins
- **Formatting**: Prettier with Svelte and Tailwind plugins
- **Type Safety**: Zod schemas for runtime validation

### Environment Configuration

Key variables:

- `OPENAI_API_KEY` - LLM access
- `TANDOOR_URL` - Recipe manager URL (optional)
- `TANDOOR_TOKEN` - API authentication (optional)
- `QUEUE_CONCURRENCY` - Processing limit (default: 2)
- `QUEUE_MAX_RETRIES` - Retry attempts (default: 3)

### Deployment Setup

- **Docker**: Dockerfile with Node.js 22 Alpine + Chromium
- **HTTPS**: Local SSL certificates for PWA features
- **Production**: Node.js adapter for SvelteKit

### Notable Features

1. **Multi-Method Extraction**: 4-strategy cascade with intelligent fallback
2. **Progress Tracking**: Real-time callbacks throughout extraction pipeline
3. **Thumbnail Validation**: HTTP status code checking for image URLs
4. **Retry Logic**: Configurable retry attempts for failed extractions
5. **Scheduler**: Background task execution with authentication
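
The multi-method extraction in item 1 can be pictured as an ordered list of strategies that are tried until one succeeds, with progress reported along the way. The sketch below is an illustration only: `ExtractionStrategy`, `extractWithFallback`, and the progress callback shape are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical shape of one extraction method; the real adapters may differ.
interface ExtractionStrategy<T> {
  name: string;
  run: (url: string) => Promise<T>;
}

// Try each strategy in order and return the first successful result.
// Failures are collected so the final error explains what was attempted.
async function extractWithFallback<T>(
  url: string,
  strategies: ExtractionStrategy<T>[],
  onProgress: (message: string) => void = () => {}
): Promise<{ strategy: string; result: T }> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    onProgress(`Trying ${strategy.name}`);
    try {
      const result = await strategy.run(url);
      return { strategy: strategy.name, result };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new Error(`All extraction strategies failed (${failures.join('; ')})`);
}
```

Keeping the strategies in a plain array means the order of the cascade is data, so a method can be added or reordered without touching the loop.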

---

## Technical Debt & Opportunities

### Identified Issues

1. **Deprecated Endpoints**: `/api/extract` returns 410 Gone (migration helper)
2. **In-Memory Queue**: No persistence - items lost on server restart
3. **Single Instance**: Queue state not shared across multiple server instances

### Potential Improvements

1. **Queue Persistence**: Redis or database-backed queue for durability
2. **Horizontal Scaling**: Shared queue state for multi-instance deployments
3. **Rate Limiting**: Instagram request throttling to avoid blocks
4. **Caching**: Extracted content caching to reduce redundant processing

---

## Research Findings

_This section will be populated by the Planner agent during task analysis._

### [Planner] Research Notes - RECIPE-0001 (2026-02-15)

**Task:** Fix model loading issue and frontend error display

#### Issue 1: Model Loading - "400 No models loaded"

**Research Date:** 2026-02-15
**Source:** Stack trace analysis, OpenAI SDK documentation, LM Studio/LiteLLM API patterns

**Problem Analysis:**

- Error occurs at `detectRecipe()` in [src/lib/server/parser.ts](src/lib/server/parser.ts#L30)
- OpenAI-compatible APIs (LM Studio, LiteLLM, Ollama, etc.) often require models to be explicitly loaded
- Current implementation assumes model is already loaded
- Error message contains provider-specific instructions ("use the 'lms load' command")

**OpenAI-Compatible Model Loading Patterns:**

1. **LM Studio**: Uses `/v1/models` endpoint to list available models
   - Loaded models appear in response with `"id": "model-name"`
   - No programmatic loading endpoint (manual load in UI)
2. **LiteLLM**: Uses `/v1/models` to list loaded models
   - Models must be configured in server startup
   - No dynamic loading endpoint
3. **Ollama**: Uses `/api/tags` for model list and `/api/pull` for downloading models
   - Native API has a different structure (no `/v1` prefix), although an OpenAI-compatible `/v1` layer is also available
4. **Generic OpenAI-compatible**: Most follow OpenAI's `/v1/models` endpoint
   - No standard for dynamic model loading
   - Usually require pre-configuration

**Solution Approach:**

- Check if model exists via `client.models.list()`
- If model not found/loaded, provide clear user-facing error
- Remove provider-specific error messages
- Add notification when model check succeeds
- Consider future enhancement: detect provider type and attempt auto-load if supported
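
A minimal sketch of the availability check, under stated assumptions: `ModelListingClient` is a structural stand-in for the OpenAI SDK client (so the snippet runs without the SDK), and `ModelNotAvailableError` and `assertModelAvailable` are hypothetical names. It relies on `client.models.list()` being consumable with `for await`, as the v4 SDK's paginated results are.

```typescript
// Structural stand-in for the OpenAI SDK client; only the part used here.
interface ModelListingClient {
  models: { list(): AsyncIterable<{ id: string }> };
}

// Hypothetical error type so callers can tell "model missing" from other failures.
class ModelNotAvailableError extends Error {
  readonly model: string;
  readonly available: string[];

  constructor(model: string, available: string[]) {
    super(`LLM model "${model}" is not available on the configured endpoint.`);
    this.name = 'ModelNotAvailableError';
    this.model = model;
    this.available = available;
  }
}

// Resolve when the configured model is listed; otherwise throw a
// provider-neutral error that carries the models the endpoint did report.
async function assertModelAvailable(
  client: ModelListingClient,
  model: string
): Promise<void> {
  const available: string[] = [];
  for await (const entry of client.models.list()) {
    if (entry.id === model) return;
    available.push(entry.id);
  }
  throw new ModelNotAvailableError(model, available);
}
```

The error message deliberately avoids provider-specific instructions; the `available` list lets the queue processor tell the user what the endpoint actually serves.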

**Files Affected:**

- [src/lib/server/llm.ts](src/lib/server/llm.ts) - Add model availability check
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Handle model not loaded error
- [src/lib/server/queue/QueueProcessor.ts](src/lib/server/queue/QueueProcessor.ts) - User notification

---

#### Issue 2: Frontend Error Display - "[object Object]"

**Research Date:** 2026-02-15
**Source:** Code analysis of QueueItemCard.svelte, types.ts, QueueManager.ts

**Problem Analysis:**

- Error structure is an object: `{ phase, message, recoverable, timestamp }`
- Frontend displays `{item.error}` directly (line 205 of QueueItemCard.svelte)
- Svelte stringifies the interpolated object, which yields "[object Object]"

**Current Implementation:**

```typescript
// types.ts - Error is an object
error?: {
  phase: ProcessingPhase;
  message: string;
  recoverable: boolean;
  timestamp: string;
}

// QueueItemCard.svelte line 205 - Displays object directly
<div class="text-sm text-red-700 mt-1">{item.error}</div>
```

**Solution:**
Change to: `{item.error?.message || item.error}`

- Handles object error (gets .message)
- Handles legacy string errors (fallback)
- Type-safe with optional chaining
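
The same fallback can be expressed as a small helper, which also covers an object error whose `message` is empty. This is a sketch only: `formatQueueError` is a hypothetical name, and `phase` is simplified to a plain string instead of `ProcessingPhase`.

```typescript
// Shape of the structured error from types.ts (phase simplified to string).
interface QueueItemError {
  phase: string;
  message: string;
  recoverable: boolean;
  timestamp: string;
}

// Return display text for a structured error, a legacy string error, or no error.
function formatQueueError(error: QueueItemError | string | undefined): string {
  if (error === undefined) return '';
  if (typeof error === 'string') return error;
  return error.message || `Unknown error during ${error.phase}`;
}
```

In the component this would be used as `{formatQueueError(item.error)}`.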

**Files Affected:**

- [src/routes/components/QueueItemCard.svelte](src/routes/components/QueueItemCard.svelte#L205) - Display error.message

---

#### Dependencies & Constraints (from ARCHITECTURE.md)

- Using `openai@^4.20.0` SDK
- Environment: `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `LLM_MODEL`
- Current config example: `http://192.168.1.10:1234/v1` (LM Studio)
- Must maintain OpenAI-compatible API contract
- No assumption about specific provider implementation

#### Code Style Requirements (from CODE_STYLE.md)

- Use SvelteKit `$env/dynamic/private` for env vars (already correct)
- Error handling: try-catch with descriptive messages
- Console logging: `[Component] Message` format
- Type safety: TypeScript strict mode enabled

<!-- Planner appends findings here -->

---

### [Developer] Implementation Notes

<!-- Developer appends findings here -->

---

### [Reviewer] Review Notes

<!-- Reviewer appends findings here -->

---

## API Endpoint Catalog

### Active Endpoints

#### Queue Management

- `POST /api/queue` - Enqueue Instagram URL for processing
- `GET /api/queue` - List queue items (supports filtering, pagination)
- `GET /api/queue/stream` - SSE stream for real-time updates
- `GET /api/queue/{id}` - Get specific queue item details
- `DELETE /api/queue/{id}` - Remove item from queue
- `POST /api/queue/{id}/retry` - Retry failed extraction

#### Push Notifications

- `POST /api/notifications/subscribe` - Subscribe to push notifications
- `DELETE /api/notifications/subscribe` - Unsubscribe from notifications
- `GET /api/notifications/vapid-key` - Get VAPID public key

#### Health & Status

- `GET /api/health` - Application health check
- `GET /api/llm-health` - LLM service availability check

#### Tandoor Integration

- `POST /api/tandoor` - Upload recipe to Tandoor
- `GET /api/tandoor-config` - Get Tandoor configuration status

#### Legacy/Deprecated

- `POST /api/extract` - ⚠️ Deprecated (returns 410 Gone)

---

## Known Constraints

### Browser Automation

- Requires Chromium/Chrome installation
- Headless mode used in production
- Cookie handling for authenticated Instagram content

### LLM Integration

- Requires OpenAI-compatible API endpoint
- Configurable model selection
- Structured output using Zod schemas

### Tandoor Integration

- Optional feature (disabled without credentials)
- Requires Tandoor API token
- Supports ingredient partitioning across steps

### SSL Requirements

- HTTPS required for Service Worker registration
- Local development uses self-signed certificates
- Certificates managed via external Caddy CA

---

## Testing Coverage

### Test Distribution

- **Unit Tests**: Core logic validation
- **Integration Tests**: Multi-component workflows
- **API Tests**: Endpoint behavior verification
- **Browser Tests**: Svelte component rendering

### Test Files

- `queue-manager.spec.ts`
- `queue-processor.spec.ts`
- `queue-api.spec.ts`
- `queue-sse.spec.ts`
- `scheduler.spec.ts`
- `instagram-url-validation.spec.ts`
- `thumbnail-validation.spec.ts`
- `extraction-url-validation.integration.spec.ts`
- `page.svelte.spec.ts`

### Mock Strategy

- Environment variables mocked via `vi.mock('$env/dynamic/private')`
- External services mocked at module level
- Browser automation mocked for unit tests

---

## Documentation Inventory

### Existing Documentation

- `README.md` - Project overview and setup
- `docs/API.md` - API endpoint specifications
- `docs/MIGRATION.md` - Migration guides
- `docs/SVELTEKIT_SSR_GUIDE.md` - SSR implementation notes
- `docs/TESTING.md` - Testing guide and mocking patterns
- `docs/Tandoor (2.3.6).yaml` - OpenAPI spec for Tandoor

### Plan Documentation

`docs/plans/` contains 20+ implementation plans:

- Execution plans for completed features
- Technical specifications
- Story breakdowns with acceptance criteria

### Outcome Documentation

`docs/outcomes/` contains 20+ outcome reports:

- Implementation summaries
- Changes made
- Testing results
- Lessons learned

---

## Agent Pipeline Notes

### Build Commands

- **Build**: `npm run build`
- **Test**: `npm test` (alias for `npm run test:unit -- --run`)
- **Dev**: `npm run dev`
- **Lint**: `npm run lint`
- **Format**: `npm run format`

### Development Workflow

1. Make changes in `src/`
2. Run tests: `npm test`
3. Verify build: `npm run build`
4. Test locally: `npm run dev`

### Continuous Integration

- ESLint checks code quality
- Prettier enforces formatting
- TypeScript checks type safety
- Vitest runs test suite

---

## Next Steps

This document will be updated by subsequent agents:

1. **Planner**: Append research findings and analysis
2. **Developer**: Document implementation discoveries
3. **Reviewer**: Record review observations and recommendations

---

### [Planner] Research Notes - RECIPE-0002 (2026-02-16)

**Task:** Complete PWA implementation (installability, push notifications, share target)

#### PWA Documentation Research

**Research Date:** 2026-02-16
**Sources:** MDN Web Docs, web.dev, W3C specifications

**Progressive Web Apps (PWA) - Key Requirements:**

1. **Web App Manifest** (`manifest.json`)
   - Required members: `name` or `short_name`, `icons` (192x192 PNG minimum), `start_url`, `display`
   - Share target support via `share_target` member (method, action, params)
   - Icons should include 192x192 and 512x512 sizes for optimal display
   - Browser compatibility: Chrome/Edge (full), Firefox/Safari (limited for share_target)

2. **Service Worker**
   - Must be registered to enable offline functionality
   - Lifecycle: install → activate → fetch events
   - Required for push notifications
   - Must be served over HTTPS (or localhost)

3. **HTTPS Requirement**
   - Mandatory for service worker registration
   - Required for push notifications and other secure contexts
   - Local development: `http://localhost` is treated as secure

4. **Installability Criteria** (from MDN/web.dev):
   - Valid manifest with required members
   - Service worker registered with fetch event handler
   - Served over HTTPS
   - At least one 192x192 PNG or SVG icon
   - Display mode set (fullscreen, standalone, minimal-ui)

**Push Notifications (Web Push API):**

- Requires service worker to receive push events
- VAPID authentication (application server keys) required for Chrome
- Subscription process: permission → subscribe → store subscription → send push
- Push service (browser vendor controlled) routes messages
- Notification permissions: default, granted, denied
- Best practice: request permission after user interaction
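
For the subscribe step, the VAPID public key is typically delivered as a base64url string and converted to bytes before being passed as `applicationServerKey`. The helper below is a common pattern for that conversion, shown as a sketch; the function name is a convention, not something confirmed from `PushNotificationManager.ts`.

```typescript
// Convert a base64url string (the usual encoding of a VAPID public key) into bytes.
function urlBase64ToUint8Array(base64Url: string): Uint8Array {
  // Restore the padding that base64url omits, then map back to the base64 alphabet.
  const padding = '='.repeat((4 - (base64Url.length % 4)) % 4);
  const base64 = (base64Url + padding).replace(/-/g, '+').replace(/_/g, '/');
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

In the browser, the result would be passed to `registration.pushManager.subscribe({ userVisibleOnly: true, applicationServerKey })` once notification permission has been granted.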

**Web Share Target API:**

- Registers PWA as share destination
- Configuration via manifest `share_target` member
- Supports GET or POST methods
- `params` define query string mapping (title, text, url)
- Files can be shared via POST with `multipart/form-data`
- Currently Chrome/Edge only (experimental)
- App must be installed to appear in share sheet

#### Current Implementation Analysis

**Research Date:** 2026-02-16
**Files Analyzed:** manifest.json, service-worker.ts, app.html, svelte.config.js, PWAInstallManager.ts, PushNotificationManager.ts

**Manifest Analysis (`static/manifest.json`):**

- ✅ Has all required PWA members (name, short_name, start_url, display, scope, theme_color, background_color)
- ✅ Share target configured correctly (GET /share with title/text/url params)
- ⚠️ Icons reference `/favicon.png` but file does NOT exist in static folder
- ⚠️ Uses same icon path for both 192x192 and 512x512 sizes
- ℹ️ Missing optional but recommended members: `description`, `screenshots`, `categories`

**Service Worker Analysis (`src/service-worker.ts`):**

- ✅ Native SvelteKit service worker (migrated from vite-pwa plugin)
- ✅ Install event: caches all build assets and static files
- ✅ Activate event: cleans up old caches
- ✅ Fetch event: cache-first for assets, network-first with cache fallback for others
- ✅ Push event handler: processes push messages, shows notifications with actions
- ✅ Notification click handler: opens/focuses app, handles action buttons
- ✅ Notification close handler: tracks dismissals
- ✅ Background sync handler: supports retry operations
- ✅ Message handler: supports service worker communication
- ✅ Global error handlers present

**Service Worker Registration (`svelte.config.js`):**

- ✅ `serviceWorker.register: true` enabled
- ✅ SvelteKit handles registration automatically

**Manifest Link (`src/app.html`):**

- ✅ `<link rel="manifest" href="/manifest.json">` present in head

**Client-Side Managers:**

- ✅ `PushNotificationManager.ts`: Full implementation with permission, subscribe, unsubscribe
- ✅ `PWAInstallManager.ts`: beforeinstallprompt handling, install prompt triggering
- ✅ Both are SSR-safe with browser guards

**Share Target (`/share` route):**

- ✅ Route exists at `src/routes/share/+page.svelte`
- ✅ Parses query params (text, url) from share target
- ✅ Extracts Instagram URLs from shared text
- ✅ Auto-processes URLs on mount
- ✅ Enqueues items and redirects to dashboard
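
A rough sketch of the URL extraction step, assuming the share target delivers `text` and `url` query params as configured in the manifest. The regular expression and the name `extractInstagramUrls` are illustrative assumptions; the pattern actually used in the `/share` route may accept different path types.

```typescript
// Hypothetical pattern: post, reel, and TV links on instagram.com.
const INSTAGRAM_URL_PATTERN =
  /https?:\/\/(?:www\.)?instagram\.com\/(?:p|reel|reels|tv)\/[A-Za-z0-9_-]+/g;

// Collect unique Instagram URLs from the `url` and `text` share parameters.
function extractInstagramUrls(params: URLSearchParams): string[] {
  const candidates = [params.get('url'), params.get('text')]
    .filter((value): value is string => value !== null)
    .join(' ');
  const matches = candidates.match(INSTAGRAM_URL_PATTERN) ?? [];
  // De-duplicate: the same link may arrive in both `url` and `text`.
  return Array.from(new Set(matches));
}
```

Scanning `text` as well as `url` matters because the shared link is not always placed in the `url` field; tracking query strings after the shortcode are dropped by the pattern.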

**Icons/Assets Issue:**

- ⚠️ **CRITICAL**: `manifest.json` references `/favicon.png` but file doesn't exist
- ✅ `src/lib/assets/favicon.svg` exists (used in layout)
- ⚠️ No PNG icons in `static/` folder
- ⚠️ Service worker references `/favicon.png` for notifications

**Push Notifications Infrastructure:**

- ✅ VAPID keys configured in `queueConfig.push` (uses env vars or defaults)
- ✅ Server endpoint: `/api/notifications/vapid-key` (GET)
- ✅ Server endpoint: `/api/notifications/subscribe` (POST/DELETE)
- ✅ PushNotificationService stores subscriptions in-memory
- ℹ️ Note: Subscriptions are not persisted (lost on restart)

#### What Works Already:

1. **PWA Structure**: Complete native SvelteKit PWA implementation
2. **Service Worker**: Fully functional with caching, push, notifications
3. **Push Notifications**: Client and server infrastructure in place
4. **Share Target**: Configured in manifest and `/share` route working
5. **Install Prompts**: PWAInstallManager ready to trigger install
6. **HTTPS**: App served at https://localhost:5173/

#### What Needs Attention:

1. **Icons**: Create PNG icons (192x192, 512x512) from existing SVG
2. **Icon Verification**: Ensure icons are properly sized and optimized
3. **Installability Testing**: Verify all criteria met via chrome://pwa-internals
4. **Push Notification Testing**: Verify VAPID key generation and push flow
5. **Share Target Testing**: Test share from external apps (Instagram)
6. **Manifest Enhancement**: Add description, categories for better discoverability

#### Dependencies & Constraints (from ARCHITECTURE.md, CODE_STYLE.md):

- Using native SvelteKit PWA (no plugins needed)
- Service worker: `$service-worker` module provides build, files, version
- Environment: uses `$env/dynamic/private` for server configs
- HTTPS required (already configured at https://localhost:5173/)
- TypeScript strict mode enabled
- All file paths must use SvelteKit path aliases (`$lib`, `$service-worker`)

#### Code Style Requirements (from CODE_STYLE.md):

- File naming: manifest.json, service-worker.ts, lowercase for utilities
- Type annotations required for public APIs
- SSR-safe code: all browser API usage must be guarded with `browser` check
- Error handling: try-catch with descriptive messages
- Comments: JSDoc for public APIs, inline for complex logic

---

### [Planner] Research Notes - RECIPE-0003 (2026-02-16)

**Task:** Update application icon and configure Docker deployment

#### PWA Icon Generation - icon-source.png

**Research Date:** 2026-02-16
**Source:** Project analysis, PWA best practices, sharp documentation

**Icon Source File:**

- Location: `static/icon-source.png`
- Size: 672KB PNG file
- Format: PNG with transparency (confirmed via file analysis)
- Destination sizes: 192x192 (favicon.png), 512x512 (icon-512.png)

**PWA Icon Requirements:**
From RECIPE-0002 research and W3C Web App Manifest specification:

1. **Minimum Size**: 192x192 pixels (required for PWA installability)
2. **Recommended Size**: 512x512 pixels (for splash screens, high-DPI displays)
3. **Format**: PNG with transparency support
4. **Purpose**: "any maskable" for optimal Android compatibility
5. **Location**: static/ directory (served at root path)

**Sharp Library Configuration:**

- Version: 0.34.5 (already in dependencies)
- Method: resize() with fit: 'contain' to preserve aspect ratio
- Background: transparent (rgba 0,0,0,0)
- Format: PNG with optimization
- Quality: Default compression for web delivery

**Implementation Pattern:**

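```javascript
import sharp from 'sharp';

// Resize the source icon to 192x192, padding with transparency instead of cropping
await sharp('static/icon-source.png')
	.resize(192, 192, {
		fit: 'contain',
		background: { r: 0, g: 0, b: 0, alpha: 0 }
	})
	.png()
	.toFile('static/favicon.png');
```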

**Rationale:**

- `fit: 'contain'` preserves aspect ratio without cropping
- Transparent background maintains icon transparency
- PNG is the most widely accepted icon format for installability checks (the Web App Manifest spec itself does not mandate a format)
- Same approach for both 192x192 and 512x512 variants

---

#### Docker Volume Configuration

**Research Date:** 2026-02-16
**Source:** Codebase analysis, Dockerfile, scheduler.ts, extraction.ts

**Volume Requirements Analysis:**
From code analysis, only one persistent volume is required:

**1. /app/secrets - Instagram Authentication Storage**

- **Purpose**: Persist Instagram session cookies across container restarts
- **File**: auth.json (Playwright storage state)
- **Usage**:
  - scheduler.ts: Checks `/app/secrets/auth.json` for Docker deployments
  - extraction.ts: Loads authentication from `/app/secrets/auth.json`
  - gen-auth.js: Browser automation saves session to secrets/auth.json
- **Rationale**: Prevents re-login on every container restart
- **Docker Path**: /app/secrets
- **Host Path**: ./secrets (relative to docker-compose.yml)

**Volumes NOT Required:**

- **Database**: Queue uses in-memory storage (QueueManager.ts)
- **Cache**: Service worker cache is ephemeral
- **Uploads**: No file upload functionality
- **Logs**: Console logs to stdout/stderr (Docker logging)
- **Build artifacts**: Built into image at build time

**VOLUME Directive:**

```dockerfile
VOLUME ["/app/secrets"]
```

**docker-compose.yml Volume Mount:**

```yaml
volumes:
  - ./secrets:/app/secrets
```

---

#### Environment Variable Inventory

**Research Date:** 2026-02-16
**Source:** queue/config.ts, llm.ts, tandoor-config.ts, scheduler.ts

**Comprehensive Variable List:**

**LLM Configuration (REQUIRED):**

- `OPENAI_BASE_URL` - OpenAI-compatible API endpoint
- `OPENAI_API_KEY` - API authentication key
- `LLM_MODEL` - Model identifier (default: gpt-4o)

**Queue Configuration (OPTIONAL):**

- `QUEUE_CONCURRENCY` - Parallel processing limit (default: 2)
- `QUEUE_MAX_RETRIES` - Retry attempts (default: 3)

**Tandoor Integration (OPTIONAL):**

- `TANDOOR_ENABLED` - Enable Tandoor upload (default: false)
- `TANDOOR_SERVER_URL` - Tandoor base URL
- `TANDOOR_SPACE` - Space ID (default: 1)
- `TANDOOR_TOKEN` - API token

**Push Notifications (OPTIONAL):**

- `VAPID_PUBLIC_KEY` - Web Push public key (has default)
- `VAPID_PRIVATE_KEY` - Web Push private key (has default)

**Authentication Scheduler (OPTIONAL):**

- `AUTH_SCHEDULER_ENABLED` - Enable auto-renewal (default: false)
- `AUTH_SCHEDULER_INTERVAL_MINUTES` - Renewal interval (default: 720)

**Runtime Configuration:**

- `NODE_ENV` - Environment mode (production/development)
- `PORT` - SvelteKit port (default: 3000)
- `DISPLAY` - X11 display for Playwright (set to :99 in docker-compose.yml)

**Default Values:**
All variables have sensible defaults except:

- OPENAI_BASE_URL (required)
- OPENAI_API_KEY (required)
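
The defaults and required variables listed above could be enforced in one place at startup. The following is a sketch under assumptions: `loadConfig`, `AppConfig`, and `intOrDefault` are hypothetical names, only a subset of the variables is shown, and treating `TANDOOR_ENABLED` as the literal string `'true'` is a guess at the parsing rule.

```typescript
type Env = Record<string, string | undefined>;

// Subset of the inventory; field names are illustrative.
interface AppConfig {
  openaiBaseUrl: string;
  openaiApiKey: string;
  llmModel: string;
  queueConcurrency: number;
  queueMaxRetries: number;
  tandoorEnabled: boolean;
}

// Parse an integer >= min, falling back to the default when unset or invalid.
function intOrDefault(value: string | undefined, fallback: number, min: number): number {
  const parsed = Number.parseInt(value ?? '', 10);
  return Number.isInteger(parsed) && parsed >= min ? parsed : fallback;
}

// Fail fast on missing required variables, apply documented defaults otherwise.
function loadConfig(env: Env): AppConfig {
  const missing = ['OPENAI_BASE_URL', 'OPENAI_API_KEY'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    openaiBaseUrl: env.OPENAI_BASE_URL as string,
    openaiApiKey: env.OPENAI_API_KEY as string,
    llmModel: env.LLM_MODEL || 'gpt-4o',
    queueConcurrency: intOrDefault(env.QUEUE_CONCURRENCY, 2, 1),
    queueMaxRetries: intOrDefault(env.QUEUE_MAX_RETRIES, 3, 0),
    tandoorEnabled: env.TANDOOR_ENABLED === 'true' // assumed parsing rule
  };
}
```

On the server this would be called with the `env` object from `$env/dynamic/private`, which keeps the values resolved at runtime.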

**VAPID Keys:**
Current defaults in queue/config.ts:

- Public: BNextdcB_fQ0BVvyGioM5L8Tf9vKQjs-WnF-rUbnU8MdWIZQYfggIHxBnW21I-lq_0HykLCdMpYj8d5joavWdxQ
- Private: JwxI_KcsBcehYcTOufMcbVWJjCq1QbH5FJmSyQuG680
- Note: These should be regenerated for production deployments

**Variable Access Pattern:**

- Server-side only: Uses `$env/dynamic/private` from SvelteKit
- No client-side environment variable exposure
- Runtime configuration (no build-time substitution)

---

#### Docker Health Check Configuration

**Research Date:** 2026-02-16
**Source:** routes/api/health/+server.ts analysis

**Health Check Endpoint:**

- Path: `/api/health`
- Method: GET
- Response: 200 OK with JSON body
- Implementation: `src/routes/api/health/+server.ts`

**Health Check Response:**

```json
{
	"status": "ok",
	"timestamp": "2026-02-16T..."
}
```

**Docker Health Check Configuration:**

```yaml
healthcheck:
  test:
    [
      'CMD',
      'node',
      '-e',
      "fetch('http://localhost:3000/api/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
    ]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 40s
```

**Rationale:**

- `interval: 30s` - Balance between responsiveness and overhead
- `timeout: 10s` - Generous upper bound for a single health request
- `retries: 3` - Allow transient failures
- `start_period: 40s` - Accounts for app startup, including Playwright browser initialization
- Uses Node's built-in `fetch` to avoid a curl dependency in the image

---

#### Docker Deployment Constraints

**Research Date:** 2026-02-16
**Source:** Dockerfile, app.server.ts, browser.ts

**Current Dockerfile Analysis:**

- Base: node:22-alpine (minimal, production-ready)
- Chromium: Installed via apk (headless browser for Instagram extraction)
- Fonts: liberation-fonts, noto, noto-cjk (text rendering)
- Build: npm ci + npm run build
- Runtime: Node.js ESM import
- Port: 3000 (EXPOSE)
- Environment: NODE_ENV=production

**Browser Initialization:**
From app.server.ts:

- initializeBrowser() called on server start
- Graceful shutdown handlers (SIGTERM, SIGINT)
- Critical for extraction.ts Playwright usage

**Security Options:**

- `seccomp=unconfined` - Required for Chromium sandbox
- `--no-sandbox` in browser.ts launch args
- Necessary for containerized Chromium

**No Changes Required:**
Current Dockerfile is production-ready, only needs VOLUME addition.

---

### [Planner] Research Notes - RECIPE-0003 Iteration 1 (2026-02-16)

**Task:** Fix Docker deployment issues (Alpine packages, Playwright installation)

#### Alpine Linux Font Packages

**Research Date:** 2026-02-16
**Source:** https://wiki.alpinelinux.org/wiki/Fonts, Alpine package database

**Incorrect Package Names in Current Dockerfile:**

1. `liberation-fonts` → No such package (ERROR)
2. `noto` → No such package (ERROR)
3. `noto-cjk` → No such package (ERROR)

**Correct Alpine Font Package Names:**

1. `font-liberation` → Correct (already in Dockerfile)
2. `font-noto` → Correct name for Noto fonts
3. `font-noto-cjk` → Correct name for Noto CJK (Chinese, Japanese, Korean) fonts

**Rationale:**

- Alpine Linux uses `font-*` prefix for all font packages
- Common mistake: using Debian/Ubuntu package names which differ from Alpine
- These fonts are essential for rendering text in Instagram content extraction

**Recommended Font Installation:**

```dockerfile
RUN apk add --no-cache \
    chromium \
    font-liberation \
    font-noto \
    font-noto-cjk
```

---

#### Playwright on Alpine Linux

**Research Date:** 2026-02-16
**Source:** https://playwright.dev/docs/docker, Playwright GitHub issues

**Official Playwright + Alpine Status:**

- **Not officially supported**: Browser builds require glibc, Alpine uses musl
- **Firefox/WebKit**: Cannot run on Alpine (glibc dependency)
- **Chromium**: Can work using system chromium package

**Problem Analysis:**

- Current Dockerfile installs system chromium via `apk add chromium`
- Playwright's `chromium.launch()` expects Playwright's own Chromium binary
- Playwright's Chromium is built for glibc environments (Ubuntu/Debian)
- `npx playwright install chromium` will download glibc binary that won't run on Alpine

**Solution: Configure Playwright to Use System Chromium**

**Approach A - Use System Chromium (Recommended):**

```typescript
// src/lib/server/browser.ts
browser = await chromium.launch({
  executablePath: '/usr/bin/chromium-browser',
  headless: true,
  args: [...]
});
```

**Environment Variable Approach:**

```dockerfile
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser
```

**Approach B - Switch to Debian Base:**

```dockerfile
FROM node:22-bookworm
RUN npx -y playwright@1.56.1 install --with-deps chromium
```

**Recommendation:**

- Use Approach A (system chromium with executablePath)
- Minimal changes to existing Alpine setup
- System chromium is already installed and working
- Avoids full base image migration

**Chromium System Dependencies:**
When using system chromium on Alpine, these packages are auto-installed as dependencies:

- ca-certificates, mesa-gbm, wayland-libs-server, libxkbcommon
- ffmpeg-libs, gtk+3.0, libexif, libevent, nss, etc. (64 total dependencies)

---

#### Playwright Version Compatibility

**Research Date:** 2026-02-16
**Source:** package.json analysis

**Current Version:** playwright@1.56.1 (production dependency)
**Chromium Version:** Bundled with Playwright 1.56.1

**System Chromium Compatibility:**

- Alpine edge: chromium 145.0.7632.75 (as of 2026-02-15)
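- Playwright 1.56.1 bundles: Chromium 141.x
- **Version mismatch is usually tolerable**: Playwright's API tends to stay compatible with nearby Chromium major versions, though only the bundled build is tested by Playwright
- System chromium is newer and is expected to work, but extraction should be smoke-tested after image rebuilds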

**executablePath Configuration:**

- Path on Alpine: `/usr/bin/chromium-browser`
- Must be set in browser.ts or via environment variable
- No additional Playwright installation needed when using system browser
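
One way to support both options is to read the path from the environment inside `browser.ts` and only set `executablePath` when it is present. The sketch below makes assumptions: `buildLaunchOptions` is a hypothetical helper, the code reads `PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH` itself and does not assume Playwright honors that variable on its own, and the real launch args in `browser.ts` are not reproduced beyond `--no-sandbox`.

```typescript
// Subset of Playwright's launch options used in this sketch.
interface ChromiumLaunchOptions {
  executablePath?: string;
  headless: boolean;
  args: string[];
}

// Build launch options, pointing at a system Chromium when a path is configured.
// Without the variable, Playwright's bundled browser would be used (local dev).
function buildLaunchOptions(env: Record<string, string | undefined>): ChromiumLaunchOptions {
  const options: ChromiumLaunchOptions = {
    headless: true,
    args: ['--no-sandbox'] // other args from browser.ts omitted
  };
  const executablePath = env.PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH;
  if (executablePath) {
    options.executablePath = executablePath;
  }
  return options;
}
```

The result would be passed as `chromium.launch(buildLaunchOptions(process.env))`, so the Alpine image sets the variable while local development leaves it unset.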

---

#### Docker Compose Configuration for Playwright

**Research Date:** 2026-02-16
**Source:** resolution_context.yaml, docker-compose.yml analysis

**Current Configuration Analysis:**

```yaml
environment:
  - DISPLAY=:99 # X11 display (not needed for headless)
security_opt:
  - seccomp=unconfined # Required for Chromium sandbox
```

**Issues:**

- `DISPLAY=:99` set but no X11 server (Xvfb) running
- Headless mode doesn't need DISPLAY
- docker-compose.yml has DISPLAY but it's unused

**Recommendation:**

- Keep `DISPLAY=:99` as harmless fallback (no changes needed)
- `seccomp=unconfined` is necessary for Chromium sandbox (keep as-is)
- No additional configuration needed for Playwright

---

### [Planner] Node.js Versions and npm Lockfile Compatibility - RECIPE-0003 Iteration 2 (2026-02-16)

**Research Date:** 2026-02-16T17:00:00.000Z
**Source:** Node.js Release Schedule, npm documentation (v10 & v11), Docker Hub

#### Problem Analysis

Docker build fails at `npm ci` with error: "package-lock.json and package.json are out of sync"

- **Root Cause**: package.json updated to Tailwind v4, but package-lock.json still contains Tailwind v3 dependencies (@csstools/\*)
- **Secondary Issue**: npm version mismatch - local (npm 11.6.2) vs Docker (npm 10.9.4)

#### Node.js LTS Status Research

**Source:** https://github.com/nodejs/release, https://nodejs.org/en/about/previous-releases

**Currently Supported Versions:**

- **Node.js 20 (Iron)**: Maintenance LTS - EOL 2026-04-30
- **Node.js 22 (Jod)**: Maintenance LTS - EOL 2027-04-30 ← Current Dockerfile
- **Node.js 24 (Krypton)**: Active LTS - EOL 2028-04-30 ← Best choice
- **Node.js 25**: Current (not LTS) - EOL 2026-06-01

**LTS Phase Definitions:**

1. **Current**: Latest features; lasts 6 months, after which odd-numbered versions become unsupported and even-numbered versions move to Active LTS
2. **Active LTS**: Audited features and updates (12 months)
3. **Maintenance**: Critical fixes only (18 months)

**Conclusion**: Node.js 24 is in Active LTS until October 2026 and therefore receives broader support than Node.js 22, which is already in Maintenance.

#### npm Lockfile Version Compatibility

**Source:** https://docs.npmjs.com/cli/v10/configuring-npm/package-lock-json, https://docs.npmjs.com/cli/v11/configuring-npm/package-lock-json

**Lockfile Version History:**

- `lockfileVersion: 1` - npm v5-v6
- `lockfileVersion: 2` - npm v7-v8 (backwards compatible with v1)
- `lockfileVersion: 3` - npm v9+ (backwards compatible with v7)

**npm Version Bundled with Node.js:**

- node:22-alpine → npm 10.9.4 (uses lockfileVersion: 3)
- node:24-alpine → npm 11.x (uses lockfileVersion: 3)
- Local environment → npm 11.6.2 (uses lockfileVersion: 3)

**Compatibility Analysis:**

- Current package-lock.json has `"lockfileVersion": 3` ✓
- npm 10 and npm 11 both support lockfileVersion: 3 ✓
- The issue is NOT version incompatibility but **stale dependency data**

**npm ci Strict Behavior:**
`npm ci` performs strict validation:

1. Requires exact match between package.json and package-lock.json
2. Does not update lockfile automatically (unlike `npm install`)
3. Fails if dependencies are missing or mismatched
4. This is intentional for reproducible builds in CI/CD
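
A rough way to spot this kind of drift before running `npm ci` is to compare the direct dependency ranges in package.json with the root entry of the lockfile. This is only an approximation of one of npm's checks, not its actual algorithm, and `findLockfileDrift` is a hypothetical helper. It relies on the `""` key of `packages` in lockfileVersion 2 and 3 mirroring the root package's dependency ranges.

```typescript
type DependencyMap = Record<string, string>;

interface PackageJson {
  dependencies?: DependencyMap;
  devDependencies?: DependencyMap;
}

// In lockfileVersion 2 and 3, packages[""] describes the root project.
interface PackageLock {
  lockfileVersion: number;
  packages: Record<string, PackageJson>;
}

// List direct dependencies whose declared range differs between the two files.
function findLockfileDrift(pkg: PackageJson, lock: PackageLock): string[] {
  const root: PackageJson = lock.packages[''] ?? {};
  const drift: string[] = [];
  for (const field of ['dependencies', 'devDependencies'] as const) {
    const declared: DependencyMap = pkg[field] ?? {};
    const locked: DependencyMap = root[field] ?? {};
    const names = Array.from(new Set(Object.keys(declared).concat(Object.keys(locked))));
    for (const name of names) {
      if (declared[name] !== locked[name]) {
        drift.push(`${field}.${name}: package.json=${declared[name]} lock=${locked[name]}`);
      }
    }
  }
  return drift;
}
```

An empty result does not guarantee that `npm ci` passes, since transitive entries can also be stale, but a non-empty result points directly at the ranges that changed without a lockfile update.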

#### Tailwind CSS v3 → v4 Migration Impact

**Source:** package.json analysis, package-lock.json inspection

**Current State:**

```json
// package.json (Tailwind v4)
"@tailwindcss/vite": "^4.1.17",
"tailwindcss": "^4.1.17"

// package-lock.json (still has Tailwind v3 transitive deps)
"@csstools/css-parser-algorithms": "3.0.5",
"@csstools/css-tokenizer": "3.0.4"
```

**Why This Happened:**

- package.json was updated to Tailwind v4
- package-lock.json was NOT regenerated afterward
- Tailwind v4 has different dependency tree than v3 (no @csstools/\*)
- `npm ci` detects mismatch and fails

#### Solution Options Analysis

**Option A: Regenerate with Docker node:22-alpine (recommended by the Review)**

```bash
docker run --rm -v "$PWD":/app -w /app node:22-alpine sh -c "rm package-lock.json && npm install"
```

- ✓ Ensures exact npm version match with deployment
- ✗ Stays on Maintenance LTS (Node 22)
- ✗ Doesn't align with local development (node 24)

**Option B: Update to node:24-alpine**

```dockerfile
FROM node:24-alpine
```

```bash
rm package-lock.json && npm install
```

- ✓ Uses Active LTS (better support)
- ✓ Aligns Docker with local development
- ✗ Changes base image (minimal risk)

**Option C: Hybrid (BEST SOLUTION)**

1. Update Dockerfile to node:24-alpine
2. Regenerate package-lock.json locally (npm 11.x matches node:24)

- ✓ Active LTS with longer support window
- ✓ Perfect alignment between local dev and Docker
- ✓ Single lockfile regeneration
- ✓ Future-proof (Active LTS until Oct 2026)

**Chosen Approach: Option C**

#### Implementation Details

**Files to Modify:**

1. `Dockerfile` - Change FROM node:22-alpine → node:24-alpine
2. `package-lock.json` - Regenerate to sync with package.json

**Verification Steps:**

1. `npm install` - Regenerate lockfile
2. `npm run build` - Verify local build
3. `npm test` - Verify all tests pass
4. `docker build` - Verify Docker build succeeds
5. `docker compose up` - Verify runtime

**No Code Changes Needed:**

- All application code remains unchanged
- .env.example already complete (no new variables)
- docker-compose.yml does not need changes (the Node version is defined by the Dockerfile base image)

---

### [Planner] Research Notes - RECIPE-0004 (2026-02-16)

**Task:** Fix .dockerignore, favicon.ico, push notifications, e2e tests, and logging serialization

#### .dockerignore Research

**Research Date:** 2026-02-16
**Source:** Project analysis, .gitignore comparison, Docker best practices

**Current State:**

- No `.dockerignore` file exists in project root
- `.gitignore` exists and excludes: node_modules, build outputs, env files, SSL certs, symlinks, prompts/

**Docker Build Context Issues:**
Without `.dockerignore`, Docker sends entire workspace to build context including:

- `node_modules/` (if exists locally) - causes conflicts with `npm ci` in Dockerfile
- `build/` outputs - unnecessary
- `.git/` directory - large, unused in container
- `prompts/` directory - development artifacts
- `.env` files - should use environment variables instead

**Recommended .dockerignore Content:**
Based on `.gitignore` and Docker best practices:

```dockerignore
node_modules
.git
build
.output
.vercel
.netlify
.wrangler
.svelte-kit
.DS_Store
Thumbs.db
.env
.env.*
!.env.example
.ssl/
vite.config.*.timestamp-*
debug_page.txt
prompts/
*.md
!README.md
.github/
.vscode/
*.log
coverage/
.vitest/
```

**Rationale:**

- Exclude development dependencies and build artifacts
- Keep README.md for documentation
- Exclude version control metadata
- Reduce build context size significantly
- Prevent conflicts with Dockerfile's npm ci
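
How the negation rules (`!.env.example`, `!README.md`) interact with the broader patterns can be sketched with a simplified matcher. This is a rough model under stated assumptions, not Docker's implementation: it supports only `*` within a single path segment and applies the rule that the last matching pattern decides. The function names are hypothetical.

```typescript
// Converts one ignore pattern to a RegExp. Assumption: "*" never crosses "/".
function patternToRegExp(pattern: string): RegExp {
	const trimmed = pattern.replace(/\/+$/, ''); // "prompts/" behaves like "prompts"
	const escaped = trimmed
		.split('*')
		.map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
		.join('[^/]*');
	// Match the path itself or anything beneath a matching directory.
	return new RegExp(`^${escaped}(/.*)?$`);
}

// Evaluates patterns in order; the last one that matches wins.
function isIgnored(path: string, patterns: string[]): boolean {
	let ignored = false;
	for (const raw of patterns) {
		const negated = raw.startsWith('!');
		const regex = patternToRegExp(negated ? raw.slice(1) : raw);
		if (regex.test(path)) {
			ignored = !negated;
		}
	}
	return ignored;
}
```

In this model `*.md` only matches Markdown files at the context root, so a nested path such as `docs/FINDINGS.md` is not excluded by it.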

---

#### Favicon 404 Error Research

**Research Date:** 2026-02-16
**Source:** Static folder analysis, browser behavior, PWA specifications

**Files Present:**

- `static/favicon.png` (192x192 PNG) ✓ exists
- `static/icon-512.png` (512x512 PNG) ✓ exists
- `static/icon-source.png` (source file) ✓ exists
- `static/manifest.json` references both PNG files ✓

**404 Source:**

- Browsers automatically request `/favicon.ico` (legacy format)
- SvelteKit serves from `static/` folder
- No `favicon.ico` file exists → 404 error

**Solution Options:**

**Option A - Create favicon.ico (Recommended):**
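Use Sharp to generate a 32x32 icon from the PNG source. Sharp has no ICO encoder, so the file contains PNG data under an `.ico` name, which current browsers generally accept:

```javascript
// New script: scripts/gen-favicon-ico.js
import sharp from 'sharp';

// Writes 32x32 PNG data to the path browsers request by default
await sharp('static/icon-source.png').resize(32, 32).png().toFile('static/favicon.ico');
```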
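
Since sharp does not provide an ICO encoder, one possible way to produce a genuine `.ico` file is to wrap the PNG bytes in a single-image ICO container, a format that permits PNG payloads. The helper below is a hypothetical sketch (the name `wrapPngAsIco` is not from the project); it only builds the 6-byte header and the 16-byte directory entry in front of the PNG data.

```typescript
// Wraps PNG bytes in a one-image ICO container. Hypothetical helper, not part of sharp.
function wrapPngAsIco(png: Buffer, size: number): Buffer {
	const header = Buffer.alloc(6);
	header.writeUInt16LE(0, 0); // reserved, must be 0
	header.writeUInt16LE(1, 2); // image type: 1 = icon
	header.writeUInt16LE(1, 4); // number of images in the file

	const entry = Buffer.alloc(16);
	entry.writeUInt8(size >= 256 ? 0 : size, 0); // width in pixels (0 encodes 256)
	entry.writeUInt8(size >= 256 ? 0 : size, 1); // height in pixels (0 encodes 256)
	entry.writeUInt8(0, 2); // palette size: 0 = no palette
	entry.writeUInt8(0, 3); // reserved
	entry.writeUInt16LE(1, 4); // color planes
	entry.writeUInt16LE(32, 6); // bits per pixel
	entry.writeUInt32LE(png.length, 8); // size of the image data in bytes
	entry.writeUInt32LE(header.length + entry.length, 12); // offset of the image data

	return Buffer.concat([header, entry, png]);
}
```

The PNG bytes could come from `sharp(...).resize(32, 32).png().toBuffer()`, with the wrapped result written to `static/favicon.ico`.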

**Option B - SvelteKit Hook Redirect:**
Add server hook to redirect /favicon.ico → /favicon.png

- More complex
- Adds runtime overhead
- Not recommended

**Chosen Approach:** Option A (generate favicon.ico during build)

---

#### Push Notifications Implementation Research

**Research Date:** 2026-02-16
**Source:** PushNotificationService.ts, web-push library docs, Web Push Protocol RFC 8030

**Current Implementation Analysis:**

**Client-Side (Complete):**

- `PushNotificationManager.ts` - Full implementation ✓
  - Permission request ✓
  - VAPID key fetch ✓
  - pushManager.subscribe() ✓
  - Server subscription registration ✓
- `service-worker.ts` - Push event handler ✓
- `NotificationSettings.svelte` - UI toggle ✓

**Server-Side (Mock Only):**

```typescript
// Current PushNotificationService.ts line 106-125
private async sendToSubscription(subscription: PushSubscription, data: any): Promise<void> {
  // In production, use web-push library:
  // [COMMENTED OUT CODE]

  // For development, we'll log the notification
  console.log(`[PushService] Would send push notification:`, {
    endpoint: subscription.endpoint,
    data: data
  });

  await new Promise(resolve => setTimeout(resolve, 100)); // Simulate
}
```

**Problem:** Push notifications are logged but never actually sent to browser.

**Web Push Library Integration:**

**1. Install Dependency:**

```json
// package.json
{
	"dependencies": {
		"web-push": "^3.6.7"
	}
}
```

**2. Implementation Pattern:**

```typescript
import webpush from 'web-push';

// On init
webpush.setVapidDetails('mailto:your-email@example.com', vapidPublicKey, vapidPrivateKey);

// In sendToSubscription
await webpush.sendNotification(subscription, JSON.stringify(payload), {
	TTL: 60 * 60 * 24 // 24 hours
});
```

**3. Configuration Requirements:**

- VAPID keys already configured in `queueConfig.push`
- Default keys present (should regenerate for production)
- Email contact required by spec (add env var)

**Files to Modify:**

- `package.json` - add web-push dependency
- `src/lib/server/notifications/PushNotificationService.ts` - implement actual sending
- `src/lib/server/queue/config.ts` - add VAPID_EMAIL env var
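
Once real sending is implemented, failed deliveries need handling. A minimal sketch, assuming the sender rejects with an error that carries a numeric `statusCode` (as web-push's `WebPushError` does) and that the push service answers 404 or 410 for subscriptions that no longer exist. The `Map` store and the `sendToAll` name are hypothetical; the sender is injected so the logic can be exercised without the network.

```typescript
// Injected sender; in the service this would wrap webpush.sendNotification().
type SendFn<T> = (subscription: T, payload: string) => Promise<unknown>;

// Sends one payload to every stored subscription and prunes expired ones.
async function sendToAll<T extends { endpoint: string }>(
	subscriptions: Map<string, T>,
	payload: object,
	send: SendFn<T>
): Promise<{ sent: number; removed: number }> {
	const body = JSON.stringify(payload);
	let sent = 0;
	let removed = 0;

	// Copy the entries so deleting during iteration is safe.
	for (const [id, subscription] of [...subscriptions]) {
		try {
			await send(subscription, body);
			sent++;
		} catch (error) {
			const statusCode = (error as { statusCode?: number }).statusCode;
			if (statusCode === 404 || statusCode === 410) {
				// The browser unsubscribed or the subscription expired.
				subscriptions.delete(id);
				removed++;
			} else {
				console.error('[PushService] Send failed:', subscription.endpoint, error);
			}
		}
	}
	return { sent, removed };
}
```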

---

#### Manual Push Notification Test Button Research

**Research Date:** 2026-02-16
**Source:** NotificationSettings.svelte, PushNotificationService API

**Current UI:**

- Only has enable/disable toggle
- No manual trigger for testing different notification types

**Test Button Requirements:**

1. Trigger different notification types:
   - Success notification (recipe completed)
   - Error notification (parsing failed)
   - Progress notification (extraction in progress)
2. Send to own subscription only
3. Debug output showing notification payload

**Implementation Approach:**

**Frontend Component:**
Add to `NotificationSettings.svelte`:

```svelte
<script lang="ts">
	async function testNotification(type: 'success' | 'error' | 'progress') {
		await fetch('/api/notifications/test', {
			method: 'POST',
			headers: { 'Content-Type': 'application/json' },
			body: JSON.stringify({ type })
		});
	}
</script>

<!-- Pass a function: onclick={testNotification('success')} would call it during render -->
<button onclick={() => testNotification('success')}>Test Success</button>
<button onclick={() => testNotification('error')}>Test Error</button>
<button onclick={() => testNotification('progress')}>Test Progress</button>
```

**Backend Endpoint:**
New file: `src/routes/api/notifications/test/+server.ts`

```typescript
import { json } from '@sveltejs/kit';
import type { RequestHandler } from './$types';

export const POST: RequestHandler = async ({ request }) => {
	const { type } = await request.json();

	const payload = {
		success: {
			/* ... */
		},
		error: {
			/* ... */
		},
		progress: {
			/* ... */
		}
	}[type];

	await pushNotificationService.sendNotification(payload);
	return json({ success: true });
};
```
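
The elided payloads and the missing input validation could look like the sketch below. The payload fields (`title`, `body`, `tag`) and their texts are placeholders, not the project's actual notification format; the type guard lets the endpoint reject unknown `type` values before anything is sent.

```typescript
type TestNotificationType = 'success' | 'error' | 'progress';

// Placeholder payload shape; the real service may use different fields.
interface TestPayload {
	title: string;
	body: string;
	tag: string;
}

const TEST_TYPES: readonly string[] = ['success', 'error', 'progress'];

// Narrows untrusted request input to one of the supported types.
function isTestNotificationType(value: unknown): value is TestNotificationType {
	return typeof value === 'string' && TEST_TYPES.includes(value);
}

function buildTestPayload(type: TestNotificationType): TestPayload {
	const payloads: Record<TestNotificationType, TestPayload> = {
		success: { title: 'Recipe completed', body: 'Test: recipe was uploaded', tag: 'test-success' },
		error: { title: 'Parsing failed', body: 'Test: recipe could not be parsed', tag: 'test-error' },
		progress: { title: 'Extraction in progress', body: 'Test: extracting caption', tag: 'test-progress' }
	};
	return payloads[type];
}
```

In the handler, a request whose `type` fails `isTestNotificationType` could be answered with a 400 response instead of passing `undefined` to the notification service.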

---

#### Playwright E2E Push Notification Testing Research

**Research Date:** 2026-02-16
**Source:** Playwright API docs (BrowserContext.grantPermissions), existing test patterns

**Playwright Push Notification Testing Pattern:**

**Key Methods:**

1. `context.grantPermissions(['notifications'])` - Grant permission without prompt
2. `page.evaluate()` - Access PushManager in browser context
3. `context.waitForEvent('serviceworker')` - Wait for the service worker to register (Chromium only)

**Test Structure:**

```typescript
// New file: src/tests/push-notifications.e2e.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Push Notifications E2E', () => {
	test('should subscribe to push notifications', async ({ browser }) => {
		const context = await browser.newContext();
		await context.grantPermissions(['notifications']);

		const page = await context.newPage();
		await page.goto('http://localhost:5173');

		// Click notification toggle
		await page.getByRole('button', { name: /enable notifications/i }).click();

		// Verify subscription created; toJSON() makes the PushSubscription serializable
		const subscription = await page.evaluate(async () => {
			const reg = await navigator.serviceWorker.ready;
			const sub = await reg.pushManager.getSubscription();
			return sub ? sub.toJSON() : null;
		});

		expect(subscription).toBeTruthy();
		expect(subscription?.endpoint).toBeDefined();

		await context.close();
	});
});
```

**Test Coverage:**

1. Permission grant flow
2. Subscription creation via PushManager
3. Server registration (POST /api/notifications/subscribe)
4. Manual test notification trigger
5. Subscription persistence in localStorage
6. Unsubscribe flow

**Vitest Configuration:**
Current project uses Vitest with @vitest/browser-playwright:

- Already configured for browser tests
- Playwright already installed (playwright@^1.56.1)
- Pattern: `*.e2e.spec.ts` for e2e tests vs `*.spec.ts` for unit tests

---

#### Logging Serialization Research

**Research Date:** 2026-02-16
**Source:** Codebase grep analysis, Node.js console behavior, error object structure

**Problem Analysis:**

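**Root Cause:**
Objects that are coerced to strings (template literals or `+` concatenation) are logged as `[object Object]`. Passing them to `console.*` as separate arguments avoids the coercion, but Node.js then abbreviates nesting beyond depth 2 as `[Object]`:

```typescript
// Current pattern (WRONG)
console.error(`[Label] ${errorBody}`); // Output: [Label] [object Object]
console.log('[Label] ' + data); // Output: [Label] [object Object]
```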

**Affected Files (25 matches found):**

- `src/lib/server/extraction.ts` - 12 occurrences
- `src/lib/server/parser.ts` - 4 occurrences
- `src/lib/server/queue/QueueProcessor.ts` - 3 occurrences
- `src/lib/server/notifications/PushNotificationService.ts` - 1 occurrence
- `src/lib/server/api/errorHandler.ts` - 1 occurrence
- `src/lib/server/llm.ts` - 2 occurrences
- `src/lib/server/scheduler.ts` - 1 occurrence
- Others: QueueManager.ts, tandoor.ts

**Solution Patterns:**

**1. Error Objects:**

```typescript
// GOOD - Extract relevant properties
console.error('[Label]', error.message, error.stack);
console.error('[Label] Error:', {
	message: error.message,
	stack: error.stack,
	name: error.name
});
```

**2. Complex Objects:**

```typescript
// GOOD - JSON.stringify with formatting
console.log('[Label] Data:', JSON.stringify(data, null, 2));

// GOOD - Specific properties
console.log('[Label] Response:', {
	status: response.status,
	statusText: response.statusText,
	body: responseBody
});
```

**3. Utility Function:**
Create `src/lib/server/utils/logger.ts`:

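```typescript
export function serializeError(error: unknown): string {
	if (error instanceof Error) {
		return JSON.stringify(
			{
				// Spread first so custom fields (e.g. `code`) cannot overwrite the core fields
				...error,
				name: error.name,
				message: error.message,
				stack: error.stack
			},
			null,
			2
		);
	}
	// JSON.stringify returns undefined for undefined, functions, and symbols
	return JSON.stringify(error, null, 2) ?? String(error);
}

console.error('[Label]', serializeError(error));
```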
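
`JSON.stringify` throws on circular references, which can occur in request, response, or browser objects. A hedged sketch of a replacer-based guard follows; `safeStringify` is a hypothetical name, and the marker is also emitted for objects that merely appear twice, which is a simplification.

```typescript
// Stringifies any value without throwing on cycles or BigInt values.
function safeStringify(value: unknown): string {
	const seen = new WeakSet<object>();
	const json = JSON.stringify(
		value,
		(_key, val) => {
			if (typeof val === 'bigint') {
				return val.toString(); // JSON has no BigInt representation
			}
			if (typeof val === 'object' && val !== null) {
				if (seen.has(val)) {
					return '[Circular or repeated]';
				}
				seen.add(val);
			}
			return val;
		},
		2
	);
	// JSON.stringify yields undefined for undefined, functions, and symbols.
	return json ?? String(value);
}
```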

**Testing Impact:**

- Logs are visible in Docker deployments (stdout/stderr)
- JSON format easier for log aggregation tools
- Stack traces preserved for debugging
- Human-readable in console

---

### [Planner] Research Notes - RECIPE-0004 Iteration 1 (2026-02-17)

**Task:** Fix TypeScript type error - NodeJS.Timer should be NodeJS.Timeout in scheduler.ts

#### Node.js Timer Types Research

**Research Date:** 2026-02-17
**Source:** Node.js v25.6.1 Official Documentation (https://nodejs.org/docs/latest/api/timers.html)

**Problem Analysis:**
TypeScript compile error in `src/lib/server/scheduler.ts:180`:

```
Argument of type 'Timer' is not assignable to parameter of type 'Timeout'
Type 'Timer' is missing the following properties from type 'Timeout':
  close, _onTimeout, [Symbol.dispose]
```

**Root Cause:**
The `SchedulerState` interface incorrectly uses `NodeJS.Timer` type for `intervalId`, but `setInterval()` returns `NodeJS.Timeout` and `clearInterval()` expects `NodeJS.Timeout` parameter.

**Official Node.js API Documentation:**

**Class: Timeout**

- Returned by `setInterval()` and `setTimeout()`
- Can be passed to `clearInterval()` or `clearTimeout()`
- Has methods: `ref()`, `unref()`, `hasRef()`, `close()`, `refresh()`, `[Symbol.toPrimitive]()`, `[Symbol.dispose]()`
- TypeScript type: `NodeJS.Timeout`

**API Signatures:**

```typescript
// setInterval returns Timeout
function setInterval(callback: Function, delay?: number, ...args: any[]): NodeJS.Timeout;

// clearInterval expects Timeout
function clearInterval(timeout: NodeJS.Timeout | string | number): void;
```

**NodeJS.Timer Type:**

- Deprecated/incorrect type for timer return values
- Missing required properties: `close`, `_onTimeout`, `[Symbol.dispose]`
- Should NOT be used for `setInterval()`/`setTimeout()` return types
- Causes TypeScript strict mode errors when passed to `clearInterval()`

**Codebase Analysis:**

```
grep -r "NodeJS.Timer" src/
  src/lib/server/scheduler.ts:13    intervalId: NodeJS.Timer | null;
  src/tests/fixtures.ts:151         let timers: NodeJS.Timer[] = [];

grep -r "NodeJS.Timeout" src/
  src/routes/api/queue/stream/+server.ts:54    let keepAliveInterval: NodeJS.Timeout | null = null;
```

**Findings:**

1. **Incorrect usage (2 occurrences):**
   - `src/lib/server/scheduler.ts:13` — SchedulerState interface
   - `src/tests/fixtures.ts:151` — Timer array in test helper

2. **Correct usage (1 occurrence):**
   - `src/routes/api/queue/stream/+server.ts:54` — keepAliveInterval type

**Solution:**
Change all `NodeJS.Timer` to `NodeJS.Timeout` to align with Node.js official API contracts and TypeScript type definitions.
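
An alternative that avoids naming either type is to derive the handle type from `setInterval` itself. A minimal sketch, assuming `@types/node` is in scope; `IntervalHandle`, `startScheduler`, and `stopScheduler` are illustrative names, and the real `SchedulerState` may have fields not shown here.

```typescript
// Resolves to NodeJS.Timeout when the Node.js typings are in use.
type IntervalHandle = ReturnType<typeof setInterval>;

interface SchedulerState {
	intervalId: IntervalHandle | null;
}

// Starts the interval once; a second call while running is a no-op.
function startScheduler(state: SchedulerState, tick: () => void, intervalMs: number): void {
	if (state.intervalId !== null) {
		return;
	}
	state.intervalId = setInterval(tick, intervalMs);
}

// Clears the interval and resets the state so it can be started again.
function stopScheduler(state: SchedulerState): void {
	if (state.intervalId !== null) {
		clearInterval(state.intervalId);
		state.intervalId = null;
	}
}
```

Deriving the type this way keeps the declaration in step with whatever `setInterval` returns in the active typings.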

**Files to Modify:**

1. `src/lib/server/scheduler.ts:13` — Type in SchedulerState interface
2. `src/tests/fixtures.ts:151` — Type in createTimerSpy helper

**Impact:**

- Type-only change, no runtime behavior modification
- Fixes TypeScript strict mode compile error
- Aligns codebase with Node.js standard types
- Existing tests (260 total) already provide 100% coverage

**References:**

- Node.js Timers Documentation: https://nodejs.org/docs/latest/api/timers.html#class-timeout
- TypeScript @types/node package: Official Node.js type definitions
- Related Error: RECIPE-0004 iteration 0 review_report.yaml

---

**Document Version:** 1.7
**Last Updated by:** Planner Agent (RECIPE-0005 Iteration 0)
**Next Update:** Developer Agent

---

### [Planner] Research Notes - RECIPE-0005 (2026-02-17)

**Task:** Fix Playwright Docker dependencies and create LMStudio integration for E2E testing

#### Playwright Alpine Linux Docker Integration - RECIPE-0005

**Research Date:** 2026-02-17
**Source:** FINDINGS.md (RECIPE-0003), Dockerfile analysis, browser.ts, Playwright documentation

**Problem Analysis:**

- Container fails with: "Executable doesn't exist at /root/.cache/ms-playwright/chromium_headless_shell-1208/"
- Alpine Linux uses musl libc, while Playwright's bundled browsers require glibc
- Current Dockerfile installs system chromium via `apk add chromium` but browser.ts doesn't specify executable path
- Playwright API defaults to searching for its own bundled browser binary (not present)

**Solution (Already Researched in RECIPE-0003):**
Configure Playwright to use system chromium installed by Alpine APK:

```typescript
// src/lib/server/browser.ts - initializeBrowser()
browser = await chromium.launch({
	executablePath: '/usr/bin/chromium-browser', // System chromium path
	headless: true,
	args: [
		'--disable-blink-features=AutomationControlled',
		'--disable-dev-shm-usage',
		'--no-sandbox',
		'--disable-setuid-sandbox',
		'--disable-gpu'
	]
});
```

**Files to Modify:**

- `src/lib/server/browser.ts` - Add `executablePath: '/usr/bin/chromium-browser'` to launch options

**No Changes Needed:**

- Dockerfile already has `chromium` and fonts installed correctly
- No need for `npx playwright install` (would fail on Alpine anyway)
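
Hard-coding the Alpine path would break local runs on machines without a system Chromium. One possible arrangement, sketched below under stated assumptions, resolves the path at startup: the `CHROMIUM_PATH` variable name, the second candidate path, and the `resolveChromiumPath` helper are all hypothetical. Returning `undefined` leaves `executablePath` unset so Playwright uses its bundled browser.

```typescript
import { existsSync } from 'node:fs';

// Candidate locations for a system Chromium; the second entry is an assumption.
const CANDIDATE_PATHS = ['/usr/bin/chromium-browser', '/usr/bin/chromium'];

// Picks an explicit override first, then the first candidate that exists.
function resolveChromiumPath(
	env: Record<string, string | undefined> = process.env,
	exists: (path: string) => boolean = existsSync
): string | undefined {
	const override = env.CHROMIUM_PATH; // hypothetical variable name
	if (override) {
		return override;
	}
	return CANDIDATE_PATHS.find((path) => exists(path));
}
```

The result could be passed as `executablePath: resolveChromiumPath()` in the `chromium.launch()` options shown above.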

---

#### LMStudio Docker Networking - RECIPE-0005

**Research Date:** 2026-02-17
**Source:** Docker networking documentation, LMStudio API patterns, OpenAI-compatible endpoints

**Problem:**

- LMStudio runs on host at `http://localhost:1234`
- Docker containers have isolated networking - `localhost` inside container != host `localhost`
- Container needs to access host services

**Docker Networking Solutions:**

**Option A - network_mode: host (Recommended for LMStudio):**

```yaml
services:
  app:
    network_mode: host
```

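- Container shares host network stack
- `localhost:1234` inside container = host's `localhost:1234`
- **Trade-off**: Loses container network isolation, port mapping ignored; host networking is a Linux feature, and on Docker Desktop (Mac/Windows) it may be unavailable or need to be enabled explicitly
- **Best for**: Local development/testing with host services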

**Option B - extra_hosts (Alternative):**

```yaml
services:
  app:
    extra_hosts:
      - 'host.docker.internal:host-gateway'
    environment:
      - OPENAI_BASE_URL=http://host.docker.internal:1234/v1
```

- Works on Docker Desktop (Mac/Windows) and Linux with Docker 20.10+
- Maintains container network isolation
- **Trade-off**: Requires changing OPENAI_BASE_URL from localhost

**Chosen Approach:** network_mode: host

- **Rationale**: Simplest for local LMStudio integration, no URL changes needed
- Tool mandate specifies "http://localhost:1234" must work
- Matches requirement for local development/testing setup

---

#### LMStudio + Gemma 3 Configuration - RECIPE-0005

**Research Date:** 2026-02-17
**Source:** .env.example, llm.ts, prompt.yaml tool mandates

**Current Configuration:**

```env
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_API_KEY=your-api-key-here
LLM_MODEL=google/gemma-3-4b
```

**LMStudio API Compatibility:**

- LMStudio provides OpenAI-compatible endpoint at `/v1`
- Uses same API client: openai@^4.20.0
- Model identifiers match LMStudio's loaded model names
- API key can be any non-empty value (LMStudio doesn't validate in local mode)

**Model Availability Check:**
From prior research (RECIPE-0001), `llm.ts` already implements:

- `checkModelAvailability(model: string)` - verifies model loaded via `client.models.list()`
- Returns available models if specified model not found
- User must manually load model in LMStudio UI before running container

**No Code Changes Needed:**

- LLM integration already OpenAI-compatible
- Model check already implemented
- Only need environment variable configuration

---

#### Docker Compose Complete Configuration - RECIPE-0005

**Research Date:** 2026-02-17
**Source:** docker-compose.yml, .env.example, queueConfig, tandoorConfig

**Required Changes:**

1. Add `network_mode: host` for LMStudio access
2. Update LLM_MODEL default to `google/gemma-3-4b`
3. Update .env.example defaults to match tool mandates

**Current docker-compose.yml:**

- Already has all environment variables configured
- Already has `./secrets:/app/secrets` volume mount
- Already has healthcheck configured
- Already has `seccomp=unconfined` for Chromium

**Port Mapping with network_mode: host:**

- `ports:` section ignored when using `network_mode: host`
- App will bind directly to host port 3000
- No conflicts expected (LMStudio uses 1234, app uses 3000, Tandoor external)

---

#### End-to-End Testing Strategy - RECIPE-0005

**Research Date:** 2026-02-17
**Source:** Test URL from prompt, queue system architecture

**Test URL:** https://www.instagram.com/reel/DP6oN7JCEo8/?utm_source=ig_web_button_share_sheet

**Testing Workflow:**

1. Build Docker image: `docker compose build`
2. Start container: `docker compose up`
3. Verify LMStudio loaded Gemma 3 model: `http://localhost:1234/v1/models`
4. Verify app health: `http://localhost:3000/api/health`
5. Verify LLM health: `http://localhost:3000/api/llm-health`
6. Enqueue test URL: `POST http://localhost:3000/api/queue`
7. Monitor progress: `GET http://localhost:3000/api/queue/stream`
8. Verify extraction succeeds with Gemma 3
9. Check Tandoor upload (if configured)

**Success Indicators:**

- Chromium launches without "Executable doesn't exist" error
- LLM health check passes
- Extraction phase completes successfully
- Recipe parsing succeeds with Gemma 3
- All existing tests pass (`npm test`)
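
Steps 3 to 5 of the workflow lend themselves to a small smoke-check script. The sketch below is one way to do it, using the URLs listed above; the `runSmokeChecks` name and the `FetchLike` type are hypothetical, and the fetch implementation is injectable so the logic can run without the services being up.

```typescript
// Minimal subset of the fetch API that the checks rely on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

const SMOKE_CHECKS = [
	{ name: 'LMStudio models', url: 'http://localhost:1234/v1/models' },
	{ name: 'App health', url: 'http://localhost:3000/api/health' },
	{ name: 'LLM health', url: 'http://localhost:3000/api/llm-health' }
];

// Runs the checks in order and returns one message per failure.
async function runSmokeChecks(fetchImpl: FetchLike = (url) => fetch(url)): Promise<string[]> {
	const failures: string[] = [];
	for (const check of SMOKE_CHECKS) {
		try {
			const response = await fetchImpl(check.url);
			if (!response.ok) {
				failures.push(`${check.name}: HTTP ${response.status}`);
			}
		} catch (error) {
			// Network-level failures (connection refused, DNS) end up here.
			const reason = error instanceof Error ? error.message : String(error);
			failures.push(`${check.name}: ${reason}`);
		}
	}
	return failures;
}
```

An empty array means all three endpoints responded with a 2xx status; the queue steps can then be run manually.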

---

#### Files Summary - RECIPE-0005

**Modified Files:**

1. `src/lib/server/browser.ts` - Add executablePath for Alpine chromium
2. `docker-compose.yml` - Add network_mode: host, update LLM_MODEL default
3. `.env.example` - Update LLM_MODEL default to google/gemma-3-4b

**No Changes:**

- `Dockerfile` - Already correct (chromium + fonts installed)
- `src/lib/server/llm.ts` - Already OpenAI-compatible
- `src/lib/server/queue/config.ts` - Already reads env vars correctly
- Test files - All existing tests should pass

**Testing:**

- Manual E2E test with provided Instagram URL
- Verify in Docker container with LMStudio
- All unit tests must pass

**Dependencies:**

- User must have LMStudio running on host at localhost:1234
- User must manually load google/gemma-3-4b model in LMStudio
- Secrets volume must exist for Instagram auth (optional)

---

### [Planner] Research Notes - RECIPE-0006 Iteration 1 (2026-02-17)

**Task:** Transform E2E test to unit test with mocked fixtures and fix extraction logic iteratively

#### Problem Analysis

**Research Date:** 2026-02-17T10:00:00.000Z
**Source:** review_report.yaml, extraction.ts analysis, test fixtures

**Iteration 0 Failure:**

- E2E test created but never executed during development
- User manually ran test and it FAILED
- Current output: `"16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe..."`
- Expected output: Full recipe starting with `"La cacio e pepe infallibile di Luciano Monosilio 🍝"`

**Root Cause Analysis:**

1. **DOM selectors failing**: Lines 331-341 of extraction.ts try several selectors, but none match Instagram's current structure
2. **Fallback to og:description**: Lines 348-357 extract from `<meta property="og:description">`, whose content starts with a metadata prefix
3. **Regex cleanup insufficient**: Line 356 tries to strip the prefix with the regex `^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+`, but the prefix is not removed cleanly

**Current extractFromDOM() Flow:**

```
1. Try selectors: article h1, article span[dir="auto"], article div[role="button"] + span, article span:not([aria-label])
   → All fail (return null or < 100 chars)
2. Fallback to og:description meta tag
   → Returns: "16K likes, 325 comments - username on date: caption..."
3. Apply metadata cleanup regex
   → Regex doesn't match properly (or matches but leaves quotes)
4. Pass to cleanText()
   → cleanText() removes hashtags but metadata prefix remains
```

---

#### Vitest Unit Testing for Playwright Mocking

**Research Date:** 2026-02-17T10:00:00.000Z
**Source:** TESTING.md, existing tests (queue-processor.spec.ts, scheduler.spec.ts)

**Mocking Strategy:**
From TESTING.md and existing test patterns, Vitest provides module-level mocking:

```typescript
// Mock entire module BEFORE imports
vi.mock('$lib/server/extraction', () => ({
	extractTextAndThumbnail: vi.fn().mockResolvedValue({
		bodyText: 'Mocked text',
		thumbnail: 'https://example.com/thumb.jpg'
	})
}));
```

**For Unit Testing extractFromDOM():**

- Cannot mock the entire `extraction.ts` module (we're testing functions inside it)
- Need to test internal functions directly (extractFromDOM, cleanText are not exported)
- Options:
  1. **Export functions for testing** (add `export` to extractFromDOM and cleanText)
  2. **Mock Playwright Page.evaluate()** (mock the browser automation layer)
  3. **Integration test with mocked browser context**

**Chosen Approach: Export Internal Functions**

- Cleanest separation of concerns
- Allows direct unit testing without browser overhead
- Follows existing pattern (extractTextAndThumbnail is already exported)
- Test Runtime: < 10ms (vs 30s for E2E test)

**Test Structure:**

```typescript
// Unit test with fixtures
import { describe, it, expect, vi } from 'vitest';
import { extractFromDOM, cleanText } from '$lib/server/extraction';

describe('Instagram Caption Extraction Unit Tests', () => {
	it('should clean metadata prefix from og:description', async () => {
		const input =
			'16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe...';
		const expected = 'La cacio e pepe infallibile di Luciano Monosilio...';

		// Create mock page that returns problematic og:description
		const mockPage = {
			evaluate: vi.fn().mockResolvedValue(input)
		};

		const result = await extractFromDOM(mockPage as any);
		expect(result.bodyText).toBe(expected);
	});
});
```

---

#### Metadata Prefix Regex Analysis

**Research Date:** 2026-02-17T10:00:00.000Z
**Source:** extraction.ts line 356, test fixtures

**Current Regex (Line 356):**

```typescript
const cleanedContent = content.replace(
	/^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+/,
	''
);
```

**Test Against Actual Input:**

```
Input:    '16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "La cacio e pepe...'
Pattern:  '^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s+'
          ^----- Should match "16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "
```

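**Issue:** The pattern matches the metadata prefix but leaves the opening quote `"` that follows the colon.

**Problems Identified:**

1. Pattern doesn't account for the quote after the colon (the matching closing quote at the end of the caption is also left in place)
2. Date pattern `[^:]+` stops at the first colon; this works for "October 17, 2025" but would break if the date ever contained a time such as "10:30"
3. `\s+` requires at least one whitespace character after the colon, so a caption that follows the colon directly would not match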

**Improved Regex:**

```typescript
// Match: "X likes, Y comments - username on date: " (with optional quote)
/^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s*["']?/;
```

**Breakdown:**

- `^\d+K?` - Matches "16K" or "16" (K is optional)
- `\s+likes,\s+\d+\s+comments` - Matches " likes, 325 comments"
- `\s+-\s+[\w.]+` - Matches " - chef.antonio.la.cava" (alphanumeric + dots)
- `\s+on\s+[^:]+:` - Matches " on October 17, 2025:" (anything before colon)
- `\s*` - Optional whitespace after colon
- `["']?` - Optional quote character (single or double)

**This should properly strip:**

- `"16K likes, 325 comments - chef.antonio.la.cava on October 17, 2025: "` → (empty)
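
The improved pattern can be checked in isolation before touching `extractFromDOM()`. A minimal sketch: the `stripOgDescriptionPrefix` helper and the trailing-quote cleanup are additions for illustration and are not part of the current extraction.ts.

```typescript
// Improved prefix pattern from the analysis above.
const OG_PREFIX = /^\d+K?\s+likes,\s+\d+\s+comments\s+-\s+[\w.]+\s+on\s+[^:]+:\s*["']?/;
// Closing quote that og:description places after the caption.
const TRAILING_QUOTE = /["']\s*$/;

// Removes the metadata prefix; the trailing quote is only touched when the prefix matched.
function stripOgDescriptionPrefix(content: string): string {
	const withoutPrefix = content.replace(OG_PREFIX, '');
	if (withoutPrefix === content) {
		return content;
	}
	return withoutPrefix.replace(TRAILING_QUOTE, '');
}
```

Text without the metadata prefix passes through unchanged, so the helper is safe to apply to captions that came from DOM selectors as well.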

---

#### Files to Modify - RECIPE-0006 Iteration 1

**Primary Changes:**

1. **src/lib/server/extraction.ts**
   - Export `extractFromDOM` for unit testing
   - Export `cleanText` for unit testing
   - Fix metadata prefix regex in extractFromDOM() (line 356)

2. **src/tests/instagram-caption-extraction.unit.spec.ts** (NEW)
   - Replace E2E test with unit test
   - Mock page.evaluate() to return test fixtures
   - Test both problematic and expected outputs
   - Runtime < 100ms

3. **src/tests/instagram-caption-extraction.e2e.spec.ts** (MODIFY)
   - Mark as `.skip` or remove (replaced by unit test)
   - Keep file for future real-world validation (optional)

**Dependencies:**

- Vitest mocking (vi.fn(), mockResolvedValue)
- Test fixtures from context_compact.yaml
- No external libraries needed

**Parallelization:**

- All changes are independent
- Unit test can be written in parallel with extraction.ts fix
- Test validates fix iteratively

---

**Document Version:** 1.9
**Last Updated by:** Planner Agent (RECIPE-0008 Iteration 0)
**Next Update:** Developer Agent

---

### [Planner] Research Notes - RECIPE-0008 (2026-02-17)

**Task:** Resolve npm package vulnerabilities and fix TypeScript strict mode errors

#### TypeScript Strict Mode Status Analysis

**Research Date:** 2026-02-17T22:15:00.000Z
**Source:** tsconfig.json, get_errors output, extraction.ts analysis

**Current Configuration:**

```json
// tsconfig.json line 11
"strict": true
```

**Status:** ✅ TypeScript strict mode is ALREADY ENABLED

The task description says "Enable TypeScript strict mode (if not already enabled)" - it is already enabled. The real issue is fixing the compilation errors that exist.

**Current TypeScript Errors:** 7 errors in `src/lib/server/extraction.ts`

**Errors 1-5: bestCandidate Type Narrowing (Lines 632, 636, 641, 643)**

```
Property 'score' does not exist on type 'never'.
Property 'text' does not exist on type 'never'.
Property 'innerHTML' does not exist on type 'never'.
```

**Root Cause Analysis:**

```typescript
// Line 552-558: Type definition
let bestCandidate: {
	element: Element;
	text: string;
	score: number;
	innerHTML: string;
	brCount: number;
} | null = null;

// Line 624-630: Null guard
if (!bestCandidate) {
	return {
		success: false,
		error: 'No suitable caption span found',
		text: ''
	};
}

// Line 632: TypeScript cannot infer bestCandidate is non-null after guard
console.log(`[Extractor] Final caption candidate: score=${bestCandidate.score}, ...`);
// Error: Property 'score' does not exist on type 'never'
```

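**Why TypeScript Infers 'never':**

- After `let bestCandidate: ... | null = null`, control flow analysis narrows the variable to `null` based on the initial assignment
- Assignments made inside a callback (for example a `forEach` over candidate spans) are not tracked, so the narrowing to `null` is still in effect at the guard; this is the likely situation here, although the selection loop is not shown above
- The guard `if (!bestCandidate) return` then removes `null`, which leaves `never` for the remaining scope
- This is a known limitation of TypeScript's control flow analysis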

**Previous Attempt (RECIPE-0007 Iteration 1):**
Attempted fix using type assertion:

```typescript
const candidate = bestCandidate as NonNullable<typeof bestCandidate>;
```

**Result:** FAILED - TypeScript still inferred 'candidate' as type 'never'

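**Correct Solution:**
Extract the inline type to a named type and, after the guard, assign the value to a variable with an explicit type annotation (`never` is assignable to any type, so the annotation restores `CaptionCandidate`):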

```typescript
// Define type at module level
type CaptionCandidate = {
	element: Element;
	text: string;
	score: number;
	innerHTML: string;
	brCount: number;
};

// In function
let bestCandidate: CaptionCandidate | null = null;

// After null guard
if (!bestCandidate) {
	return { success: false, error: 'No suitable caption span found', text: '' };
}

// Explicit annotation: 'candidate' has the declared type CaptionCandidate
const candidate: CaptionCandidate = bestCandidate;
// Use 'candidate' instead of 'bestCandidate' for remaining code
```

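**Alternative Solution (simpler, but unreliable here):**
The non-null assertion operator is the usual shortcut after a guard:

```typescript
console.log(`[Extractor] Final caption candidate: score=${bestCandidate!.score}, ...`);
```

If the variable has already narrowed to `never`, `!` is unlikely to help: it only strips `null` and `undefined`, and `NonNullable<never>` is still `never`, which matches the failed RECIPE-0007 attempt.

**Recommended:** Use explicit typing, which also avoids `!` operator proliferation (better code clarity).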
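
The `never` inference can be reproduced in isolation. One common trigger, assumed here because the selection loop is not shown, is a variable that is initialized with `null` and only assigned inside a callback. The sketch below uses a hypothetical `pickBest` function and shows a second workaround: asserting the type on the initializer so the variable keeps its full union type.

```typescript
type Candidate = { text: string; score: number };

function pickBest(items: Candidate[]): Candidate | null {
	// `null as Candidate | null` keeps the union type; a plain `= null`
	// initializer would narrow the variable to `null` for the code below.
	let best = null as Candidate | null;

	// Assignments inside this callback are not tracked by control flow analysis.
	items.forEach((item) => {
		if (best === null || item.score > best.score) {
			best = item;
		}
	});

	if (!best) {
		return null;
	}
	// Here `best` has type Candidate rather than `never`.
	return best;
}
```

Replacing the `forEach` with a `for...of` loop is another option, since assignments in a plain loop are visible to the analysis.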

---

**Error 6: extractCaptionFromGraphQL Parameter Type Mismatch (Line 1224)**

```
Argument of type 'string | null' is not assignable to parameter of type 'string | undefined'.
Type 'null' is not assignable to type 'string | undefined'.
```

**Context:**

```typescript
// Line 1209: extractShortcode returns string | null
const expectedShortcode = extractShortcode(url);

// Line 1224: Pass to function expecting string | undefined
const captionData = extractCaptionFromGraphQL(json, expectedShortcode);

// Line 1084: Function signature
function extractCaptionFromGraphQL(data: any, expectedShortcode?: string): string | null;
```

**Solution:**
Convert `null` to `undefined` using nullish coalescing:

```typescript
const captionData = extractCaptionFromGraphQL(json, expectedShortcode ?? undefined);
```

**Why `null` vs `undefined` Matters:**

- Optional parameters in TypeScript are `T | undefined`, not `T | null`
- Function signature uses `expectedShortcode?: string` which expands to `expectedShortcode: string | undefined`
- `extractShortcode()` returns `string | null`, creating a type mismatch
- Converting `null → undefined` aligns with TypeScript's optional parameter convention

---

**Error 7: Invalid ExtractionMethod Literal 'graphql-intercept' (Line 1273)**

```
Type '"graphql-intercept"' is not assignable to type 'ExtractionMethod | undefined'.
```

**Context:**

```typescript
// Line 12: ExtractionMethod union type
export type ExtractionMethod =
	| 'embedded-json'
	| 'internal-state'
	| 'html-section'
	| 'dom-selector'
	| 'graphql-api'
	| 'legacy';

// Line 1273: Uses undeclared literal
onProgress?.({
	type: 'complete',
	message: 'Extraction completed via GraphQL interception',
	method: 'graphql-intercept', // ❌ Not in union type
	timestamp: new Date().toISOString()
});
```

**Solution:**
Add `'graphql-intercept'` to ExtractionMethod union and getMethodDisplayName mapping:

```typescript
// Line 12: Add to union
export type ExtractionMethod =
	| 'embedded-json'
	| 'internal-state'
	| 'html-section'
	| 'dom-selector'
	| 'graphql-api'
	| 'graphql-intercept'
	| 'legacy';

// Line 117-125: Add to display name mapping
function getMethodDisplayName(method: ExtractionMethod): string {
	const names: Record<ExtractionMethod, string> = {
		'embedded-json': 'Embedded JSON',
		'internal-state': 'Internal State',
		'html-section': 'HTML Section',
		'dom-selector': 'DOM Selector',
		'graphql-api': 'GraphQL API',
		'graphql-intercept': 'GraphQL Intercept', // Add this line
		legacy: 'Legacy Parser'
	};
	return names[method];
}
```

**Why This Method Exists:**

- Line 1217-1233: Sets up GraphQL response interception
- Line 1268-1276: Uses intercepted caption if available
- This is a legitimate extraction strategy separate from 'graphql-api'
- Should be properly typed in the union
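
The `Record<ExtractionMethod, string>` annotation is what keeps the union and the display-name table in sync: leaving out a member is a compile-time error. A minimal sketch of that idea follows. The union is copied from the fix above; `METHOD_NAMES` and `isExtractionMethod` are hypothetical names used only for illustration.

```typescript
type ExtractionMethod =
	| 'embedded-json'
	| 'internal-state'
	| 'html-section'
	| 'dom-selector'
	| 'graphql-api'
	| 'graphql-intercept'
	| 'legacy';

// Record<ExtractionMethod, string> requires exactly one entry per union member,
// so removing the 'graphql-intercept' line below would fail type checking.
const METHOD_NAMES: Record<ExtractionMethod, string> = {
	'embedded-json': 'Embedded JSON',
	'internal-state': 'Internal State',
	'html-section': 'HTML Section',
	'dom-selector': 'DOM Selector',
	'graphql-api': 'GraphQL API',
	'graphql-intercept': 'GraphQL Intercept',
	legacy: 'Legacy Parser'
};

// Hypothetical runtime guard for values that arrive as plain strings (e.g. from JSON).
function isExtractionMethod(value: string): value is ExtractionMethod {
	return Object.prototype.hasOwnProperty.call(METHOD_NAMES, value);
}
```

The guard reuses the same table, so a new union member only has to be registered in one place.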

---

#### npm Package Vulnerabilities Analysis

**Research Date:** 2026-02-17T22:15:00.000Z
**Source:** package.json dependencies analysis

**Current Dependencies:**

**Production (10 dependencies):**

- `@types/uuid@^10.0.0` - Type definitions (no vulnerabilities expected)
- `date-fns@^4.1.0` - Date utilities (latest major version)
- `openai@^4.20.0` - OpenAI SDK (recent version)
- `playwright@^1.56.1` - Browser automation (recent version)
- `playwright-extra@^4.3.6` - Playwright extensions
- `puppeteer-extra-plugin-stealth@^2.11.2` - Stealth plugin
- `sharp@^0.34.5` - Image processing (latest)
- `uuid@^13.0.0` - UUID generation (latest major)
- `web-push@^3.6.7` - Push notifications (latest)
- `zod@^3.23.0` - Schema validation (latest)

**Development (24+ dependencies):**

- All framework and tooling dependencies are recent versions
- SvelteKit 2.x, Svelte 5.x, Vite 6.x, Vitest 4.x - all latest major versions
- TypeScript 5.9.3, ESLint 9.x, Prettier 3.x - all current

**Vulnerability Research Strategy:**

1. Run `npm audit` to identify current vulnerabilities
2. Analyze severity levels (critical, high, moderate, low)
3. Check for automated fixes: `npm audit fix`
4. For breaking changes: `npm audit fix --force` (requires testing)
5. Manual updates for unfixable vulnerabilities
6. Verify all tests pass after fixes

**Expected Vulnerabilities:**
Based on dependency age analysis:

- `playwright-extra@^4.3.6` - Last updated 2024, may have known issues
- `puppeteer-extra-plugin-stealth@^2.11.2` - Depends on older puppeteer versions
- Most other dependencies are recent and actively maintained

**No Direct Audit Results Available:**

- Cannot run `npm audit` during planning phase (tool restrictions)
- Developer agent must run audit as first step
- Plan assumes vulnerabilities exist and need fixing

**Verification Steps:**

1. `npm audit` - Identify vulnerabilities
2. `npm audit fix` - Apply automatic fixes
3. `npm test` - Verify tests pass
4. `npm run build` - Verify build succeeds
5. `npx tsc --noEmit` - Verify TypeScript compilation with no errors
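
Since the plan hinges on severity levels, the Developer agent may want to reduce the audit output to a single number. The sketch below assumes the report comes from `npm audit --json` and that it exposes per-severity counts under `metadata.vulnerabilities` (the shape recent npm versions emit; verify against the installed npm). `countAtOrAbove` is a hypothetical helper, not part of the project.

```typescript
// Assumed subset of the `npm audit --json` report: per-severity counts.
interface AuditReport {
	metadata: {
		vulnerabilities: Record<string, number>;
	};
}

type Severity = 'low' | 'moderate' | 'high' | 'critical';
const SEVERITY_ORDER: Severity[] = ['low', 'moderate', 'high', 'critical'];

// Sums the vulnerability counts at or above the given severity threshold.
function countAtOrAbove(reportJson: string, threshold: Severity): number {
	const report = JSON.parse(reportJson) as AuditReport;
	const counts = report.metadata.vulnerabilities;
	return SEVERITY_ORDER.slice(SEVERITY_ORDER.indexOf(threshold)).reduce(
		(sum, level) => sum + (counts[level] ?? 0),
		0
	);
}
```

A result of `0` for `'low'` would correspond to the "0 vulnerabilities" goal. For a simple pass/fail gate, `npm audit --audit-level=high` covers the same ground without any parsing.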

**No Manual Package Updates Needed:**

- Wait for `npm audit` results to guide specific version updates
- Avoid upgrading packages pre-emptively, before the audit shows a need
- Follow semantic versioning rules (^ allows minor/patch updates)

---

### [Planner] Research Notes - RECIPE-0008 Iteration 1 (2026-02-18)

**Task:** Fix 9 remaining TypeScript strict mode errors after iteration 0 completion

#### TypeScript Strict Mode Analysis

**Research Date:** 2026-02-18
**Source:** Review report analysis, type definition inspection, codebase pattern comparison
**Context:** Iteration 0 fixed 3 errors in extraction.ts. TASK-5 verification revealed 9 additional errors.

**Error Distribution:**

1. [src/routes/api/tandoor/+server.ts](src/routes/api/tandoor/+server.ts) — 1 error
2. [src/lib/server/queue/QueueProcessor.ts](src/lib/server/queue/QueueProcessor.ts) — 1 error
3. [src/lib/server/notifications/PushNotificationService.ts](src/lib/server/notifications/PushNotificationService.ts) — 1 error
4. [src/lib/client/PushNotificationManager.ts](src/lib/client/PushNotificationManager.ts) — 1 error
5. [src/tests/queue-processor.spec.ts](src/tests/queue-processor.spec.ts) — 5 errors

**Research Findings:**

**1. SvelteKit API Route Type Pattern**
**File:** [src/routes/api/tandoor/+server.ts](src/routes/api/tandoor/+server.ts#L5)
**Issue:** Missing RequestHandler type annotation on POST function
**Pattern Analysis:**

- Searched all API routes in [src/routes/api/](src/routes/api/)
- Found 10+ routes using pattern: `export const POST: RequestHandler = async ({ request }) => {...}`
- Type import: `import type { RequestHandler } from './$types'`
- [src/routes/api/tandoor/+server.ts](src/routes/api/tandoor/+server.ts) is ONLY route missing this pattern
- Using function export `export async function POST({ request })` causes implicit any in strict mode

**Solution:** Convert to const export with RequestHandler type annotation
**References:**

- [src/routes/api/queue/+server.ts](src/routes/api/queue/+server.ts#L14-L25) — Reference implementation
- [src/routes/api/notifications/subscribe/+server.ts](src/routes/api/notifications/subscribe/+server.ts#L10-L29) — Another example

**2. QueueItem Error Object Structure**
**File:** [src/lib/server/queue/QueueProcessor.ts](src/lib/server/queue/QueueProcessor.ts#L425)
**Issue:** Treating error object as string
**Type Definition:** [src/lib/server/queue/types.ts](src/lib/server/queue/types.ts#L133-L140)

```typescript
error?: {
  phase: ProcessingPhase;
  message: string;
  recoverable: boolean;
  timestamp: string;
}
```

**Current Code (incorrect):**

```typescript
// Line 425 in sendPushNotification method
const errorMessage = item.error || 'Processing failed';
```

**Problem:** `item.error` is an object, not a string. The code should access `item.error.message`.

**Correct Implementation:**

```typescript
const errorMessage = item.error?.message || 'Processing failed';
```

**Context Analysis:**

- [src/lib/server/queue/QueueManager.ts](src/lib/server/queue/QueueManager.ts#L174) correctly sets error object with all 4 properties
- Error structure used in 3 places: QueueManager.updateStatus, QueueProcessor error handler, frontend display
- Frontend ([src/routes/components/QueueItemCard.svelte](src/routes/components/QueueItemCard.svelte)) uses `item.error?.message` correctly (fixed in RECIPE-0001)
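
Because the same error structure is read in three places, one option is to centralize the access in a small helper. This is only a sketch: `QueueItemLike` and `getErrorMessage` are hypothetical names, and the `ProcessingPhase` literals are assumed from the three-phase pipeline rather than copied from `types.ts`.

```typescript
type ProcessingPhase = 'extraction' | 'parsing' | 'uploading'; // assumed literal names

interface QueueItemError {
	phase: ProcessingPhase;
	message: string;
	recoverable: boolean;
	timestamp: string;
}

// Minimal stand-in for QueueItem: only the fields this example needs.
interface QueueItemLike {
	id: string;
	error?: QueueItemError;
}

function getErrorMessage(item: QueueItemLike, fallback = 'Processing failed'): string {
	// `||` (not `??`) so that an empty message also falls back, as in the fix above.
	return item.error?.message || fallback;
}
```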

**3. web-push Package Type Definitions**
**File:** [src/lib/server/notifications/PushNotificationService.ts](src/lib/server/notifications/PushNotificationService.ts#L8)
**Issue:** `import webpush from 'web-push'` causes TypeScript error in strict mode
**Research:**

- Package: web-push@3.6.7 (current in package.json)
- npm search: No @types/web-push package exists
- DefinitelyTyped: No type definitions available
- Library actively maintained but lacks TypeScript support

**Community Pattern:**

- [src/tests/push-notification-service.spec.ts](src/tests/push-notification-service.spec.ts#L3) already uses:
  ```typescript
  // @ts-expect-error - web-push doesn't have TypeScript types, but we mock it anyway
  import webpush from 'web-push';
  ```
- Pattern accepted: Use `@ts-expect-error` comment to suppress import error
- Justification: Package is stable, widely used, tested in production

**Alternative Considered:** Custom type definitions
**Rejected:** Out of scope for this JIRA. Would require:

- Defining interfaces for webpush.setVapidDetails, webpush.sendNotification
- PushSubscription structure mapping
- Error types (410 Gone, etc.)
- Estimated 50+ lines of type definitions

**Solution:** Add `// @ts-expect-error` comment above import, matching test file pattern

**4. Mock Type Safety in Vitest Strict Mode**
**File:** [src/tests/queue-processor.spec.ts](src/tests/queue-processor.spec.ts)
**Issue:** Mock return values use `as any` or incorrect types
**Specific Errors:**

**Error 1 (line 15):** web-push sendNotification return type

```typescript
// Current (incorrect)
sendNotification: vi.fn().mockResolvedValue({} as any);

// Actual signature: webpush.sendNotification returns Promise<void>
// Solution
sendNotification: vi.fn().mockResolvedValue(undefined);
```

**Error 2 (line 209):** extractRecipe null return violation

```typescript
// Current (incorrect)
vi.mocked(extractRecipe).mockResolvedValue(null);

// Actual signature: extractRecipe(text: string): Promise<Recipe>
// Does not explicitly allow null return
// Solution: Reject promise instead of returning null
vi.mocked(extractRecipe).mockRejectedValue(new Error('Failed to parse recipe from extracted text'));
```

**Remaining 3 errors:** Similar pattern (mock return types not matching function signatures)

- Lines to be identified: Likely other .mockResolvedValue calls with type mismatches
- Pattern: Replace `as any` with proper types, ensure mocks match actual signatures

**5. Parallelization Analysis**
**All 5 files are independent:**

- Different modules: API routes, queue processor, notifications, client, tests
- No shared compilation state
- No cross-file type dependencies for these specific changes
- Safe for parallel implementation

**Verification Commands:**

```bash
npx tsc --noEmit  # Must show 0 errors
npm run build     # Must succeed
npm test          # 267/279 pass (10 pre-existing failures in extractFromDOM)
npm audit         # Must show 0 vulnerabilities (preserved from iteration 0)
```

---

#### Files to Modify - RECIPE-0008 Iteration 0

**Primary Changes:**

1. **src/lib/server/extraction.ts** — Fix TypeScript strict mode errors
   - Add `CaptionCandidate` type definition (module-level)
   - Fix `bestCandidate` type narrowing with explicit assertion
   - Fix `extractCaptionFromGraphQL` parameter type (null → undefined)
   - Add `'graphql-intercept'` to `ExtractionMethod` union
   - Add `'graphql-intercept'` mapping to `getMethodDisplayName()`

2. **package-lock.json** (if needed) — Update after `npm audit fix`
   - Depends on npm audit results
   - May require manual version updates
   - Regenerate lockfile if breaking changes needed

**No Changes Needed:**

- `tsconfig.json` - strict mode already enabled
- `package.json` - dependencies are recent, await audit results
- Test files - existing tests should validate fixes

**Dependencies:**

- extraction.ts TypeScript fixes are independent
- npm audit fixes depend on audit output (sequential)
- Build/test must run after all fixes

**Parallelization:**

- TypeScript error fixes: All 3 changes in extraction.ts are independent
- npm audit: Sequential (must run audit first, then apply fixes)
- Verification: Sequential (after all fixes applied)

---

### [Planner] Research Notes - RECIPE-0009 (2026-02-18)

**Task:** Implement URL deduplication, automatic notification subscription, UI improvements, and notification redirect fix

#### Web Push API Permission Requirements - RECIPE-0009

**Research Date:** 2026-02-18
**Source:** W3C Push API Specification, MDN Web Docs, browser security policies, existing PushNotificationManager.ts implementation

**Security Requirement:**

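`Notification.requestPermission()` is defined in the Notifications API standard (WHATWG), not in the W3C Push API itself, and browsers such as Firefox and Safari only show the permission prompt in response to a **user gesture**. In practice it cannot be called programmatically without user interaction.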

**Browser Behavior:**

- **Permission States**: `"default"` (not requested), `"granted"` (allowed), `"denied"` (blocked)
- **User Gesture Required**: Click, tap, keypress triggers permission prompt
- **No Automatic Subscription**: Calling `requestPermission()` on page load, without a user gesture, is ignored or rejected depending on the browser
- **Best Practice**: Attach to meaningful user action (button click preferred)

**Implementation Pattern for "Automatic" Subscription:**

Since true automatic subscription violates browser security policy, the approach is:

1. Listen for **first user interaction** (click/touch) anywhere on page
2. Check notification state: supported, not denied, not subscribed
3. Call `pushNotificationManager.subscribe()` on first interaction
4. Remove listener after first attempt (one-shot behavior)

**Code Pattern:**

```typescript
function setupAutoSubscribe() {
    const attemptSubscribe = async () => {
        const state = pushNotificationManager.getState();
        if (state.supported && state.permission !== 'denied' && !state.subscribed) {
            await pushNotificationManager.subscribe();
        }
    };

    // Listen for first user interaction
    document.addEventListener('click', attemptSubscribe, { once: true });
    document.addEventListener('touchstart', attemptSubscribe, { once: true });
}
```

**Why This is "Best Practice" Automatic:**

- Requires minimal user action (any click/touch, not explicit "Enable" button)
- Non-intrusive (happens in background after natural interaction)
- Complies with the user-gesture requirement that browsers enforce for permission prompts
- Avoids annoying permission prompts on page load
- Mobile-friendly (touchstart event)

**Alternative Approaches Considered:**

1. **Prompt on page load** — REJECTED: Violates security policy, creates poor UX
2. **Delay with setTimeout** — REJECTED: Still violates user gesture requirement
3. **IntersectionObserver trick** — REJECTED: Does not satisfy user gesture requirement
4. **Explicit "Enable Notifications" button** — VALID but less automatic than requested

**Conclusion:** First-interaction subscription is the most automatic approach allowed by browser standards while maintaining user control.
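
One detail of the one-shot behavior is worth spelling out. Two listeners registered separately with `{ once: true }` each fire once on their own, so a tap that produces both `touchstart` and `click` runs the handler twice. The sketch below shows a hypothetical helper, `onFirstInteraction`, that detaches every listener as soon as any one of them fires. It is an illustration of the pattern, not the project's implementation.

```typescript
// Runs `handler` once, on whichever event in `eventNames` fires first,
// then detaches all listeners. Returns a function that cancels the setup.
function onFirstInteraction(
	target: EventTarget,
	eventNames: string[],
	handler: () => void
): () => void {
	function detach(): void {
		for (const name of eventNames) {
			target.removeEventListener(name, listener);
		}
	}
	function listener(): void {
		detach();
		handler();
	}
	for (const name of eventNames) {
		target.addEventListener(name, listener);
	}
	return detach;
}
```

In the page this could be wired as `onFirstInteraction(document, ['click', 'keydown'], () => void pushNotificationManager.subscribe())`. Which events count as a user gesture is browser-dependent; the HTML user-activation model lists `touchend` rather than `touchstart`, so the event names are worth verifying on target devices before detaching after the first one.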

**References:**

- W3C Push API: https://www.w3.org/TR/push-api/
- MDN Notification.requestPermission: https://developer.mozilla.org/en-US/docs/Web/API/Notification/requestPermission
- Existing implementation: [src/lib/client/PushNotificationManager.ts](src/lib/client/PushNotificationManager.ts#L123-L161)

---

#### Queue URL Deduplication Strategy - RECIPE-0009

**Research Date:** 2026-02-18
**Source:** QueueManager.ts architecture analysis, types.ts interface definitions, existing queue operations

**Current Queue Structure:**

```typescript
// QueueManager.ts line 44-45
private items: Map<string, QueueItem> = new Map();
```

- Storage: `Map<string, QueueItem>` with UUID keys
- No secondary index: URL lookups require linear search through values
- In-memory only: No persistence across server restarts
- Typical size: < 100 items (based on usage patterns)

**Deduplication Requirements:**

1. Check if URL already exists in queue before creating new item
2. If duplicate found: Return existing item, do NOT create new entry
3. API layer: Respond with `duplicate: true` and existing item details
4. Message level: Info (not error) - duplicate is expected behavior

**Implementation Approach:**

**Option A - Linear Search (Chosen):**

```typescript
findByUrl(url: string): QueueItem | undefined {
    for (const item of this.items.values()) {
        if (item.url === url) {
            return item;
        }
    }
    return undefined;
}
```

- **Complexity**: O(n) where n = queue size
- **Performance**: Negligible for n < 100 (at most about a hundred string comparisons, which should complete in well under a millisecond)
- **Simplicity**: No additional data structures, no risk of index desync
- **Consistency**: Single source of truth (items Map)

**Option B - Secondary URL Index (Rejected):**

```typescript
private items: Map<string, QueueItem> = new Map();
private urlIndex: Map<string, string> = new Map(); // url -> id
```

- **Complexity**: O(1) lookup, but requires maintaining two structures
- **Risk**: Index desync if remove() doesn't clean both Maps
- **Overhead**: 2x memory for keys, more complex implementation
- **Benefit**: Marginal for queue size < 1000

**Design Decision:** Option A (linear search) chosen for simplicity and reliability at current scale.

**API Response Format:**

```typescript
// Duplicate detected
{
    duplicate: true,
    message: "This recipe is already in the queue",
    item: { id, url, status, enqueuedAt }
}

// New item
{
    duplicate: false,
    item: { id, url, status, enqueuedAt }
}
```

**User Experience:**

- Frontend checks `response.duplicate === true`
- Shows info toast: "This recipe is already in queue [View]"
- No error state, no failed request
- Links to existing queue item

**Edge Cases Handled:**

1. **Multiple rapid requests**: First wins, rest return duplicate
2. **URL normalization**: URLs compared as-is (no normalization in v1)
3. **Completed items**: Duplicates found even if status is success/error
4. **Retry scenario**: Retry uses existing queue item ID, not new URL submission

**Future Considerations:**

- URL normalization (trailing slash, query params, fragments)
- Time-based deduplication window (only check items from last N hours)
- Content-based deduplication (recipe fingerprint from parsed data)
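
As an illustration of the first item, a normalizer could run before the URL comparison. The sketch below is one possible rule set, not a decided design: it assumes that query parameters and fragments never identify a different post, which would need to be confirmed for Instagram URLs. `normalizeRecipeUrl` is a hypothetical name.

```typescript
// Hypothetical normalizer: drops the query string and fragment, strips trailing
// slashes from the path, and lowercases the host.
function normalizeRecipeUrl(raw: string): string {
	const url = new URL(raw);
	const path = url.pathname.replace(/\/+$/, '');
	return `${url.protocol}//${url.host.toLowerCase()}${path}`;
}
```

With this in place, `findByUrl` would compare `normalizeRecipeUrl(item.url)` against the normalized input, so a share link with tracking parameters matches the plain post URL.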

**References:**

- QueueManager implementation: [src/lib/server/queue/QueueManager.ts](src/lib/server/queue/QueueManager.ts#L44-L95)
- QueueItem type definition: [src/lib/server/queue/types.ts](src/lib/server/queue/types.ts#L57-L100)

---

#### Service Worker Notification Data Flow - RECIPE-0009

**Research Date:** 2026-02-18
**Source:** Code analysis of notification pipeline from QueueProcessor → PushNotificationService → Service Worker

**Notification Payload Journey:**

**Step 1: QueueProcessor sends notification (Lines 418-420)**

```typescript
await pushNotificationService.notifySuccess(
    item.id,
    item.results?.recipe?.name,
    item.results?.tandoorUrl  // ← tandoorUrl passed here
);
```

**Step 2: PushNotificationService creates payload (Lines 162-181)**

```typescript
const payload: NotificationPayload = {
    type: 'success',
    itemId,
    recipeName,
    body: recipeName ? `Recipe "${recipeName}" has been extracted...` : ...,
    tag: `recipe-success-${itemId}`,
    requireInteraction: true,
    analytics: { ... }
};

if (tandoorUrl) {
    payload.body += ' View it in Tandoor.';
    // Note: tandoorUrl NOT explicitly added to payload object
}
```

**Issue Found:** `tandoorUrl` parameter received but **not stored in payload object**!

**Step 3: Service Worker receives push event (Line 123)**

```typescript
data = event.data.json(); // ← Payload becomes data object
```

**Step 4: Notification created with data (Lines 130-136)**

```typescript
const options: NotificationOptions = {
    body: data.body,
    data: data, // ← Full payload stored in data field
    // ...
};
```

**Step 5: Click handler accesses data (Lines 183-191)**

```typescript
const data = event.notification.data;
const action = event.action;

if (action === 'view' && data?.itemId) {
    url = `/?highlight=${data.itemId}`;
}
```

**Current Bug:** `data.tandoorUrl` is undefined because `PushNotificationService.notifySuccess()` doesn't add it to payload.

**Fix Required in PushNotificationService.ts (Lines 162-181):**

```typescript
const payload: NotificationPayload = {
    type: 'success',
    itemId,
    recipeName,
    tandoorUrl, // ← Add this line
    body: recipeName ? `Recipe "${recipeName}" has been extracted...` : ...,
    tag: `recipe-success-${itemId}`,
    requireInteraction: true,
    analytics: { ... }
};
```

**Then Service Worker Can Use It:**

```typescript
if (action === 'view' && data?.tandoorUrl) {
    url = data.tandoorUrl; // Redirect to Tandoor
} else if (action === 'view' && data?.itemId) {
    url = `/?highlight=${data.itemId}`; // Fallback to dashboard
}
```

**NotificationPayload Interface Update Required:**

```typescript
// Line 20-28 in PushNotificationService.ts
interface NotificationPayload {
    title?: string;
    body: string;
    type: 'success' | 'error' | 'progress';
    itemId: string;
    recipeName?: string;
    tandoorUrl?: string; // ← Add this line
    tag?: string;
    requireInteraction?: boolean;
    analytics?: any;
}
```

**Verification:**

- QueueProcessor already passes `item.results?.tandoorUrl` correctly
- `item.results.tandoorUrl` is set by QueueProcessor line 329-331 when Tandoor upload succeeds
- Format: `${TANDOOR_BASE_URL}/view/recipe/${recipeId}`
- Example: `https://tandoor.example.com/view/recipe/123`
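
The redirect decision in the click handler can be pulled out into a pure function, which makes the Tandoor-first, dashboard-second ordering easy to unit test without a service worker environment. A minimal sketch follows; `resolveClickUrl` and the `'/'` default are assumptions, since the handler's actual default URL is not shown in these notes.

```typescript
// Subset of the notification payload that the click handler reads.
interface NotificationClickData {
	itemId?: string;
	tandoorUrl?: string;
}

function resolveClickUrl(
	action: string,
	data: NotificationClickData | undefined,
	fallback = '/' // assumed default destination
): string {
	if (action === 'view' && data?.tandoorUrl) {
		return data.tandoorUrl; // redirect to the recipe in Tandoor
	}
	if (action === 'view' && data?.itemId) {
		return `/?highlight=${encodeURIComponent(data.itemId)}`; // dashboard fallback
	}
	return fallback;
}
```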

**References:**

- QueueProcessor notification call: [src/lib/server/queue/QueueProcessor.ts](src/lib/server/queue/QueueProcessor.ts#L418-L420)
- PushNotificationService: [src/lib/server/notifications/PushNotificationService.ts](src/lib/server/notifications/PushNotificationService.ts#L158-L183)
- Service Worker push handler: [src/service-worker.ts](src/service-worker.ts#L112-L170)
- Service Worker click handler: [src/service-worker.ts](src/service-worker.ts#L176-L207)

---

#### Homepage UI Component Visibility Analysis - RECIPE-0009

**Research Date:** 2026-02-18
**Source:** +page.svelte component structure analysis

**Current Behavior:**

**Add Recipe Component Locations:**

1. **Empty State** (Lines 280-302): Shows when `!loading && filteredItems.length === 0`

```svelte
{#if !loading && filteredItems.length === 0}
    <div class="text-center py-12">
        <!-- ... -->
        <a href="/share" class="...">
            Add Recipe URL
        </a>
    </div>
{/if}
```

2. **No Persistent Component**: When queue has items, no "Add Recipe" button visible

**User Complaint:** "Do not hide the add recipe component when there are items in the queue"

**Issue:** Add recipe link only appears in empty state conditional block.

**Solution:** Add persistent "Add Recipe" button to action bar (always visible)

**Implementation Location:** Lines 224-254 (Action Bar section)

**Before:**

```svelte
<div class="mb-6 flex flex-col sm:flex-row gap-4 justify-between items-start sm:items-center">
    <div class="flex flex-wrap gap-2">
        <!-- Filter Tabs -->
        {#each filters as filterOption}
            <button>...</button>
        {/each}
    </div>

    <!-- Refresh Button -->
    <button>...</button>
</div>
```

**After:**

```svelte
<div class="mb-6 flex flex-col sm:flex-row gap-4 justify-between items-start sm:items-center">
    <div class="flex items-center gap-4">
        <!-- Filter Dropdown -->
        <select>...</select>

        <!-- Refresh Button -->
        <button>...</button>
    </div>

    <!-- Add Recipe Button (ALWAYS VISIBLE) -->
    <a href="/share" class="...">
        Add Recipe URL
    </a>
</div>
```

**Benefits:**

- Always accessible regardless of queue state
- Consistent UI (no disappearing elements)
- Better UX for power users (add multiple recipes quickly)
- Maintains empty state link for discoverability

**Filter Consolidation Rationale:**

Current filter tabs take significant horizontal space (5 buttons). Consolidating to dropdown:

- Frees space for persistent "Add Recipe" button
- Keeps filter + refresh on same row (per requirement)
- Mobile-friendly (dropdown vs. wrapping buttons)
- Still shows item counts in dropdown options
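
To keep item counts visible in the dropdown, the option labels can be derived from the queue items. The sketch below is an assumption-heavy illustration: `buildFilterOptions` is a hypothetical helper, and the status names are passed in as a parameter because the real filter values are not listed in these notes.

```typescript
interface FilterOption {
	value: string;
	label: string;
}

// Builds an "All (n)" option followed by one option per status, each with its count.
function buildFilterOptions(items: { status: string }[], statuses: string[]): FilterOption[] {
	const counts = new Map<string, number>();
	for (const item of items) {
		counts.set(item.status, (counts.get(item.status) ?? 0) + 1);
	}
	const capitalize = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);
	return [
		{ value: 'all', label: `All (${items.length})` },
		...statuses.map((s) => ({ value: s, label: `${capitalize(s)} (${counts.get(s) ?? 0})` }))
	];
}
```

The result maps directly onto `<option value={opt.value}>{opt.label}</option>` inside the `<select>`.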

**References:**

- Homepage component: [src/routes/+page.svelte](src/routes/+page.svelte#L215-L302)
- Empty state section: [src/routes/+page.svelte](src/routes/+page.svelte#L280-L302)

---

### [Planner] Research Notes - RECIPE-0009 Iteration 1 (2026-02-18)

**Task:** UI enhancements - footer status bar, icon-only buttons, toggle Add Recipe visibility

#### Current Homepage UI Structure Analysis

**Research Date:** 2026-02-18
**Source:** Analysis of [src/routes/+page.svelte](src/routes/+page.svelte), iteration 0 implementation

**Current Implementation (Iteration 0)**:

1. **Connection Status Widget** (lines 369-383):
   - Fixed position: bottom-right (`fixed bottom-4 right-4`)
   - Shows connection status with colored dot + text label
   - Shows last ping timestamp
   - Will be REMOVED and replaced with footer bar

2. **Action Bar** (lines 263-297):
   - Filter dropdown (lines 266-276)
   - Refresh button with icon + text (lines 277-285)
   - Add Recipe button with icon + text (lines 288-297)
   - Currently: Add Recipe button ALWAYS visible (iteration 0 requirement)

3. **Empty State** (lines 310-342):
   - Shows when `!loading && filteredItems.length === 0`
   - Contains "Add Recipe URL" link

**Changes Required for Iteration 1**:

1. Remove floating connection status widget
2. Add footer status bar (icons only)
3. Convert refresh button to icon-only
4. Convert Add Recipe button to icon-only
5. Toggle Add Recipe button visibility (hide when empty, show when has items)

---

#### Footer Status Bar Design - RECIPE-0009 Iteration 1

**Research Date:** 2026-02-18
**Source:** Web PWA patterns, existing codebase styling patterns

**Design Requirements**:

- **Position**: Fixed at bottom (`fixed bottom-0 left-0 right-0`)
- **Layout**: Full width with max-width container matching page layout (`max-w-6xl`)
- **Content**: Two sections (notification status left, live updates right)
- **Display**: Icons only, no text labels
- **Accessibility**: title and aria-label attributes on interactive elements
- **Z-index**: `z-50` to ensure visibility above all content
- **Visual**: White background, top border, shadow for lift effect

**State Integration**:

Footer needs access to two state sources:

1. **Notification Status**: Via `pushNotificationManager.getState()`
   - Need to add `notificationViewModel` state variable in +page.svelte
   - Subscribe to state changes in `onMount`
   - Clean up the subscription when the component unmounts (for example via the `onMount` return callback)

2. **Connection Status**: Already exists as `connectionStatus` state
   - Reuse existing variable
   - States: 'connecting' | 'connected' | 'disconnected'

**Notification Icon Logic**:

```typescript
if (!supported || permission === 'denied') {
  // Show bell with slash (not supported/denied)
  icon = 'bell-slash';
  color = 'text-gray-400';
} else if (subscribed) {
  // Show bell icon (enabled)
  icon = 'bell';
  color = 'text-green-600';
} else {
  // Show bell icon (available but not enabled)
  icon = 'bell';
  color = 'text-gray-400';
}
```

**Live Update Indicator Logic**:

```typescript
if (connectionStatus === 'connected') {
  dotColor = 'bg-green-400';
  title = 'Live updates active';
} else if (connectionStatus === 'connecting') {
  dotColor = 'bg-yellow-400';
  title = 'Connecting to live updates...';
} else {
  dotColor = 'bg-red-400';
  title = 'Live updates disconnected';
}
```
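
Both decision tables above can also be written as pure functions, which keeps the markup free of branching and makes the mapping testable. This is a sketch only: `notificationIcon` and `liveIndicator` are hypothetical names, and the return shapes are chosen for illustration.

```typescript
interface NotificationFlags {
	supported: boolean;
	permission: 'default' | 'granted' | 'denied';
	subscribed: boolean;
}

type ConnectionStatus = 'connecting' | 'connected' | 'disconnected';

function notificationIcon(state: NotificationFlags): { icon: string; color: string } {
	if (!state.supported || state.permission === 'denied') {
		return { icon: 'bell-slash', color: 'text-gray-400' }; // not supported or denied
	}
	if (state.subscribed) {
		return { icon: 'bell', color: 'text-green-600' }; // enabled
	}
	return { icon: 'bell', color: 'text-gray-400' }; // available but not enabled
}

function liveIndicator(status: ConnectionStatus): { dotColor: string; title: string } {
	switch (status) {
		case 'connected':
			return { dotColor: 'bg-green-400', title: 'Live updates active' };
		case 'connecting':
			return { dotColor: 'bg-yellow-400', title: 'Connecting to live updates...' };
		default:
			return { dotColor: 'bg-red-400', title: 'Live updates disconnected' };
	}
}
```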

**Click Behavior**:

Clicking notification icon scrolls to NotificationSettings component:
```svelte
onclick={() => {
  document.querySelector('[data-notification-settings]')?.scrollIntoView({ behavior: 'smooth' });
}}
```

Requires adding `data-notification-settings` attribute to NotificationSettings wrapper.

---

#### Icon-Only Button Patterns - RECIPE-0009 Iteration 1

**Research Date:** 2026-02-18
**Source:** Existing codebase button styles, Tailwind CSS documentation, WCAG 2.1 guidelines

**Current Button Pattern (with text)**:

```svelte
<button class="flex items-center space-x-2 px-4 py-2 ...">
  <svg class="w-4 h-4" ... />
  <span>Button Text</span>
</button>
```

- Padding: `px-4 py-2` (horizontal + vertical)
- Icon size: `w-4 h-4` (16x16px)
- Spacing: `space-x-2` (gap between icon and text)

**Icon-Only Button Pattern**:

```svelte
<button
  title="Button description"
  aria-label="Button description"
  class="p-2 ..."
>
  <svg class="w-5 h-5" ... />
</button>
```

**Changes**:
- Padding: `p-2` (square/circular button)
- Icon size: `w-5 h-5` (20x20px - slightly larger for better visibility)
- Remove: `space-x-2` class (no text to space from)
- Add: `title` attribute (tooltip on hover)
- Add: `aria-label` attribute (screen reader accessibility)

**Accessibility Requirements** (WCAG 2.1):

1. **Title Attribute**: Provides tooltip text for sighted users on hover
2. **Aria-label Attribute**: Provides accessible name for screen readers
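3. **Minimum Touch Target**: 24x24 CSS px is the WCAG 2.2 minimum (SC 2.5.8); WCAG 2.1 SC 2.5.5 (level AAA) asks for 44x44. The icon-only button measures 20px icon + 2 × 8px padding = 36x36px, which clears the 24px minimum ✓ but not the 44px AAA target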
4. **Color Contrast**: Must meet 3:1 ratio for non-text (icons)
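
The touch-target arithmetic and the 3:1 check can be verified numerically. The sketch below implements the WCAG relative-luminance and contrast-ratio formulas for 6-digit hex colors. The function names are hypothetical, and any hex values fed into it for Tailwind classes are assumptions that depend on the configured palette.

```typescript
// Converts an 8-bit sRGB channel to its linear value (WCAG relative luminance definition).
function channelToLinear(value8bit: number): number {
	const c = value8bit / 255;
	return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex color such as '#2563eb'.
function relativeLuminance(hex: string): number {
	const n = parseInt(hex.replace('#', ''), 16);
	const r = (n >> 16) & 0xff;
	const g = (n >> 8) & 0xff;
	const b = n & 0xff;
	return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

// Contrast ratio = (lighter + 0.05) / (darker + 0.05); ranges from 1 to 21.
function contrastRatio(hexA: string, hexB: string): number {
	const la = relativeLuminance(hexA);
	const lb = relativeLuminance(hexB);
	return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Rendered size of a square icon button: icon plus padding on both sides.
function touchTargetPx(iconPx: number, paddingPx: number): number {
	return iconPx + 2 * paddingPx;
}
```

For the buttons above, `touchTargetPx(20, 8)` gives the 36px figure, and `contrastRatio(iconColor, backgroundColor) >= 3` is the non-text contrast check.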

**Examples**:

Refresh button:
```svelte
<button
  title="Refresh queue"
  aria-label="Refresh queue"
  class="p-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 ..."
>
  <svg class="w-5 h-5" ... />
</button>
```

Add Recipe button:
```svelte
<a
  href="/share"
  title="Add recipe URL"
  aria-label="Add recipe URL"
  class="p-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 ..."
>
  <svg class="w-5 h-5" ... />
</a>
```

---

#### Add Recipe Button Visibility Logic - RECIPE-0009 Iteration 1

**Research Date:** 2026-02-18
**Source:** context_compact.yaml requirement analysis, UX patterns

**Iteration 0 Implementation**:
- Add Recipe button ALWAYS visible in controls bar
- Rationale: User complained "do not hide the add recipe component when there are items in the queue"

**Iteration 1 Requirement**:
> "Toggle "Add Recipe" button visibility in controls bar (hide when queue empty, show when items exist - opposite of placeholder rule)"

**Interpretation**:

"Opposite of placeholder rule":
- Placeholder (empty state) shows when: `!loading && filteredItems.length === 0`, which for a loaded, unfiltered queue reduces to `items.length === 0`
- Add Recipe button in controls shows when: `items.length > 0` (opposite condition)

**Logic**:

```svelte
{#if items.length > 0}
  <a href="/share" title="Add recipe URL" aria-label="Add recipe URL" ...>
    <svg ... />
  </a>
{/if}
```

**Rationale**:

1. **Empty State**: When queue is empty, user sees empty state with centered "Add Recipe URL" link
2. **Non-Empty State**: When queue has items, controls bar shows Add Recipe button (icon-only)
3. **No Redundancy**: Button doesn't appear when empty state link is already visible
4. **Consistent Access**: User always has access to "Add Recipe" via either empty state link OR controls bar button

**UX Benefits**:

- Cleaner UI when queue is empty (no redundant button)
- Convenient access when queue has items (quick add more recipes)
- Fulfills opposite condition of empty state placeholder
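
The two conditions are keyed on different lists (`filteredItems` for the empty state, `items` for the controls bar), so they are not exact opposites in every state. The sketch below makes that explicit; `addRecipeVisibility` is a hypothetical helper that simply restates the two template conditions.

```typescript
interface AddRecipeVisibility {
	emptyStateLink: boolean;
	controlsBarButton: boolean;
}

function addRecipeVisibility(
	loading: boolean,
	itemCount: number,
	filteredCount: number
): AddRecipeVisibility {
	return {
		// Empty-state block: {#if !loading && filteredItems.length === 0}
		emptyStateLink: !loading && filteredCount === 0,
		// Controls bar: {#if items.length > 0}
		controlsBarButton: itemCount > 0
	};
}
```

Two states stand out: while loading with no items, neither entry point is rendered; and when a filter hides every item of a non-empty queue, both are rendered. Whether either case matters is a product decision.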

---

#### Svelte 5 Notification State Management

**Research Date:** 2026-02-18
**Source:** Existing iteration 0 implementation, [PushNotificationManager.ts](src/lib/client/PushNotificationManager.ts)

**NotificationState Type**:

```typescript
interface NotificationState {
  supported: boolean;
  permission: NotificationPermission; // 'default' | 'granted' | 'denied'
  subscribed: boolean;
  loading: boolean;
  error: string | null;
}
```

**State Subscription Pattern**:

```typescript
// Import type
import type { NotificationState } from '$lib/client/PushNotificationManager';

// Declare state
let notificationViewModel = $state<NotificationState | null>(null);

// Subscribe in onMount
onMount(() => {
  // ... existing code ...

  const unsubscribeNotifications = pushNotificationManager.onStateChange((newState) => {
    notificationViewModel = newState;
  });

  return () => {
    unsubscribeNotifications?.();
  };
});
```

**Cleanup in onDestroy**:

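The current `onDestroy` only cleans up `eventSource`, and it can stay that way: the notification subscription is released by the cleanup function returned from `onMount`, so no extra code is needed here: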

```typescript
onDestroy(() => {
  if (eventSource) {
    console.log('[SSE] Closing connection on component destroy');
    eventSource.close();
    connectionStatus = 'disconnected';
  }
  // No cleanup needed - handled by onMount return callback
});
```

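**Note**: The function returned from Svelte 5's `onMount` callback runs automatically when the component unmounts. This only works when the callback is synchronous; an `async` callback returns a Promise, so its cleanup function is never called.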

**State Access in Footer**:

Footer component needs null-safe access since initial state is `null`:

```svelte
{#if notificationViewModel}
  {#if !notificationViewModel.supported || notificationViewModel.permission === 'denied'}
    <!-- Show disabled icon -->
  {:else if notificationViewModel.subscribed}
    <!-- Show enabled icon -->
  {:else}
    <!-- Show available icon -->
  {/if}
{:else}
  <!-- Loading state - show gray icon -->
  <svg class="w-5 h-5 text-gray-400" ... />
{/if}
```

**Initial State Handling**:

`pushNotificationManager.onStateChange()` sends initial state immediately on subscription, so `notificationViewModel` will be populated almost instantly after component mount.
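
The subscribe-with-immediate-emission contract can be sketched as a small observable holder. This is not the actual `PushNotificationManager` code; `StateEmitter` is a hypothetical class that only reproduces the behavior described above: call the listener right away with the current state, then on every change, and return an unsubscribe function.

```typescript
type StateListener<T> = (state: T) => void;

class StateEmitter<T> {
	private listeners = new Set<StateListener<T>>();
	private state: T;

	constructor(initial: T) {
		this.state = initial;
	}

	getState(): T {
		return this.state;
	}

	setState(next: T): void {
		this.state = next;
		this.listeners.forEach((listener) => listener(next));
	}

	// Emits the current state immediately, then on every change.
	onStateChange(listener: StateListener<T>): () => void {
		this.listeners.add(listener);
		listener(this.state);
		return () => {
			this.listeners.delete(listener);
		};
	}
}
```

Under this contract the `null` initial value of `notificationViewModel` is only observable until the subscription line in `onMount` has run.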

---

**Document Version:** 3.0
**Last Updated by:** Planner Agent (RECIPE-0009 Iteration 1)
**Next Update:** Developer Agent

---

# Session Findings: Instagram Extraction & Production Lessons

*Recorded during active development sessions (2025–2026). These are hard-won discoveries from real debugging — not theoretical analysis.*

---

## Instagram: Caption Truncation in Web GraphQL API

**Symptom:** LLM says "no recipe found" even though the full recipe IS in the Instagram caption.

**Root cause:** Instagram's web GraphQL API (`doc_id=8845758582119845`) silently truncates captions in `edge_media_to_caption.edges[0].node.text`. Truncation is **inconsistent**:
- Sometimes ends with `….` (Unicode U+2026 + period)
- Sometimes cuts off mid-sentence with no marker at all

Known examples:
- `DWWxiymssxE`: GraphQL returns 327 chars, full caption is 393 chars (no truncation marker)
- `DXT73izCBoH`: GraphQL returns 744 chars, cuts off mid-sentence `"Versa nella tortiera co'"`

**Fix:** Never trust the GraphQL-intercepted caption. Always use DOM extraction (`extractWithStrategies` → `extractFromHTMLSection` → `tryExpandCaptionInHTMLSection` clicks "… more" button). Keep the intercepted GraphQL caption only as an emergency fallback when DOM extraction fails entirely.

**Key lesson:** The `….` suffix check is **not sufficient** to detect truncation. The only reliable approach is to always go through the DOM.
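
The selection rule above can be sketched as a small pure function. `chooseCaption` and its types are hypothetical names for illustration and are not taken from the extractor code; the sketch assumes both candidate captions have already been collected.

```typescript
interface CaptionSources {
  domCaption: string | null; // text read from the expanded DOM
  graphqlCaption: string | null; // intercepted API text, possibly truncated
}

interface CaptionChoice {
  text: string | null;
  source: 'dom' | 'graphql-fallback' | 'none';
}

// Hypothetical helper: the DOM caption always wins, even when it is shorter.
// The GraphQL caption is used only when DOM extraction produced nothing.
function chooseCaption({ domCaption, graphqlCaption }: CaptionSources): CaptionChoice {
  const dom = domCaption?.trim();
  if (dom) return { text: dom, source: 'dom' };

  const gql = graphqlCaption?.trim();
  if (gql) return { text: gql, source: 'graphql-fallback' };

  return { text: null, source: 'none' };
}
```

Returning the `source` alongside the text makes it possible to log when the emergency fallback was used, which is when a truncated caption is most likely.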

---

## Instagram: Mobile API vs GraphQL API (yt-dlp behavior)

**How yt-dlp selects which API to call:**
1. If a `sessionid` cookie is present → calls `https://i.instagram.com/api/v1/media/{PK}/info/` (mobile API)
2. If the mobile API call fails (or there is no `sessionid`) → falls back to GraphQL `doc_id=8845758582119845`

**Mobile API User-Agent:**
- Desktop UA → HTTP 404
- Instagram Android UA → HTTP 200 with full response
- The `--user-agent` CLI flag only affects video download requests, **not** API calls — yt-dlp uses its own hardcoded headers for API calls

**Mobile API also truncates:** Even with a valid sessionid and HTTP 200, `caption.text` in the mobile API response can still be truncated. DOM extraction is the only fully reliable source.

**Shortcode → PK conversion:**
```python
def shortcode_to_pk(sc):
    """Decode an Instagram shortcode (base-64 digits, URL-safe alphabet) into its numeric PK."""
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'
    n = 0
    for c in sc:
        n = n * 64 + alphabet.index(c)
    return n
```
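
A TypeScript port is sketched below for use on the server side. `shortcodeToPk` is a hypothetical name; the only addition over the Python version is `BigInt`, since an 11-character shortcode encodes up to 66 bits, which is more than a JavaScript `number` can represent exactly.

```typescript
const SHORTCODE_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Hypothetical helper: decodes a shortcode as a base-64 number.
// BigInt is used because the result can exceed Number.MAX_SAFE_INTEGER.
function shortcodeToPk(shortcode: string): bigint {
  const base = BigInt(64);
  let pk = BigInt(0);
  for (const char of shortcode) {
    const digit = SHORTCODE_ALPHABET.indexOf(char);
    if (digit === -1) {
      throw new Error(`Invalid shortcode character: ${char}`);
    }
    pk = pk * base + BigInt(digit);
  }
  return pk;
}
```

Calling `shortcodeToPk(code).toString()` gives a decimal string that can be substituted for `{PK}` in the mobile API URL.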

---

## Instagram: Creator-Written `….` vs API Truncation

**Gotcha:** Some creators intentionally end their captions with `….` or `#seriesname….` as a signature or series marker. This is NOT API truncation.

**Example:** Reel `DW5zH3xjY-_` ("5030 LOW CAL 💪") — the `….` is written by the creator as a series signature. The reel has only 213 chars of real content and no recipe.

**Implication:** Never use `….` suffix as the primary signal to fetch more content — always use DOM extraction regardless.

---

## Instagram: cookies.txt vs auth.json — Session Management

**Two auth formats coexist:**
- `secrets/auth.json` — Playwright `storageState` format (JSON, cookies + origins)
- `secrets/cookies.txt` — Netscape format for yt-dlp

**yt-dlp overwrites cookies.txt** after each extraction, removing `sessionid`. `maybeConvertAuthJson()` regenerates the file from `auth.json` before every yt-dlp call, so this is safe in normal operation. Inspecting cookies.txt directly between runs will, however, show the reduced file without `sessionid`.

**`sessionid` is critical.** Without it:
- yt-dlp mobile API returns HTTP 404 (empty response)
- Falls back to GraphQL → truncated caption

**Auth scheduler:** `scheduler.ts` runs every 15 minutes to renew the session by navigating to Instagram. Verify with logs: `[Scheduler] Instagram authentication renewed successfully`.
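
For reference, converting Playwright `storageState` cookies into Netscape format can be sketched as follows. This is not the project's `maybeConvertAuthJson()`; `toNetscapeCookies` is a hypothetical function, and the `#HttpOnly_` prefix follows the curl convention, which is an assumption about what the consuming tool accepts.

```typescript
// Subset of the cookie fields stored in a Playwright storageState file.
interface StorageStateCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // seconds since epoch; -1 for session cookies
  httpOnly: boolean;
  secure: boolean;
}

// Hypothetical converter: one tab-separated line per cookie with the fields
// domain, include-subdomains flag, path, secure flag, expiry, name, value.
function toNetscapeCookies(cookies: StorageStateCookie[]): string {
  const lines = cookies.map((c) => {
    const domain = (c.httpOnly ? '#HttpOnly_' : '') + c.domain;
    const includeSubdomains = c.domain.startsWith('.') ? 'TRUE' : 'FALSE';
    const expiry = c.expires > 0 ? Math.floor(c.expires) : 0; // 0 = session cookie
    return [
      domain,
      includeSubdomains,
      c.path,
      c.secure ? 'TRUE' : 'FALSE',
      expiry,
      c.name,
      c.value
    ].join('\t');
  });
  return ['# Netscape HTTP Cookie File', ...lines].join('\n') + '\n';
}
```

A quick check after conversion is whether the output still contains a `sessionid` line, since that cookie decides whether the mobile API is used at all.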

---

## Instagram: Playwright Browser Session Expiry (independent of cookies)

**Symptom:** Playwright navigates to Instagram, sees a profile selector ("Continue as …"), clicks Continue, gets redirected to `/accounts/login/`.

**Root cause:** The `sessionid` cookie is valid for API calls but the browser-level session can expire independently. Instagram shows the profile selector as a soft prompt which, when clicked, triggers a re-auth that fails with a stale session.

**Diagnosis:**
- `svg[aria-label="Home"]` found → session valid ✅
- `(N) Instagram` in title with notifications count → logged in ✅
- Profile selector visible → session expired, need to re-authenticate

**Fix:** Re-authenticate by updating `auth.json` with a fresh login from a real browser session and copying to the volume at `/home/moze/Server/stacks/insta-recipe/data/secrets/auth.json`.
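
The diagnosis signals above can be combined into one check, sketched here as a pure function. `classifySession` and `SessionSignals` are hypothetical names, and the precedence (profile selector first) is an assumption; the signals would be collected separately, for example by querying `svg[aria-label="Home"]` and reading the page title.

```typescript
type SessionStatus = 'valid' | 'expired' | 'unknown';

interface SessionSignals {
  hasHomeIcon: boolean; // svg[aria-label="Home"] was found
  pageTitle: string; // e.g. "(3) Instagram"
  hasProfileSelector: boolean; // "Continue as ..." prompt is visible
}

// Hypothetical helper: the profile selector is checked first because it
// indicates an expired browser session even when the cookie looks valid.
function classifySession(signals: SessionSignals): SessionStatus {
  if (signals.hasProfileSelector) return 'expired';
  if (signals.hasHomeIcon) return 'valid';
  if (/^\(\d+\)\s*Instagram/.test(signals.pageTitle)) return 'valid';
  return 'unknown';
}
```

An `'unknown'` result is kept separate from `'expired'` so that a slow page load is not mistaken for a dead session.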

---

## Instagram: DOM Extraction Strategy Order (2025/2026)

`extractWithStrategies` tries 6 approaches in order. Only one reliably works now:

| Strategy | Status | Reason |
|---|---|---|
| `embedded-json` | ❌ Fails | Instagram removed `window.__additionalDataLoaded` |
| `internal-state` | ❌ Fails | Instagram removed `window._sharedData` |
| `html-section` | ✅ Works | DOM extraction + "… more" button click |
| `dom-selector` | ⚠️ Partial | Simpler DOM query, may miss truncated captions |
| `graphql-api` | ⚠️ Truncated | Live interception but caption is still truncated |
| `legacy` | ❌ Fails | Old format gone |

**Note:** Clicking "… more" triggers feed-loading GraphQL calls (`xdt_api__v1__clips__home__connection_v2`) as a side effect. The full text comes purely from the expanded DOM, not a network response.
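
The ordered-fallback idea behind `extractWithStrategies` can be sketched as below. This is a simplified illustration, not the actual implementation: `runStrategiesInOrder` and the `ExtractionStrategy` shape are hypothetical, and each strategy is assumed to resolve to a caption or `null`.

```typescript
interface ExtractionStrategy {
  name: string;
  run: () => Promise<string | null>;
}

interface ExtractionResult {
  strategy: string;
  caption: string;
}

// Hypothetical runner: tries each strategy in order and returns the first
// non-empty caption. A strategy that throws is logged and skipped.
async function runStrategiesInOrder(
  strategies: ExtractionStrategy[]
): Promise<ExtractionResult | null> {
  for (const strategy of strategies) {
    try {
      const caption = await strategy.run();
      if (caption && caption.trim().length > 0) {
        return { strategy: strategy.name, caption };
      }
    } catch (error) {
      console.warn(`[extract] ${strategy.name} failed:`, error);
    }
  }
  return null;
}
```

With this shape the order of the array is the whole policy, so putting a partial or truncating strategy ahead of `html-section` would let it win whenever it returns any text.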

---

## LLM: phi4-mini Recipe Detection Too Strict

**Problem:** phi4-mini rejected valid Italian Instagram recipe posts as "no recipe found" during detection.

**Root cause:** Detection prompt required quantities + at least 2 steps. Italian Instagram posts often:
- Omit explicit quantities (just list ingredients by name)
- Say "full recipe at link in bio" with no steps at all

**Detection prompt evolution:**
- v1: title + 3 ingredients with quantities + 2 steps
- v2: title + 3 ingredients (no quantities) + 1 step
- v3 (current): title + 2 ingredients, NO step requirement

**Lesson:** If it reads like food content with at least 2 named ingredients, say yes.
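
The v3 criteria are enforced through the prompt, but they can also be written down as a plain check. The sketch below is hypothetical: `DetectionCandidate` and `meetsV3Criteria` are not part of the codebase, and it assumes the LLM output has already been parsed into a title, ingredient names, and steps.

```typescript
interface DetectionCandidate {
  title: string | null;
  ingredients: string[];
  steps: string[];
}

// Hypothetical check mirroring v3: a title plus at least two named
// ingredients. Quantities and steps are deliberately not required.
function meetsV3Criteria(candidate: DetectionCandidate): boolean {
  const hasTitle = (candidate.title ?? '').trim().length > 0;
  const namedIngredients = candidate.ingredients.filter(
    (ingredient) => ingredient.trim().length > 0
  );
  return hasTitle && namedIngredients.length >= 2;
}
```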

---

## LLM: gemma4 Thinking Models Behavior

**gemma4 models on llama-swap (`http://192.168.1.50:8080`):**
- `gemma4-e2b-q8_0` — smaller/faster
- `gemma4-e4b-q6k` — better quality (production model)
- `gemma4-26b-moe-iq4xs`, `granite-3.3-8b-q6k`, `deepseek-r1-8b-q6k` also available

**gemma4 is a "thinking" model:** Outputs internal reasoning before the actual answer.

With `max_tokens: 1024`, the model skips most of its reasoning and puts the answer directly in `content`. The `reasoning_content` fallback in `parser.ts` covers the edge cases where `content` is empty.

**vs phi4-mini:** phi4-mini is more literal and strict. For permissive recipe detection of Italian informal posts, gemma4 is significantly better.
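
The `reasoning_content` fallback mentioned above can be sketched as follows. This is an illustration rather than the code in `parser.ts`; `extractAnswerText` is a hypothetical name and the message shape is reduced to the two fields discussed here.

```typescript
// Reduced view of a chat completion message from a thinking model.
interface ChatMessage {
  content?: string | null;
  reasoning_content?: string | null;
}

// Hypothetical helper: prefer `content`, and fall back to
// `reasoning_content` only when `content` is missing or blank.
function extractAnswerText(message: ChatMessage): string {
  const content = message.content?.trim();
  if (content) return content;
  return message.reasoning_content?.trim() ?? '';
}
```

An empty string return means neither field held text, which the caller can treat as a failed completion instead of parsing it.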

---

## Tandoor: Steps Required to Save Ingredients

**Symptom:** Recipe saved to Tandoor has no ingredients even though parsing succeeded.

**Root cause:** Tandoor requires at least one Step for ingredients to be associated. When `recipe.steps` is null/empty:
```typescript
// Old code — creates stepCount=1 but no actual step:
const stepCount = recipe.steps?.length || 1;
(recipe.steps || []).map(...) // returns [] → all ingredients lost
```

**Fix in `tandoor.ts` `buildTandoorRecipeDTO()`:** When `recipe.steps` is null or empty, create a placeholder:
```typescript
// Placeholder text is Italian for "See the full recipe at the link in bio."
const steps = recipe.steps?.length ? recipe.steps : ['Vedi la ricetta completa al link in bio.'];
```
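
A slightly fuller sketch shows why the placeholder matters. The types and `buildStepDrafts` are hypothetical, and attaching every ingredient to the first step is an assumption made for illustration; the real DTO built by `buildTandoorRecipeDTO()` may distribute them differently.

```typescript
interface ParsedRecipe {
  ingredients: string[];
  steps?: string[] | null;
}

interface StepDraft {
  instruction: string;
  ingredients: string[];
}

const PLACEHOLDER_STEP = 'Vedi la ricetta completa al link in bio.';

// Hypothetical builder: guarantees at least one step so the ingredients
// always have a step to hang on. Assumption: all ingredients go on step 1.
function buildStepDrafts(recipe: ParsedRecipe): StepDraft[] {
  const instructions = recipe.steps?.length ? recipe.steps : [PLACEHOLDER_STEP];
  return instructions.map((instruction, index) => ({
    instruction,
    ingredients: index === 0 ? recipe.ingredients : []
  }));
}
```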

---

## SvelteKit SSE: Phase Updates Never Reaching UI

**Symptom:** Processing animation showed "Prepping" throughout, then jumped straight to done.

**Three root causes found:**

1. **`updateQueueItem` never set `currentPhase`:** The update spread `...items[idx]` but never applied `update.phase`, so `currentPhase` kept its initial value. Fix:
   ```typescript
   currentPhase: update.phase ?? prev.currentPhase
   ```

2. **Progress events silently discarded:** SSE `type: 'progress'` messages were received, but the `progressEvents` array was never updated, so live messages (e.g. "Parsing with LLM…") were dropped. Fix: append `data.event` to `progressEvents`.

3. **Initial SSE snapshot missing `phase`:** The initial broadcast of queued items omitted `phase: item.currentPhase`. Items already in-progress on page load showed the wrong phase. Fix: include `phase` in the initial snapshot.
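
The first two fixes can be combined in one immutable update, sketched below. The types and `applyQueueUpdate` are hypothetical simplifications of the client store; the lowercase phase names are an assumption based on the Extraction → Parsing → Uploading pipeline.

```typescript
type Phase = 'extraction' | 'parsing' | 'uploading';

interface QueueItemView {
  id: string;
  currentPhase: Phase | null;
  progressEvents: string[];
}

interface QueueUpdate {
  id: string;
  phase?: Phase;
  event?: string;
}

// Hypothetical reducer: applies the phase when the update carries one and
// appends the progress event instead of dropping it.
function applyQueueUpdate(items: QueueItemView[], update: QueueUpdate): QueueItemView[] {
  return items.map((prev) =>
    prev.id !== update.id
      ? prev
      : {
          ...prev,
          currentPhase: update.phase ?? prev.currentPhase,
          progressEvents: update.event
            ? [...prev.progressEvents, update.event]
            : prev.progressEvents
        }
  );
}
```

Returning a new array and a new item object keeps the update compatible with reactive state that relies on reassignment to trigger re-rendering.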

---

## Gitea CI: Common Failure Modes

**Chromium not available in Alpine Docker:**
`vite.config.ts` defines two vitest projects: `client` (browser, needs Chromium) and `server` (Node.js). Alpine CI has no Chromium. Always specify:
```bash
npm run test:unit -- --run --project=server
```

**`$env/dynamic/private` throws in Docker build (no `.env`):**
Any code reading SvelteKit env vars at module import time will throw during Docker `RUN npm test` because there's no `.env` file in the build. Fix: mock the module in affected tests:
```typescript
vi.mock('$env/dynamic/private', () => ({
  env: { OPENAI_BASE_URL: 'http://localhost:11434', OPENAI_MODEL: 'test-model' }
}));
```

**Registry secrets must be set manually in Gitea:**
`REGISTRY_USERNAME` and `REGISTRY_TOKEN` must be created in repo Settings → Actions → Secrets. They are not automatically available.

---

## TypeScript Quirk: Async Callback Closure Narrowing

```typescript
let interceptedCaption: string | null = null;
// Assigned only inside an async callback, which control-flow analysis does not follow.
page.on('response', async () => { interceptedCaption = 'value'; });
// In the outer scope TypeScript still narrows `interceptedCaption` to `null`
// (and to `never` inside a truthiness check), because it sees no other assignment there.
const capturedCaption = interceptedCaption as string | null; // explicit cast restores the declared type
```
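
One way to avoid the cast is to store the value on an object property instead of a bare `let`. The sketch below is self-contained and uses a hypothetical `onResponse` stand-in for `page.on('response', ...)`; property reads are not narrowed by the object initializer, so the declared type survives.

```typescript
// Holder object: `captured.caption` keeps its declared type `string | null`
// in the outer scope, so no cast is needed after the callback has run.
const captured: { caption: string | null } = { caption: null };

// Hypothetical stand-in for an event registration such as page.on('response').
function onResponse(handler: (body: string) => void): void {
  handler('full caption text');
}

onResponse((body) => {
  captured.caption = body;
});

// Ordinary narrowing works here: `string` in the truthy branch.
const captionLength = captured.caption ? captured.caption.length : 0;
```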

---

## Production Architecture: yt-dlp + Playwright Split

**Current split (as of commit `c9f5300`+):**
- **Playwright** → caption extraction (DOM, always full text)
- **yt-dlp** → thumbnail URL only (fast, no browser overhead)
- Both run **in parallel** in `QueueProcessor.ts`

**Why not yt-dlp for caption?** Both mobile API and GraphQL responses can be truncated even with a valid session. DOM is the only reliable source.

**Why not Playwright for thumbnail?** yt-dlp extracts thumbnail cleanly and quickly. Playwright-based thumbnail extraction was fragile.
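
The parallel split can be sketched with `Promise.allSettled`. This is not the code in `QueueProcessor.ts`: `extractInParallel` is a hypothetical function, and treating the thumbnail as optional while the caption stays mandatory is an assumption made for this example.

```typescript
interface ExtractionOutcome {
  caption: string;
  thumbnailUrl: string | null;
}

// Hypothetical orchestration: both extractors start immediately and run
// concurrently. A caption failure is rethrown; a thumbnail failure is
// downgraded to `null`.
async function extractInParallel(
  extractCaption: () => Promise<string>,
  extractThumbnail: () => Promise<string>
): Promise<ExtractionOutcome> {
  const [caption, thumbnail] = await Promise.allSettled([
    extractCaption(),
    extractThumbnail()
  ]);
  if (caption.status === 'rejected') throw caption.reason;
  return {
    caption: caption.value,
    thumbnailUrl: thumbnail.status === 'fulfilled' ? thumbnail.value : null
  };
}
```

`Promise.allSettled` is chosen over `Promise.all` here so that a yt-dlp failure cannot discard a caption that Playwright already extracted successfully.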

---

## Infrastructure Reference

| Resource | Value |
|---|---|
| App URL | `https://insta-recipe.sal.giize.com` |
| SSH | `ssh -o IdentitiesOnly=yes -i ~/.ssh/id_rsa_ideapad moze@192.168.1.50` |
| Compose file | `/home/moze/Server/stacks/insta-recipe/compose.yaml` |
| Env file | `/home/moze/Server/stacks/insta-recipe/.env` |
| Docker registry | `git.sal.giize.com/mozempk/insta-recipe:latest` |
| Build | `docker buildx build --platform linux/amd64 -t git.sal.giize.com/mozempk/insta-recipe:latest --push .` |
| Deploy | `docker compose pull && docker compose up -d` |
| LLM (internal) | `http://chat_llama-cpp:8080/v1` |
| LLM (external) | `http://192.168.1.50:8080` |
| Current LLM model | `gemma4-e4b-q6k` (via `LLM_MODEL` in `.env`) |
| Auth file (host) | `/home/moze/Server/stacks/insta-recipe/data/secrets/auth.json` |
| Auth file (container) | `/app/secrets/auth.json` |