# Architecture Documentation

**Last Updated:** 2026-02-15T00:00:00.000Z
**JIRA:** RECIPE-0001

---

## Overview

**Project Name:** InstaRecipe
**Type:** Progressive Web Application (PWA)
**Primary Language:** TypeScript
**Framework:** SvelteKit 2.x with Svelte 5
**Runtime:** Node.js 22+

### Purpose

A modern web application that extracts recipes from Instagram posts and saves them to Tandoor Recipe Manager using an async queue-based processing system.

---

## Project Structure

```
insta-recipe/
├── src/                        # Source code
│   ├── lib/                    # Library code
│   │   ├── client/             # Client-side modules
│   │   │   ├── PushNotificationManager.ts
│   │   │   ├── PWAInstallManager.ts
│   │   │   └── ServiceWorkerMessageHandler.ts
│   │   ├── server/             # Server-side modules
│   │   │   ├── api/            # API utilities (errors, handlers)
│   │   │   ├── browser.ts      # Playwright browser management
│   │   │   ├── extraction.ts   # Instagram content extraction
│   │   │   ├── llm.ts          # LLM integration (OpenAI)
│   │   │   ├── notifications/  # Push notification service
│   │   │   ├── parser.ts       # Recipe parsing with LLM
│   │   │   ├── prompts/        # LLM prompts
│   │   │   ├── queue/          # Queue management system
│   │   │   │   ├── QueueManager.ts
│   │   │   │   ├── QueueProcessor.ts
│   │   │   │   ├── config.ts
│   │   │   │   └── types.ts
│   │   │   ├── scheduler.ts    # Background task scheduler
│   │   │   ├── tandoor.ts      # Tandoor API integration
│   │   │   ├── tandoor-config.ts
│   │   │   └── validation/     # Input validation
│   │   ├── assets/             # Static assets
│   │   └── index.ts
│   ├── routes/                 # SvelteKit routes
│   │   ├── api/                # API endpoints
│   │   │   ├── extract/        # Legacy extraction endpoint (deprecated)
│   │   │   ├── health/         # Health check
│   │   │   ├── llm-health/     # LLM health check
│   │   │   ├── notifications/  # Push notification endpoints
│   │   │   ├── queue/          # Queue management API
│   │   │   │   ├── [id]/       # Individual queue item operations
│   │   │   │   └── stream/     # SSE for real-time updates
│   │   │   ├── tandoor/        # Tandoor integration
│   │   │   ├── tandoor-config/
│   │   │   └── thumbnail/
│   │   ├── components/         # Shared components
│   │   ├── share/              # Share target page
│   │   │   └── components/     # Share-specific components
│   │   ├── +layout.svelte      # Root layout
│   │   └── +page.svelte        # Queue dashboard (home)
│   ├── tests/                  # Test files
│   ├── app.d.ts                # Type definitions
│   ├── app.html                # HTML template
│   ├── app.server.ts           # Server initialization
│   ├── hooks.server.ts         # SvelteKit server hooks
│   └── service-worker.ts       # Service worker for PWA
├── build/                      # Build output
├── docs/                       # Documentation
│   ├── plans/                  # Implementation plans
│   └── outcomes/               # Implementation outcomes
├── scripts/                    # Utility scripts
├── static/                     # Static files
├── .ssl/                       # SSL certificates (local dev)
├── docker-compose.yml          # Docker configuration
├── Dockerfile                  # Container image
├── package.json                # Dependencies
├── svelte.config.js            # SvelteKit configuration
├── tsconfig.json               # TypeScript configuration
└── vite.config.ts              # Vite configuration
```

---

## Key Directories

### `/src/lib/server/`

Server-side business logic following Hexagonal Architecture principles. Contains domain logic, adapters for external systems (Instagram, Tandoor, LLM), and port definitions.

### `/src/lib/client/`

Client-side utilities for PWA features (push notifications, install prompts, service worker messaging).

### `/src/routes/api/`

RESTful API endpoints implemented as SvelteKit server routes. Each directory contains `+server.ts` files exporting HTTP verb handlers.

### `/src/routes/share/`

Share target page allowing users to share Instagram URLs directly from their browser or mobile apps.

### `/src/lib/server/queue/`

Queue management system with in-memory storage, processor workers, and type definitions.

### `/docs/`

Comprehensive documentation including plans, outcomes, API specs, and migration guides.

---

## Design Patterns

### Singleton Pattern

Used for shared service instances:

- `QueueManager` (`queueManager` exported instance)
- `QueueProcessor` (`queueProcessor` exported instance)
- `PushNotificationService` (`pushNotificationService` exported instance)
- `ServiceWorkerMessageHandler` (`serviceWorkerMessageHandler` exported instance)

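
The exported-instance form of this pattern can be sketched as follows. This is a minimal illustration, not the project's code: the class body (`items`, `enqueue`, `size`) is a hypothetical stand-in, and only the `queueManager` export name comes from the list above.

```typescript
// Hypothetical, simplified stand-in for the real QueueManager class.
class QueueManager {
  private items: string[] = [];

  // Adds a URL and returns the new queue length.
  enqueue(url: string): number {
    this.items.push(url);
    return this.items.length;
  }

  get size(): number {
    return this.items.length;
  }
}

// One instance is created when the module is first evaluated. Every importer
// of this module receives the same object, because a module is evaluated
// once and then cached by the runtime.
export const queueManager = new QueueManager();
```

Because the instance lives in module scope, its state is per server process; it is not shared across multiple processes or containers.
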
### Factory Pattern

Used for creating configured instances:

- `createLLM()` - Creates OpenAI client with environment configuration
- `createBrowserContext()` - Creates Playwright browser context with options
- `initializeBrowser()` - Initializes Chromium browser instance

### Observer Pattern

Implemented in QueueManager for real-time updates:

- Subscribers receive notifications on queue item changes
- Server-Sent Events (SSE) stream queue updates to clients
- Push notifications notify users of completion events

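
The subscribe/notify mechanism described above might look roughly like this. It is a sketch under assumptions: the `ObservableQueue` class, the `QueueUpdate` shape, and the status values are hypothetical and stand in for whatever `QueueManager` and `types.ts` actually define.

```typescript
// Hypothetical update shape; the real definitions live in queue/types.ts.
type QueueUpdate = {
  id: string;
  status: "pending" | "processing" | "completed" | "failed";
};
type Listener = (update: QueueUpdate) => void;

class ObservableQueue {
  private listeners = new Set<Listener>();

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Called whenever a queue item changes; fans the update out to all listeners.
  notify(update: QueueUpdate): void {
    this.listeners.forEach((listener) => listener(update));
  }
}

const queue = new ObservableQueue();
const received: QueueUpdate[] = [];
const unsubscribe = queue.subscribe((update) => {
  received.push(update);
});

queue.notify({ id: "item-1", status: "processing" }); // delivered
unsubscribe();
queue.notify({ id: "item-1", status: "completed" }); // no longer delivered
```

Returning the unsubscribe function from `subscribe` makes cleanup straightforward when an SSE client disconnects.
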
### Adapter Pattern (Hexagonal Architecture)

External systems accessed via adapters:

- **Instagram Adapter**: `extraction.ts` - Extracts content via Playwright
- **LLM Adapter**: `llm.ts`, `parser.ts` - Recipe parsing via OpenAI
- **Tandoor Adapter**: `tandoor.ts` - Recipe management system integration
- **Browser Adapter**: `browser.ts` - Playwright browser automation

### Strategy Pattern

Multiple extraction strategies with fallback:

1. Embedded JSON extraction
2. DOM selector extraction
3. GraphQL API extraction
4. Legacy extraction method

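
The ordered fallback can be illustrated with a small sketch. It assumes each strategy is a function that returns text or `null`; the strategy names, the regular expressions, and the choice to work synchronously on an HTML string are all hypothetical simplifications (the real strategies presumably run asynchronously against a Playwright page). Only two of the four strategies are shown.

```typescript
// A strategy receives page HTML and returns caption text, or null when it
// finds nothing. Names and bodies below are hypothetical.
type ExtractionStrategy = {
  name: string;
  extract: (html: string) => string | null;
};

const strategies: ExtractionStrategy[] = [
  {
    name: "embedded-json",
    extract: (html) => {
      const match = html.match(/"caption":"([^"]*)"/);
      return match ? match[1] : null;
    }
  },
  {
    name: "dom-selector",
    extract: (html) => {
      const match = html.match(/<h1[^>]*>([^<]+)<\/h1>/);
      return match ? match[1] : null;
    }
  }
];

// Tries each strategy in order and returns the first non-empty result,
// together with the name of the strategy that produced it.
function extractWithFallback(html: string): { strategy: string; text: string } | null {
  for (const { name, extract } of strategies) {
    const text = extract(html);
    if (text) {
      return { strategy: name, text };
    }
  }
  return null;
}
```
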

---

## Key Components

### Queue Management System

**Location**: `src/lib/server/queue/`

Three-phase processing pipeline:

1. **Extraction Phase**: Extract text and thumbnail from Instagram
2. **Parsing Phase**: Parse recipe using LLM
3. **Uploading Phase**: Upload to Tandoor (if enabled)

**Components**:

- `QueueManager`: In-memory FIFO queue with CRUD operations
- `QueueProcessor`: Worker that processes items with configurable concurrency
- `types.ts`: Comprehensive type definitions for queue items and updates

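
As a rough sketch of how the three phases might be sequenced for a single item: the `processItem` function, the `PhaseHandlers` interface, and the result shape are hypothetical, and the handlers are injected so the example runs without Playwright, an LLM, or Tandoor.

```typescript
type Phase = "extracting" | "parsing" | "uploading";

interface PipelineResult {
  status: "completed" | "failed";
  completedPhases: Phase[];
  error?: string;
}

// Hypothetical phase functions; the real ones call Playwright, the LLM and Tandoor.
interface PhaseHandlers {
  extract: (url: string) => Promise<string>;
  parse: (text: string) => Promise<{ title: string }>;
  upload: (recipe: { title: string }) => Promise<void>;
}

async function processItem(
  url: string,
  handlers: PhaseHandlers,
  tandoorEnabled: boolean
): Promise<PipelineResult> {
  const completedPhases: Phase[] = [];
  try {
    const text = await handlers.extract(url);
    completedPhases.push("extracting");

    const recipe = await handlers.parse(text);
    completedPhases.push("parsing");

    // The upload phase only runs when Tandoor integration is enabled.
    if (tandoorEnabled) {
      await handlers.upload(recipe);
      completedPhases.push("uploading");
    }
    return { status: "completed", completedPhases };
  } catch (err) {
    // A failure in any phase stops the pipeline and records what finished.
    const error = err instanceof Error ? err.message : String(err);
    return { status: "failed", completedPhases, error };
  }
}
```

Recording which phases completed makes it possible to report progress per phase and to see where a failed item stopped.
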
### API Layer

**Location**: `src/routes/api/`

RESTful endpoints for:

- Queue operations (`POST /api/queue`, `GET /api/queue`, `GET /api/queue/[id]`)
- Real-time updates (`GET /api/queue/stream` - SSE)
- Push notifications (`POST /api/notifications/subscribe`)
- Health checks (`GET /api/health`, `GET /api/llm-health`)

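
For orientation, a SvelteKit `+server.ts` file exports one function per HTTP verb and returns a standard `Response`. The sketch below shows that shape for a health check; the response payload is a hypothetical example, and the project's actual handlers may use SvelteKit's helpers and types instead of building the `Response` by hand.

```typescript
// Sketch of a file such as src/routes/api/health/+server.ts.
// SvelteKit calls the exported function whose name matches the HTTP verb.
// The payload below is a hypothetical example.
export function GET(): Response {
  const body = JSON.stringify({ status: "ok" });
  return new Response(body, {
    status: 200,
    headers: { "content-type": "application/json" }
  });
}
```
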
### Client-Side Services

**Location**: `src/lib/client/`

- **PushNotificationManager**: Manages Web Push API subscriptions
- **PWAInstallManager**: Handles PWA install prompts
- **ServiceWorkerMessageHandler**: Processes service worker messages

### Instagram Extraction

**Location**: `src/lib/server/extraction.ts`

Multi-method extraction with intelligent fallback:

- Progress callbacks for real-time feedback
- Retry logic with configurable attempts
- Thumbnail extraction and validation

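
Retry logic with a configurable number of attempts could be sketched like this. The `withRetry` helper and its parameters are hypothetical; the source does not say how `extraction.ts` implements retries or whether it waits between attempts.

```typescript
// Runs `operation` up to `maxAttempts` times, waiting `delayMs` between tries.
// Rethrows the last error when every attempt fails.
async function withRetry<T>(
  operation: (attempt: number) => Promise<T>,
  maxAttempts: number,
  delayMs: number
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      // Only wait when another attempt will follow.
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

The attempt number is passed to `operation` so that a caller can report it through a progress callback.
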
### LLM Integration

**Location**: `src/lib/server/parser.ts`, `src/lib/server/llm.ts`

- Recipe detection endpoint
- Structured extraction using OpenAI with Zod schemas
- Configurable model and temperature settings

---

## Dependencies

### Production Dependencies

- **@types/uuid** (^10.0.0) - UUID type definitions
- **date-fns** (^4.1.0) - Date utility library
- **openai** (^4.20.0) - OpenAI API client
- **playwright** (^1.56.1) - Browser automation
- **uuid** (^13.0.0) - Unique ID generation
- **zod** (^3.23.0) - Schema validation

### Development Dependencies

- **@sveltejs/kit** (^2.48.5) - SvelteKit framework
- **@sveltejs/adapter-node** (^5.4.0) - Node.js adapter
- **svelte** (^5.43.8) - Svelte 5 framework
- **typescript** (^5.9.3) - TypeScript compiler
- **vite** (^6.0.0) - Build tool
- **vitest** (^4.0.10) - Testing framework
- **@vitest/browser-playwright** (^4.0.10) - Browser testing
- **tailwindcss** (^4.1.17) - CSS framework
- **eslint** (^9.39.1) - Linting
- **prettier** (^3.6.2) - Code formatting
- **typescript-eslint** (^8.47.0) - TypeScript ESLint

---

## Module Organization

### SvelteKit Path Aliases

- `$lib` → `src/lib/`
- `$lib/*` → `src/lib/*`
- `$app/*` → SvelteKit app imports
- `$env/dynamic/private` → Environment variables (server-side)

### Directory Structure Conventions

- **Server-only code**: `src/lib/server/` (not bundled to client)
- **Client-only code**: `src/lib/client/` (not executed on server)
- **Shared code**: `src/lib/` (available to both)
- **Routes**: `src/routes/` (file-based routing)
- **Tests**: Colocated with source files (`*.spec.ts`, `*.test.ts`)

---

## Data Flow

### Recipe Extraction Flow

```
User submits URL
    ↓
POST /api/queue
    ↓
QueueManager.enqueue()
    ↓
QueueProcessor picks up item
    ↓
Phase 1: extractTextAndThumbnail()
    ↓
Phase 2: extractRecipe() (LLM)
    ↓
Phase 3: uploadRecipeWithIngredientsDTO() (Tandoor)
    ↓
Push notification sent
    ↓
SSE updates notify client
```

### Real-time Updates Flow

```
Client connects to GET /api/queue/stream (SSE)
    ↓
QueueManager.subscribe(callback)
    ↓
Queue item changes trigger callback
    ↓
SSE sends event to client
    ↓
Client updates UI reactively
```

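
The "SSE sends event to client" step comes down to writing text frames in the Server-Sent Events wire format. A minimal sketch of that formatting follows; the `formatSseEvent` helper, the `queue-update` event name, and the payload shape are assumptions for illustration, not names taken from the codebase.

```typescript
// Formats one Server-Sent Events frame. In the SSE wire format each field is
// a "name: value" line, and a blank line terminates the event.
// JSON.stringify never emits raw newlines, so the payload fits on one data line.
function formatSseEvent(event: string, data: unknown): string {
  const payload = JSON.stringify(data);
  return `event: ${event}\ndata: ${payload}\n\n`;
}

// Hypothetical update shape; the real type lives in queue/types.ts.
const frame = formatSseEvent("queue-update", { id: "item-1", status: "completed" });
```

On the browser side, a stream like this is typically consumed with the standard `EventSource` API, listening for the named event and parsing `event.data` as JSON.
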
### Push Notification Flow

```
Client requests permission
    ↓
POST /api/notifications/subscribe (with subscription)
    ↓
PushNotificationService stores subscription
    ↓
Queue item completes
    ↓
PushNotificationService.sendNotification()
    ↓
Service worker receives push event
    ↓
Notification displayed to user
```

---

## Build System

### Build Command

```bash
npm run build
```

Generates production-ready build in `build/` directory using:

- Vite for bundling
- `@sveltejs/adapter-node` for Node.js deployment
- TypeScript compilation
- SvelteKit prerendering and optimization

### Test Command

```bash
npm test
```

Runs test suite using Vitest with two projects:

1. **Server tests**: Node environment for server-side code
2. **Client tests**: Playwright browser for Svelte components

### Development Server

```bash
npm run dev
```

Starts Vite dev server with:

- HTTPS enabled (certificates in `.ssl/`)
- Hot module replacement
- TypeScript checking
- File watching

### Linting & Formatting

```bash
npm run lint    # ESLint + Prettier check
npm run format  # Prettier write
```

---

## Deployment

### Docker Deployment

Dockerfile includes:

- Node.js 22 Alpine base image
- Playwright Chromium installation
- Production build
- Port 3000 exposure

Run with:

```bash
docker-compose up
```

### Environment Variables

Configuration variables (those marked optional or given a default may be omitted):

- `OPENAI_API_KEY` - LLM API access
- `TANDOOR_URL` - Tandoor instance URL (optional)
- `TANDOOR_TOKEN` - Tandoor API token (optional)
- `QUEUE_CONCURRENCY` - Concurrent processing limit (default: 2)
- `QUEUE_MAX_RETRIES` - Failed item retry limit (default: 3)

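
Reading the two queue settings with their documented defaults might look like the sketch below. The helper names (`parseIntAtLeast`, `loadQueueConfig`), the `QueueConfig` shape, and the minimum values are assumptions; the environment is passed in as a plain object so the example does not depend on SvelteKit's `$env` modules.

```typescript
interface QueueConfig {
  concurrency: number;
  maxRetries: number;
}

// Parses an integer from an environment value. Falls back to the default when
// the variable is unset, cannot be parsed, or is below `min`.
function parseIntAtLeast(value: string | undefined, min: number, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed >= min ? parsed : fallback;
}

// `env` is injected (for example `process.env`) so the function stays testable.
function loadQueueConfig(env: Record<string, string | undefined>): QueueConfig {
  return {
    // At least one worker is needed for the queue to make progress.
    concurrency: parseIntAtLeast(env.QUEUE_CONCURRENCY, 1, 2),
    // Zero retries is treated as a valid choice in this sketch.
    maxRetries: parseIntAtLeast(env.QUEUE_MAX_RETRIES, 0, 3)
  };
}
```
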

---

## Testing Architecture

### Test Categories

1. **Unit Tests**: Individual function testing
2. **Integration Tests**: Multi-component workflows
3. **API Tests**: Endpoint behavior validation
4. **Browser Tests**: Svelte component rendering

### Test Coverage

138 tests covering:

- Queue management operations
- Instagram URL validation
- SSE streaming
- API endpoints
- Scheduler functionality
- Notification service

### Test Configuration

- **Server tests**: Node environment with mocked dependencies
- **Client tests**: Playwright Chromium browser with Svelte testing library

---

## Security Considerations

### SSL/TLS

- Development uses local SSL certificates signed by an external Caddy CA
- Certificates are stored in `.ssl/` (git-ignored)
- HTTPS is required for PWA features (Service Worker, Push API)

### Authentication

- Basic auth for scheduled tasks (username/password from environment)
- Tandoor integration uses bearer token authentication

### Input Validation

- Instagram URL validation with regex patterns
- Zod schema validation for API payloads
- Error handling with custom error classes

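
To make the URL validation step concrete, here is one way it could be approached. This is a sketch under assumptions: the accepted path types (`p`, `reel`, `tv`), the shortcode character set, the host check, and the `parseInstagramUrl` name are all hypothetical, and the project's real patterns in `validation/` may differ.

```typescript
// Hypothetical pattern: matches /p/<shortcode>, /reel/<shortcode> or
// /tv/<shortcode>, with an optional trailing slash.
const SHORTCODE_PATH = /^\/(p|reel|tv)\/([A-Za-z0-9_-]+)\/?$/;

function parseInstagramUrl(input: string): { kind: string; shortcode: string } | null {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable absolute URL
  }
  // Accept instagram.com with or without the "www." prefix, over HTTPS only.
  const host = url.hostname.replace(/^www\./, "");
  if (url.protocol !== "https:" || host !== "instagram.com") {
    return null;
  }
  // Query strings are ignored because only the pathname is matched.
  const match = SHORTCODE_PATH.exec(url.pathname);
  return match ? { kind: match[1], shortcode: match[2] } : null;
}
```
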

---

**Document Version:** 1.0
**Generated by:** Initializer Agent
**Next Review:** As needed for architectural changes