insta-recipe/docs/plans/RefactorFrontendAndFixLLMExtraction.md
Giancarmine Salucci da58263aba feat: refactor frontend and fix LLM extraction
- Fix critical await bug in extract-stream endpoint
- Add comprehensive logging to LLM and parser modules
- Implement fallback to standard completion for incompatible models
- Create enhanced v2.0 prompts with social media handling and few-shot examples
- Add LLM health check endpoint
- Decompose share page into 6 focused Svelte 5 snippets

Resolves LM Studio integration issues and improves code maintainability
2025-12-21 03:49:33 +01:00

# Execution Plan: Refactor Frontend and Fix LLM Extraction
**Date:** 2025-12-21
**Outcome Name:** RefactorFrontendAndFixLLMExtraction
**Status:** Planned
---
## Executive Summary
This plan addresses a multi-faceted issue affecting the InstaRecipe application:
1. **Frontend Architecture:** The `/share/+page.svelte` component (286 lines) has grown too large and needs to be decomposed into smaller, reusable components using Svelte 5 snippets
2. **Backend Extraction Bug:** LM Studio is not being called during recipe parsing, resulting in empty extraction results
3. **Prompt Optimization:** Consolidate and improve all parsing prompts from git history into a single, comprehensive system prompt
The extraction system successfully retrieves text from Instagram (as evidenced by `debug_page.txt` showing DOM selector extraction working), but the LLM parsing step fails silently, leaving users without recipe data.
---
## Problem Analysis
### 1. Frontend Issues
**Current State:**
- Single monolithic component at [src/routes/share/+page.svelte](src/routes/share/+page.svelte)
- 286 lines handling: URL parsing, extraction, SSE stream processing, Tandoor integration, logs rendering, and recipe display
- Violates single responsibility principle
- Difficult to test and maintain
- No component reusability
**Impact:**
- Hard to debug UI issues
- Cannot reuse recipe card or log display elsewhere
- Testing requires loading entire page component
### 2. Backend LLM Integration Issues
**Current State Analysis:**
- Environment variables correctly configured:
  - `OPENAI_BASE_URL=http://192.168.1.10:1234/v1`
  - `OPENAI_API_KEY=ollama`
  - `LLM_MODEL=google/gemma-3-4b`
- Extraction working: `debug_page.txt` shows successful DOM selector extraction
- LLM client initialization in [src/lib/server/llm.ts](src/lib/server/llm.ts) appears correct
- Recipe parsing in [src/lib/server/parser.ts](src/lib/server/parser.ts) uses OpenAI SDK
**Suspected Issues:**
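1. **SSE Endpoint Bug:** [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts#L46) calls `extractRecipe()` without `await`, so a pending `Promise<Recipe>` is sent instead of the resolved `Recipe`. If the payload is serialized with `JSON.stringify`, a Promise becomes `{}`, which is consistent with the empty extraction results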
2. **Missing Error Logging:** No console output from LLM calls makes debugging difficult
3. **Network Accessibility:** LM Studio may not be reachable from container (if running in Docker)
4. **Model Compatibility:** `google/gemma-3-4b` may not support structured output via `beta.chat.completions.parse()`
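
The first suspected issue can be reproduced in isolation. The sketch below assumes the SSE endpoint serializes its payload with `JSON.stringify`, which this plan does not confirm; `serializeEvent` and `extractRecipeStub` are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for the payload serialization in the SSE endpoint.
// Assumption: events are serialized with JSON.stringify.
function serializeEvent(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Stub that mimics the async extractRecipe() call.
function extractRecipeStub(): Promise<{ name: string }> {
  return Promise.resolve({ name: 'Farfalle al Salmone' });
}

// Without await: the Promise object itself is serialized.
// A Promise has no enumerable own properties, so it becomes {}.
const withoutAwait: string = serializeEvent({ recipe: extractRecipeStub() });

// With await: the resolved recipe is serialized.
async function withAwait(): Promise<string> {
  return serializeEvent({ recipe: await extractRecipeStub() });
}
```

If the real endpoint behaves like this sketch, the frontend would receive an empty `recipe` object even when the LLM call itself succeeds.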
### 3. Prompt Evolution
**Git History Analysis:**
Only one prompt version found in commit `8fc7c44`:
- Detection prompt: Binary yes/no classifier
- Extraction prompt: Comprehensive system with requirements, conversion table, output format
**Current Prompt Strengths:**
- ✅ Clear requirements enumeration
- ✅ SI unit conversion table
- ✅ Italian translation requirement
- ✅ Structured output format
- ✅ Literal extraction guidance
**Current Prompt Gaps:**
- ❌ No handling of social media noise (hashtags, mentions, emojis)
- ❌ No guidance for partial recipes
- ❌ No fallback strategy for missing fields
- ❌ No examples (few-shot learning)
- ❌ No handling of ingredient variations (e.g., "1-2 cups")
---
## User Stories
### Story 1: Decompose Share Page into Svelte 5 Snippets
**As a** developer
**I want** the share page split into smaller, focused components using Svelte 5 snippets
**So that** the code is maintainable, testable, and reusable
**Acceptance Criteria:**
- [x] New components created using Svelte 5 snippet syntax
- [x] Each component has a single, clear responsibility
- [x] Components are properly typed with TypeScript
- [x] Props are declared and typed using the `$props()` rune
- [x] State is managed using `$state()` and `$derived()` runes
- [x] No functionality is lost during refactoring
- [x] Code follows hexagonal architecture principles (presentation layer only)
**Implementation Details:**
#### Component Breakdown
1. **URLInput.svelte** (Snippet)
   - Displays detected URL
   - Shows extraction button
   - Props: `url: string`, `status: 'idle' | 'extracting' | 'done' | 'error'`, `onExtract: () => void`
2. **ExtractionProgress.svelte** (Snippet)
   - Shows real-time extraction progress
   - Renders method attempts and status updates
   - Props: `status: string`, `currentMethod: string`
3. **RecipeCard.svelte** (Snippet)
   - Displays parsed recipe with name, ingredients, steps
   - Shows servings, description
   - Handles Tandoor integration UI
   - Props: `recipe: Recipe`, `tandoorEnabled: boolean`, `onImport: () => void`, `onRetry: () => void`
4. **LogViewer.svelte** (Snippet)
   - Terminal-style log display
   - Color-coded messages
   - Auto-scroll to bottom
   - Props: `logs: string[]`, `currentMethod: string`, `status: string`
5. **ExtractedTextViewer.svelte** (Snippet)
   - Collapsible details element
   - Shows raw extracted text
   - Props: `bodyText: string`
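
The props listed above can be written down as TypeScript types. This is a minimal sketch: the `Recipe` shape is an assumption taken from the JSON schema used in the prompts later in this plan, and `isExtractDisabled` is a hypothetical helper, not existing code.

```typescript
// Assumed shape of the app's Recipe type (mirrors the prompt's JSON schema).
interface Recipe {
  name: string;
  servings: number | null;
  description: string | null;
  ingredients: { item: string; amount: string; unit: string }[];
  steps: string[];
}

type ExtractionStatus = 'idle' | 'extracting' | 'done' | 'error';

interface URLInputProps {
  url: string;
  status: ExtractionStatus;
  onExtract: () => void;
}

interface ExtractionProgressProps {
  status: string;
  currentMethod: string;
}

interface RecipeCardProps {
  recipe: Recipe;
  tandoorEnabled: boolean;
  onImport: () => void;
  onRetry: () => void;
}

interface LogViewerProps {
  logs: string[];
  currentMethod: string;
  status: string;
}

interface ExtractedTextViewerProps {
  bodyText: string;
}

// Hypothetical helper: the extraction button is disabled while there is
// no URL or an extraction is already running.
function isExtractDisabled(props: Pick<URLInputProps, 'url' | 'status'>): boolean {
  return props.url === '' || props.status === 'extracting';
}
```

Using a union type for `status` lets the compiler reject misspelled states, which plain `string` props would not catch.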
#### Refactored Share Page Structure
```svelte
<script lang="ts">
  // Snippet type (only needed when a snippet is passed as a prop)
  import type { Snippet } from 'svelte';

  // Main page logic (URL parsing, SSE handling, state management)
  // ...
</script>

<!-- Snippets are template syntax, so they are declared in markup, not inside <script> -->
{#snippet urlInput()}
  <!-- URL input UI -->
{/snippet}

{#snippet progressIndicator()}
  <!-- Progress UI -->
{/snippet}

{#snippet extractedTextViewer()}
  <!-- Raw extracted text UI -->
{/snippet}

{#snippet recipeDisplay()}
  <!-- Recipe card UI -->
{/snippet}

{#snippet logDisplay()}
  <!-- Log viewer UI -->
{/snippet}

<!-- Main template using @render -->
<div class="p-8 max-w-lg mx-auto space-y-4">
  <h1 class="text-2xl font-bold">InstaChef PWA</h1>
  {@render urlInput()}
  {@render progressIndicator()}
  {@render extractedTextViewer()}
  {@render recipeDisplay()}
  {@render logDisplay()}
</div>
```
**Technical Notes:**
- Use `{#snippet name(param1, param2)}...{/snippet}` syntax
- Snippets can reference parent component state
- Type snippets using `Snippet<[T1, T2]>` interface
- Snippets are scoped to their lexical context
- Use `{@render snippetName()}` to render
**Files Modified:**
- [src/routes/share/+page.svelte](src/routes/share/+page.svelte) - Refactored with snippets
---
### Story 2: Diagnose and Fix LLM Integration
**As a** user
**I want** recipe extraction to successfully parse recipes using LM Studio
**So that** I get structured recipe data from Instagram posts
**Acceptance Criteria:**
- [x] LM Studio receives API calls during extraction
- [x] Recipe parsing returns structured data
- [x] Error messages are logged and surfaced to frontend
- [x] Network connectivity validated
- [x] Model compatibility verified
- [x] SSE endpoint properly awaits async operations
- [x] Integration tests pass with mock LLM
**Implementation Details:**
#### Diagnostic Steps
1. **Add Comprehensive Logging**
   - Add console.log before/after each LLM API call
   - Log request payload and response
   - Log any exceptions with full stack trace
   - Add timing metrics
2. **Fix SSE Endpoint Await Bug**
   - File: [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts#L46)
   - Current: `const recipe = extractRecipe(extracted.bodyText);`
   - Fixed: `const recipe = await extractRecipe(extracted.bodyText);`
3. **Validate Network Connectivity**
   - Add health check endpoint to test LM Studio connection
   - Test from same network context as app (Docker vs host)
   - Verify firewall rules allow connection to port 1234
4. **Verify Model Compatibility**
   - Check if `google/gemma-3-4b` supports `beta.chat.completions.parse()`
   - Test with alternative models if needed
   - Add graceful degradation to standard completion API
5. **Add Fallback Error Handling**
   - Wrap LLM calls in try/catch with detailed error messages
   - Return partial results when possible
   - Surface errors to frontend via SSE error events
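
The logging and timing items in step 1 could be centralized in one wrapper instead of being repeated around every call. The sketch below is one possible shape; `withTiming` is a hypothetical helper and the log format is an assumption modeled on the `[LLM]` prefix used elsewhere in this plan.

```typescript
// Hypothetical helper: wraps an async LLM call with start/finish logs and a duration.
// The logger is injectable so tests can capture output instead of printing it.
async function withTiming<T>(
  label: string,
  fn: () => Promise<T>,
  log: (msg: string) => void = console.log
): Promise<T> {
  const start = Date.now();
  log(`[LLM] ${label} started`);
  try {
    const result = await fn();
    log(`[LLM] ${label} finished in ${Date.now() - start} ms`);
    return result;
  } catch (e) {
    // Normalize non-Error throwables so a stack or message is always available.
    const err = e instanceof Error ? e : new Error(String(e));
    log(`[LLM] ${label} failed after ${Date.now() - start} ms: ${err.stack ?? err.message}`);
    throw err;
  }
}
```

A call site might then read `await withTiming('detection', () => client.chat.completions.create({ ... }))`, keeping the parser functions free of repeated timing code.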
#### Code Changes
**File: [src/lib/server/parser.ts](src/lib/server/parser.ts)**
```typescript
export async function detectRecipe(text: string): Promise<boolean> {
  try {
    const { client, model } = createLLM();
    console.log('[LLM] Starting recipe detection...');
    console.log('[LLM] Model:', model);
    console.log('[LLM] Text length:', text.length);
    const detectionResponse = await client.chat.completions.create({
      model,
      messages: [/* ... */],
      max_tokens: 10
    });
    console.log('[LLM] Detection response:', detectionResponse.choices[0].message.content);
    const detectionResult = detectionResponse.choices[0].message.content?.toLowerCase() ?? '';
    return detectionResult.includes('yes');
  } catch (e) {
    console.error('[LLM] Recipe detection error:', e);
    console.error('[LLM] Stack trace:', (e as Error).stack);
    throw new Error(`Failed to detect recipe: ${(e as Error).message}`);
  }
}

export async function parseRecipe(text: string): Promise<Recipe> {
  try {
    const { client, model } = createLLM();
    console.log('[LLM] Starting recipe parsing...');
    console.log('[LLM] Model:', model);
    const completion = await client.beta.chat.completions.parse({
      model,
      messages: [/* ... */],
      response_format: zodResponseFormat(RecipeSchema, 'recipe')
    });
    console.log('[LLM] Parse response:', completion.choices[0].message.parsed);
    const recipe = completion.choices[0].message.parsed;
    if (!recipe || !recipe.name) {
      throw new Error('Failed to extract recipe - missing name');
    }
    return recipe;
  } catch (e) {
    console.error('[LLM] Recipe parsing error:', e);
    console.error('[LLM] Stack trace:', (e as Error).stack);
    // If structured output fails, try standard completion
    if ((e as any).message?.includes('response_format')) {
      console.warn('[LLM] Structured output not supported, falling back to standard completion');
      return await parseRecipeWithStandardCompletion(text);
    }
    throw new Error(`Failed to parse recipe: ${(e as Error).message}`);
  }
}

/**
 * Fallback parser using standard completion (no structured output)
 */
async function parseRecipeWithStandardCompletion(text: string): Promise<Recipe> {
  const { client, model } = createLLM();
  const completion = await client.chat.completions.create({
    model,
    messages: [
      {
        role: 'system',
        content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
{
"name": "recipe name in Italian",
"servings": number or null,
"description": "description in Italian or null",
"ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
"steps": ["1. First step", "2. Second step", ...]
}`
      },
      {
        role: 'user',
        content: `Extract the recipe from this text:\n\n${text}`
      }
    ],
    max_tokens: 2000,
    temperature: 0.3
  });
  const jsonResponse = completion.choices[0].message.content;
  if (!jsonResponse) {
    throw new Error('Empty response from LLM');
  }
  // Parse and validate JSON
  const recipe = JSON.parse(jsonResponse.replace(/```json|```/g, '').trim());
  return RecipeSchema.parse(recipe);
}
```
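
The fallback parser above strips Markdown fences before calling `JSON.parse`, which still fails if the model adds prose before or after the JSON. A slightly more tolerant variant, sketched here with a hypothetical `extractJsonObject` helper, slices from the first `{` to the last `}`. It is an assumption that local models reply this way; the helper will also fail if the surrounding prose itself contains braces.

```typescript
// Hypothetical helper: pulls the outermost JSON object out of a raw model reply.
// Handles fenced replies and replies with leading or trailing prose.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in LLM response');
  }
  // JSON.parse throws a SyntaxError if the slice is not valid JSON.
  return JSON.parse(raw.slice(start, end + 1));
}
```

The result is typed `unknown` on purpose, so it still has to pass through `RecipeSchema.parse()` before being treated as a `Recipe`.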
**File: [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts)**
```typescript
// Line 46 - FIX: Add await
const recipe = await extractRecipe(extracted.bodyText);
```
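
Diagnostic step 5 asks for errors to be surfaced to the frontend via SSE error events. A minimal sketch of the frame formatting follows; `formatSseEvent` and `toSseErrorEvent` are hypothetical helpers, and the event name `error` is an assumption about how the endpoint names its events.

```typescript
// Hypothetical helper: formats one Server-Sent Events frame.
// JSON.stringify emits a single line, so one "data:" field is enough.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Converts any thrown value into an "error" frame for the frontend.
function toSseErrorEvent(e: unknown): string {
  const message = e instanceof Error ? e.message : String(e);
  return formatSseEvent('error', { message });
}
```

If the frontend consumes the stream with the browser's `EventSource`, a custom name such as `extraction-error` may be safer, because `error` is also the name of the built-in connection error event.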
**File: [src/lib/server/llm.ts](src/lib/server/llm.ts)**
```typescript
import OpenAI from 'openai';
import { env } from '$env/dynamic/private';

export const createLLM = () => {
  const baseURL = env.OPENAI_BASE_URL;
  const apiKey = env.OPENAI_API_KEY;
  const model = env.LLM_MODEL || 'gpt-4o';
  console.log('[LLM] Initializing client...');
  console.log('[LLM] Base URL:', baseURL);
  console.log('[LLM] Model:', model);
  if (!baseURL) {
    throw new Error('OPENAI_BASE_URL environment variable is not set');
  }
  if (!apiKey) {
    throw new Error('OPENAI_API_KEY environment variable is not set');
  }
  const client = new OpenAI({
    apiKey,
    baseURL
  });
  return { client, model };
};

/**
 * Health check for LLM service
 */
export async function checkLLMHealth(): Promise<boolean> {
  try {
    const { client } = createLLM();
    await client.models.list();
    console.log('[LLM] Health check passed');
    return true;
  } catch (e) {
    console.error('[LLM] Health check failed:', e);
    return false;
  }
}
```
**Files Modified:**
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Enhanced logging and fallback
- [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts) - Fixed await bug
- [src/lib/server/llm.ts](src/lib/server/llm.ts) - Added logging and health check
**Files Created:**
- [src/routes/api/llm-health/+server.ts](src/routes/api/llm-health/+server.ts) - Health check endpoint
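
The health check endpoint itself is not shown in this plan. The sketch below illustrates one way its logic could look, under stated assumptions: `buildHealthResult` and the response body shape are hypothetical, and the check function is injected so the example runs without a real LM Studio instance.

```typescript
// Hypothetical result shape; a real +server.ts handler would wrap this in a Response.
interface HealthResult {
  status: number;
  body: { ok: boolean; checkedAt: string };
}

// `check` stands in for checkLLMHealth(); `now` is injectable for deterministic tests.
async function buildHealthResult(
  check: () => Promise<boolean>,
  now: () => Date = () => new Date()
): Promise<HealthResult> {
  let ok = false;
  try {
    ok = await check();
  } catch {
    // A throwing check is treated the same as an unhealthy service.
    ok = false;
  }
  // 503 signals that the dependency (the LLM server) is unavailable.
  return { status: ok ? 200 : 503, body: { ok, checkedAt: now().toISOString() } };
}
```

In the route file, a `GET` handler could call `buildHealthResult(checkLLMHealth)` and return the body as JSON with the computed status code.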
---
### Story 3: Create Comprehensive Parsing Prompt
**As a** developer
**I want** an optimized parsing prompt that handles all edge cases
**So that** recipe extraction is robust and accurate
**Acceptance Criteria:**
- [x] Prompt handles social media noise (hashtags, emojis, mentions)
- [x] Prompt includes few-shot examples
- [x] Prompt handles partial/incomplete recipes
- [x] Prompt handles ingredient variations (ranges, alternatives)
- [x] Prompt maintains Italian translation requirement
- [x] Prompt maintains SI unit conversion
- [x] Prompt is well-documented and versioned
**Implementation Details:**
#### Prompt Engineering Strategy
1. **Analyze Current Prompt Strengths**
   - Structured output format ✅
   - SI conversion table ✅
   - Italian translation ✅
   - Clear requirements ✅
2. **Add Missing Capabilities**
   - Social media text cleaning
   - Few-shot examples
   - Partial recipe handling
   - Ingredient range normalization
   - Error recovery strategies
3. **Prompt Structure**
   - Role definition
   - Comprehensive requirements
   - Conversion tables (expanded)
   - Output format specification
   - Few-shot examples
   - Edge case handling rules
#### Enhanced Prompt
**File: [src/lib/server/prompts/recipe-extraction.ts](src/lib/server/prompts/recipe-extraction.ts)**
```typescript
/**
* Recipe Extraction System Prompt - Version 2.0
*
* Changelog:
* - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
* - v1.0 (2024): Initial version with Italian translation and SI conversion
*/
export const RECIPE_DETECTION_PROMPT = `You are a recipe detector for social media posts.
Your task: Determine if the text contains a complete or partial recipe.
REQUIREMENTS FOR "YES":
1. Recipe name/title is present
2. At least 3 ingredients with quantities (even if approximate)
3. At least 2 cooking steps
IGNORE:
- Hashtags (#recipe, #food, etc.)
- Mentions (@username)
- Emojis
- Like counts, comments, social metadata
- Promotional text
OUTPUT: Answer with ONLY 'yes' or 'no' - nothing else.
EXAMPLES:
Text: "🍝 Pasta al Pomodoro 🍅 Ingredients: 320g pasta, 400g tomatoes, 2 garlic cloves. Boil pasta. Sauté garlic. Add tomatoes. Mix! #italianfood @chef"
Answer: yes
Text: "Amazing dinner tonight! 😍 So delicious! 🔥 #foodporn"
Answer: no
Text: "You need pasta, tomatoes, and garlic for this recipe"
Answer: no
`;
export const RECIPE_EXTRACTION_PROMPT = `You are an EXPERT RECIPE EXTRACTOR specialized in parsing recipes from social media posts.
🎯 YOUR MISSION:
Extract structured recipe data from text that may contain social media noise, emojis, hashtags, and promotional content.
✅ CORE REQUIREMENTS:
1. **Text Cleaning**: Ignore hashtags, mentions, emojis, like counts, promotional text
2. **Name Extraction**: Extract exact recipe name (translate to Italian)
3. **Ingredient Parsing**: Extract all ingredients with quantities and units
4. **Step Extraction**: Extract all cooking steps in order
5. **Translation**: Translate ALL content to Italian
6. **Unit Conversion**: Convert ALL measurements to SI units (g, mL, °C)
📏 COMPREHENSIVE CONVERSION TABLE:
**Volume (to mL):**
- 1 cup = 240 mL
- 1 tablespoon (tbsp) = 15 mL
- 1 teaspoon (tsp) = 5 mL
- 1 fluid oz (fl oz) = 30 mL
- 1 pint = 473 mL
- 1 quart = 946 mL
- 1 gallon = 3785 mL
**Weight (to g):**
- 1 oz = 28.35 g
- 1 lb (pound) = 453.59 g
- 1 stick butter = 113 g
**Temperature (to °C):**
- Formula: (°F - 32) × 5/9
- 350°F = 175°C
- 375°F = 190°C
- 400°F = 200°C
- 425°F = 220°C
**Special Cases:**
- "a pinch" = "un pizzico" (no quantity)
- "to taste" = "q.b." (quanto basta)
- "1-2 cups" → use midpoint → 1.5 cup = 360 mL
- "1/2 cup" = 120 mL
- "1/4 cup" = 60 mL
🔄 OUTPUT FORMAT (JSON):
{
"name": "Nome della Ricetta in Italiano",
"servings": 4 or null,
"description": "Descrizione in italiano o null",
"ingredients": [
{"item": "nome ingrediente", "amount": "quantità", "unit": "unità SI"},
{"item": "aglio", "amount": "2", "unit": "spicchi"}
],
"steps": [
"1. Primo passaggio dettagliato",
"2. Secondo passaggio dettagliato"
]
}
🎓 FEW-SHOT EXAMPLES:
**Example 1: Clean Recipe**
Input:
"Chocolate Chip Cookies
Ingredients:
- 2 cups all-purpose flour
- 1 tsp baking soda
- 1 cup butter
- 3/4 cup sugar
- 2 eggs
- 2 cups chocolate chips
Instructions:
1. Preheat oven to 375°F
2. Mix flour and baking soda
3. Cream butter and sugar
4. Add eggs
5. Fold in chocolate chips
6. Bake for 10 minutes"
Output:
{
"name": "Biscotti con Gocce di Cioccolato",
"servings": null,
"description": null,
"ingredients": [
{"item": "farina 00", "amount": "480", "unit": "mL"},
{"item": "bicarbonato di sodio", "amount": "5", "unit": "mL"},
{"item": "burro", "amount": "240", "unit": "mL"},
{"item": "zucchero", "amount": "180", "unit": "mL"},
{"item": "uova", "amount": "2", "unit": "pz"},
{"item": "gocce di cioccolato", "amount": "480", "unit": "mL"}
],
"steps": [
"1. Preriscaldare il forno a 190°C",
"2. Mescolare farina e bicarbonato di sodio",
"3. Montare burro e zucchero a crema",
"4. Aggiungere le uova",
"5. Incorporare le gocce di cioccolato",
"6. Cuocere per 10 minuti"
]
}
**Example 2: Social Media Post**
Input:
"🍝 OMG this pasta is AMAZING! 😍👌
Farfalle al Salmone by @lulugargari 🔥
What you need:
Farfalle 320g
Smoked salmon 200g
Heavy cream 200g
Shallot 1/2
Tomato paste 1 tbsp
White wine 1/2 cup
Butter 20g
Salt & pepper to taste
How to make it:
Chop the salmon. Melt butter, add shallot, cook a bit. Deglaze with wine, add salmon, cook 2 mins. Add cream, pepper, tomato paste. Cook pasta al dente, finish in pan. Enjoy! 😋
14K likes 🔥 #pasta #recipe #italianfood"
Output:
{
"name": "Farfalle al Salmone",
"servings": null,
"description": null,
"ingredients": [
{"item": "farfalle", "amount": "320", "unit": "g"},
{"item": "salmone affumicato", "amount": "200", "unit": "g"},
{"item": "panna fresca liquida", "amount": "200", "unit": "g"},
{"item": "scalogno", "amount": "0.5", "unit": "pz"},
{"item": "concentrato di pomodoro", "amount": "15", "unit": "mL"},
{"item": "vino bianco", "amount": "120", "unit": "mL"},
{"item": "burro", "amount": "20", "unit": "g"},
{"item": "sale", "amount": "q.b.", "unit": ""},
{"item": "pepe nero", "amount": "q.b.", "unit": ""}
],
"steps": [
"1. Tritare il salmone affumicato",
"2. Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
"3. Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
"4. Aggiungere la panna, il pepe e il concentrato di pomodoro",
"5. Cuocere la pasta al dente e ultimare la cottura in padella"
]
}
🛡️ EDGE CASE HANDLING:
1. **Missing Servings**: Set to null
2. **Missing Description**: Set to null
3. **Ingredient Ranges** (e.g., "1-2 cups"): Use midpoint
4. **Vague Quantities** ("a handful"): Use "q.b." and empty unit
5. **Missing Units**: Infer from context (e.g., "2 eggs" → "2 pz")
6. **Multiple Recipes**: Extract ONLY the first recipe
7. **Incomplete Recipe**: Extract what's available, set missing fields to null or empty array
⚠️ CRITICAL RULES:
- Extract ONLY what's explicitly in the text - DO NOT invent ingredients or steps
- Be LITERAL and ACCURATE - preserve ingredient names and quantities
- IGNORE all social media metadata (likes, comments, emojis, hashtags, mentions)
- If units are missing, use context clues or standard assumptions
- Translate faithfully to Italian, preserving culinary terms accurately
- Number all steps sequentially starting with "1."
🎯 QUALITY CHECKLIST:
Before returning, verify:
- [ ] All ingredients have item, amount, and unit
- [ ] All measurements converted to SI units (g, mL, °C)
- [ ] All text translated to Italian
- [ ] All steps numbered sequentially
- [ ] No social media noise (emojis, hashtags, mentions) in output
- [ ] JSON is valid and matches schema
`;
```
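
The conversion table in the prompt can also be expressed as code, which may help when writing the unit conversion tests listed in the testing strategy. This is a sketch only: `parseAmount`, `toSI`, and `fahrenheitToCelsius` are hypothetical helpers that do not exist in the codebase, and the factors are copied from the prompt's table.

```typescript
// Conversion factors copied from the prompt's table.
const ML_PER_UNIT = new Map<string, number>([
  ['cup', 240], ['tbsp', 15], ['tsp', 5], ['fl oz', 30],
  ['pint', 473], ['quart', 946], ['gallon', 3785]
]);
const G_PER_UNIT = new Map<string, number>([
  ['oz', 28.35], ['lb', 453.59], ['stick', 113]
]);

// Parses "2", "1.5", "1/2", or a range such as "1-2" (resolved to its midpoint).
function parseAmount(raw: string): number {
  const range = raw.split('-').map((part) => part.trim());
  if (range.length === 2) {
    return (parseAmount(range[0]) + parseAmount(range[1])) / 2;
  }
  const fraction = raw.split('/');
  if (fraction.length === 2) {
    return Number(fraction[0]) / Number(fraction[1]);
  }
  return Number(raw);
}

// Converts an imperial amount to mL or g using the tables above.
function toSI(rawAmount: string, unit: string): { amount: number; unit: 'mL' | 'g' } {
  const amount = parseAmount(rawAmount);
  if (Number.isNaN(amount)) {
    throw new Error(`Unparseable amount: ${rawAmount}`);
  }
  // Normalize simple plurals: "cups" -> "cup", "lbs" -> "lb".
  const key = unit.trim().toLowerCase().replace(/s$/, '');
  const ml = ML_PER_UNIT.get(key);
  if (ml !== undefined) return { amount: amount * ml, unit: 'mL' };
  const grams = G_PER_UNIT.get(key);
  if (grams !== undefined) return { amount: amount * grams, unit: 'g' };
  throw new Error(`Unknown unit: ${unit}`);
}

// Exact formula from the prompt; rounding to oven-friendly values is left to the caller.
function fahrenheitToCelsius(fahrenheit: number): number {
  return ((fahrenheit - 32) * 5) / 9;
}
```

For example, `toSI('1-2', 'cups')` resolves the range to 1.5 cups and yields 360 mL, matching the "Special Cases" entry in the prompt.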
**File: [src/lib/server/parser.ts](src/lib/server/parser.ts)** (Updated)
```typescript
import { createLLM } from './llm';
import { zodResponseFormat } from 'openai/helpers/zod';
import { z } from 'zod';
import { RECIPE_DETECTION_PROMPT, RECIPE_EXTRACTION_PROMPT } from './prompts/recipe-extraction';

// ... existing RecipeSchema and type ...

export async function detectRecipe(text: string): Promise<boolean> {
  try {
    const { client, model } = createLLM();
    console.log('[LLM] Starting recipe detection...');
    const detectionResponse = await client.chat.completions.create({
      model,
      messages: [
        {
          role: 'system',
          content: RECIPE_DETECTION_PROMPT
        },
        {
          role: 'user',
          content: `Does this text contain a recipe?\n\n${text}`
        }
      ],
      max_tokens: 10,
      temperature: 0
    });
    const detectionResult = detectionResponse.choices[0].message.content?.toLowerCase() ?? '';
    console.log('[LLM] Detection result:', detectionResult);
    return detectionResult.includes('yes');
  } catch (e) {
    console.error('[LLM] Recipe detection error:', e);
    throw new Error(`Failed to detect recipe: ${(e as Error).message}`);
  }
}

export async function parseRecipe(text: string): Promise<Recipe> {
  try {
    const { client, model } = createLLM();
    console.log('[LLM] Starting recipe parsing...');
    const completion = await client.beta.chat.completions.parse({
      model,
      messages: [
        {
          role: 'system',
          content: RECIPE_EXTRACTION_PROMPT
        },
        {
          role: 'user',
          content: `Extract the recipe from this text:\n\n${text}`
        }
      ],
      response_format: zodResponseFormat(RecipeSchema, 'recipe'),
      temperature: 0.3
    });
    const recipe = completion.choices[0].message.parsed;
    console.log('[LLM] Parsed recipe:', recipe?.name);
    if (!recipe || !recipe.name) {
      throw new Error('Failed to extract recipe - missing name');
    }
    return recipe;
  } catch (e) {
    console.error('[LLM] Recipe parsing error:', e);
    // Fallback to standard completion if structured output fails
    if ((e as any).message?.includes('response_format') ||
        (e as any).message?.includes('structured output')) {
      console.warn('[LLM] Falling back to standard completion');
      return await parseRecipeWithStandardCompletion(text);
    }
    throw new Error(`Failed to parse recipe: ${(e as Error).message}`);
  }
}

// ... parseRecipeWithStandardCompletion implementation ...
```
**Files Created:**
- [src/lib/server/prompts/recipe-extraction.ts](src/lib/server/prompts/recipe-extraction.ts) - Versioned prompts
**Files Modified:**
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Use new prompts
---
## Technical Architecture
### Hexagonal Architecture Compliance
**Domain Layer** (Core Business Logic)
- `Recipe` type definition
- Extraction and parsing interfaces
- No changes needed - already well-separated
**Application Layer** (Use Cases)
- `extractTextAndThumbnail()` - Extraction orchestration
- `extractRecipe()` - Recipe detection and parsing workflow
- Enhanced with better error handling and logging
**Adapter Layer** (External Interfaces)
**Primary Adapters** (Driving - UI):
- `/share/+page.svelte` - Refactored with snippets (Presentation)
- `/api/extract-stream/+server.ts` - SSE endpoint (HTTP Adapter)
**Secondary Adapters** (Driven - Infrastructure):
- `llm.ts` - OpenAI/LM Studio client (LLM Adapter)
- `browser.ts` - Playwright browser (Browser Adapter)
- `extraction.ts` - Instagram scraping (Web Scraping Adapter)
**Dependency Flow:**
```
UI (Svelte) → API Endpoint → Use Case → Domain ← LLM Adapter
                                               ← Browser Adapter
```
All dependencies point inward toward the domain. External systems (LLM, Browser) are accessed via ports (interfaces).
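
As an illustration of the port idea, the sketch below defines an LLM port and a use case that depends only on it. All names (`RecipeLlmPort`, `extractRecipeUseCase`, `createFakeLlm`) are hypothetical, the `Recipe` shape is assumed from the prompt's JSON schema, and returning `null` when no recipe is detected is an assumption about the workflow rather than documented behavior.

```typescript
// Assumed shape of the domain Recipe type.
interface Recipe {
  name: string;
  servings: number | null;
  description: string | null;
  ingredients: { item: string; amount: string; unit: string }[];
  steps: string[];
}

// Port: what the use case needs from any LLM backend (LM Studio, OpenAI, a mock).
interface RecipeLlmPort {
  detectRecipe(text: string): Promise<boolean>;
  parseRecipe(text: string): Promise<Recipe>;
}

// Use case: detection first, parsing only when a recipe is present.
async function extractRecipeUseCase(text: string, llm: RecipeLlmPort): Promise<Recipe | null> {
  const containsRecipe = await llm.detectRecipe(text);
  if (!containsRecipe) {
    return null;
  }
  return llm.parseRecipe(text);
}

// Fake adapter for tests: no network access, fixed answers.
function createFakeLlm(containsRecipe: boolean): RecipeLlmPort {
  return {
    detectRecipe: async () => containsRecipe,
    parseRecipe: async () => ({
      name: 'Pasta al Pomodoro',
      servings: null,
      description: null,
      ingredients: [{ item: 'pasta', amount: '320', unit: 'g' }],
      steps: ['1. Cuocere la pasta']
    })
  };
}
```

With this shape, the "mock LLM" integration tests mentioned in Story 2 could pass a fake adapter to the use case instead of patching the OpenAI client.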
---
## Dependencies & Prerequisites
### Required Tools
- Node.js 18+ (the project uses Svelte 5)
- LM Studio running at `http://192.168.1.10:1234` (current config)
- Playwright browsers installed
### Environment Variables
```bash
OPENAI_BASE_URL=http://192.168.1.10:1234/v1
OPENAI_API_KEY=ollama
LLM_MODEL=google/gemma-3-4b # or compatible alternative
```
### Package Dependencies
- `svelte@^5.43.8` - Snippets support ✅
- `openai@^4.20.0` - LLM client ✅
- `playwright@^1.56.1` - Browser automation ✅
- `zod@^3.23.0` - Schema validation ✅
---
## Risk Assessment
### High Risk
1. **LLM Model Compatibility**
   - `google/gemma-3-4b` may not support structured output
   - **Mitigation:** Implement fallback to standard completion API
   - **Testing:** Verify with multiple models
2. **Network Connectivity**
   - LM Studio may not be accessible from Docker container
   - **Mitigation:** Add health check endpoint, document network requirements
   - **Testing:** Test both Docker and local environments
### Medium Risk
1. **Svelte 5 Snippets Learning Curve**
   - Developers may be unfamiliar with new syntax
   - **Mitigation:** Comprehensive documentation in code
   - **Testing:** Peer review of refactored components
2. **Prompt Regression**
   - New prompt may perform worse on edge cases
   - **Mitigation:** A/B test with sample Instagram posts
   - **Testing:** Unit tests with diverse recipe samples
### Low Risk
1. **SSE Stream Breaking Changes**
   - Adding await might change timing
   - **Mitigation:** Thorough manual testing
   - **Testing:** E2E tests with real Instagram URLs
---
## Testing Strategy
### Unit Tests
- [ ] Test each Svelte snippet in isolation
- [ ] Mock LLM responses for parser tests
- [ ] Test prompt with diverse social media samples
- [ ] Test unit conversion logic
- [ ] Test Italian translation accuracy
### Integration Tests
- [ ] Test full extraction pipeline with mock LLM
- [ ] Test SSE stream with progress events
- [ ] Test error handling and fallbacks
- [ ] Test Tandoor integration with recipe card
### Manual Testing Checklist
- [ ] Extract recipe from clean Instagram post
- [ ] Extract recipe from noisy social media post (emojis, hashtags)
- [ ] Extract recipe with imperial units (cups, °F)
- [ ] Extract recipe with partial data (missing servings)
- [ ] Test with LM Studio down (error handling)
- [ ] Test with incompatible model (fallback)
- [ ] Verify Italian translation quality
- [ ] Verify SI unit conversions
- [ ] Test responsive design on mobile
### Performance Testing
- [ ] Measure LLM response time
- [ ] Measure SSE stream latency
- [ ] Test with slow network conditions
---
## Documentation Updates
### Code Documentation
- [x] JSDoc comments for all new functions
- [x] Inline comments explaining complex logic
- [x] Prompt versioning with changelog
- [x] TypeScript types for all interfaces
### User Documentation
- [ ] Update README with LM Studio setup instructions
- [ ] Document troubleshooting steps for LLM errors
- [ ] Add example Instagram URLs for testing
### Developer Documentation
- [ ] Document Svelte 5 snippets pattern
- [ ] Document prompt engineering decisions
- [ ] Document fallback strategies
---
## Rollout Plan
### Phase 1: Backend Fixes (Critical)
1. Fix SSE await bug
2. Add comprehensive logging
3. Implement fallback completion API
4. Test with LM Studio
**Success Criteria:** Recipe extraction works end-to-end
### Phase 2: Prompt Enhancement
1. Implement new prompt in prompts/ directory
2. A/B test with sample posts
3. Iterate based on results
4. Deploy to production
**Success Criteria:** Recipe extraction handles social media noise
### Phase 3: Frontend Refactor
1. Create snippets for each component section
2. Refactor share page
3. Test UI functionality
4. Deploy
**Success Criteria:** All features work, code is maintainable
---
## Success Metrics
### Functional Metrics
- ✅ LLM receives API calls (verified in logs)
- ✅ Recipe extraction success rate > 90%
- ✅ All unit tests pass
- ✅ Zero regression in existing functionality
### Code Quality Metrics
- ✅ Share page component < 150 lines
- ✅ Each snippet < 50 lines
- ✅ All functions have type annotations
- ✅ Code coverage > 80%
### User Experience Metrics
- ✅ Extraction completes in < 15 seconds
- ✅ Progress updates appear in < 1 second
- ✅ Error messages are clear and actionable
---
## Open Questions
1. **LLM Model Selection**
   - Q: Should we test alternative models beyond google/gemma-3-4b?
   - A: Yes, document tested models and compatibility
2. **Snippet vs Full Components**
   - Q: Should snippets become separate .svelte files?
   - A: No, keep as snippets for simplicity. Migrate later if reused elsewhere.
3. **Prompt Versioning**
   - Q: How should we version and test prompts over time?
   - A: Use semantic versioning in file, track performance metrics
4. **Docker Networking**
   - Q: How to make LM Studio accessible from Docker?
   - A: Document host network mode or use host.docker.internal
---
## Next Steps
1. **Review this plan** with stakeholders
2. **Prioritize stories** based on impact
3. **Assign to @dev agent** for implementation
4. **Set up monitoring** for LLM calls and success rates
---
## References
- [Svelte 5 Snippets Documentation](https://svelte.dev/docs/svelte/snippet)
- [OpenAI SDK Documentation](https://platform.openai.com/docs/api-reference)
- [Hexagonal Architecture Guide](.system/abstract_architecture.md)
- [LM Studio API Compatibility](https://lmstudio.ai/docs/api)
---
**Plan Status:** Ready for Implementation
**Estimated Effort:** 8-12 hours
**Priority:** High (Blocking user functionality)