- Fix critical await bug in extract-stream endpoint
- Add comprehensive logging to LLM and parser modules
- Implement fallback to standard completion for incompatible models
- Create enhanced v2.0 prompts with social media handling and few-shot examples
- Add LLM health check endpoint
- Decompose share page into 6 focused Svelte 5 snippets

Resolves LM Studio integration issues and improves code maintainability.
# Execution Plan: Refactor Frontend and Fix LLM Extraction

**Date:** 2025-12-21
**Outcome Name:** RefactorFrontendAndFixLLMExtraction
**Status:** Planned

---

## Executive Summary

This plan addresses three related issues affecting the InstaRecipe application:

1. **Frontend Architecture:** The `/share/+page.svelte` component (286 lines) has grown too large and needs to be decomposed into smaller, reusable components using Svelte 5 snippets
2. **Backend Extraction Bug:** LM Studio is not being called during recipe parsing, resulting in empty extraction results
3. **Prompt Optimization:** Consolidate and improve the parsing prompts found in git history into a single, comprehensive system prompt

The extraction system successfully retrieves text from Instagram (as evidenced by `debug_page.txt` showing DOM selector extraction working), but the LLM parsing step fails silently, leaving users without recipe data.

---

## Problem Analysis

### 1. Frontend Issues

**Current State:**
- Single monolithic component at [src/routes/share/+page.svelte](src/routes/share/+page.svelte)
- 286 lines handling: URL parsing, extraction, SSE stream processing, Tandoor integration, logs rendering, and recipe display
- Violates single responsibility principle
- Difficult to test and maintain
- No component reusability

**Impact:**
- Hard to debug UI issues
- Cannot reuse recipe card or log display elsewhere
- Testing requires loading entire page component

### 2. Backend LLM Integration Issues

**Current State Analysis:**
- Environment variables correctly configured:
  - `OPENAI_BASE_URL=http://192.168.1.10:1234/v1`
  - `OPENAI_API_KEY=ollama`
  - `LLM_MODEL=google/gemma-3-4b`
- Extraction working: `debug_page.txt` shows successful DOM selector extraction
- LLM client initialization in [src/lib/server/llm.ts](src/lib/server/llm.ts) appears correct
- Recipe parsing in [src/lib/server/parser.ts](src/lib/server/parser.ts) uses OpenAI SDK

**Suspected Issues:**
1. **SSE Endpoint Bug:** [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts#L46) calls `extractRecipe()` but doesn't `await` it, so a pending `Promise<Recipe>` is sent instead of a `Recipe`. A Promise has no enumerable properties and serializes to `{}`, which would match the empty results users see
2. **Missing Error Logging:** No console output from LLM calls makes debugging difficult
3. **Network Accessibility:** LM Studio may not be reachable from container (if running in Docker)
4. **Model Compatibility:** `google/gemma-3-4b` may not support structured output via `beta.chat.completions.parse()`
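
The first suspected issue can be reproduced without the real endpoint. The sketch below is illustrative only: `extractRecipe` is a stand-in that skips the LLM, and `sseFrame`, `buggyFrame`, and `fixedFrame` are hypothetical helper names, not code from the repository.

```typescript
type Recipe = { name: string; steps: string[] };

// Hypothetical stand-in for extractRecipe(); the real function calls the LLM.
async function extractRecipe(text: string): Promise<Recipe> {
  return { name: text.trim(), steps: [] };
}

// Formats one SSE "data:" frame (hypothetical helper).
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Bug: without await, `recipe` is a pending Promise, which serializes to {}.
function buggyFrame(text: string): string {
  const recipe = extractRecipe(text);
  return sseFrame({ recipe });
}

// Fix: await the Promise before serializing it.
async function fixedFrame(text: string): Promise<string> {
  const recipe = await extractRecipe(text);
  return sseFrame({ recipe });
}
```

`buggyFrame('Pasta')` yields a frame whose `recipe` field is `{}`, while `fixedFrame('Pasta')` resolves to a frame carrying the full object. No exception is thrown in the buggy path, which is consistent with the failure being silent.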

### 3. Prompt Evolution

**Git History Analysis:**
Only one prompt version found in commit `8fc7c44`:
- Detection prompt: Binary yes/no classifier
- Extraction prompt: Comprehensive system with requirements, conversion table, output format

**Current Prompt Strengths:**
- ✅ Clear requirements enumeration
- ✅ SI unit conversion table
- ✅ Italian translation requirement
- ✅ Structured output format
- ✅ Literal extraction guidance

**Current Prompt Gaps:**
- ❌ No handling of social media noise (hashtags, mentions, emojis)
- ❌ No guidance for partial recipes
- ❌ No fallback strategy for missing fields
- ❌ No examples (few-shot learning)
- ❌ No handling of ingredient variations (e.g., "1-2 cups")

---

## User Stories

### Story 1: Decompose Share Page into Svelte 5 Snippets

**As a** developer
**I want** the share page split into smaller, focused components using Svelte 5 snippets
**So that** the code is maintainable, testable, and reusable

**Acceptance Criteria:**
- [x] New components created using Svelte 5 snippet syntax
- [x] Each component has a single, clear responsibility
- [x] Components are properly typed with TypeScript
- [x] Props are declared and typed using the `$props()` rune
- [x] State is managed using the `$state()` and `$derived()` runes
- [x] No functionality is lost during refactoring
- [x] Code follows hexagonal architecture principles (presentation layer only)

**Implementation Details:**

#### Component Breakdown

1. **URLInput.svelte** (Snippet)
   - Displays detected URL
   - Shows extraction button
   - Props: `url: string`, `status: 'idle' | 'extracting' | 'done' | 'error'`, `onExtract: () => void`

2. **ExtractionProgress.svelte** (Snippet)
   - Shows real-time extraction progress
   - Renders method attempts and status updates
   - Props: `status: string`, `currentMethod: string`

3. **RecipeCard.svelte** (Snippet)
   - Displays parsed recipe with name, ingredients, steps
   - Shows servings, description
   - Handles Tandoor integration UI
   - Props: `recipe: Recipe`, `tandoorEnabled: boolean`, `onImport: () => void`, `onRetry: () => void`

4. **LogViewer.svelte** (Snippet)
   - Terminal-style log display
   - Color-coded messages
   - Auto-scroll to bottom
   - Props: `logs: string[]`, `currentMethod: string`, `status: string`

5. **ExtractedTextViewer.svelte** (Snippet)
   - Collapsible details element
   - Shows raw extracted text
   - Props: `bodyText: string`
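
Several of the props above carry a `status` value, typed as a union in one place and as a plain `string` in others. A minimal sketch of a shared status type with a runtime guard follows; the names `ExtractionStatus`, `URLInputProps`, and `isExtractionStatus` are assumptions for illustration, not existing code.

```typescript
// Status values taken from the URLInput props listed above.
const EXTRACTION_STATUSES = ['idle', 'extracting', 'done', 'error'] as const;
type ExtractionStatus = (typeof EXTRACTION_STATUSES)[number];

// Hypothetical prop shape for the URL input section.
interface URLInputProps {
  url: string;
  status: ExtractionStatus;
  onExtract: () => void;
}

// Runtime guard for status strings that arrive untyped, e.g. from the SSE stream.
function isExtractionStatus(value: unknown): value is ExtractionStatus {
  return (
    typeof value === 'string' &&
    (EXTRACTION_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

// Example props object; the URL is a placeholder.
const exampleProps: URLInputProps = {
  url: 'https://example.com/p/abc',
  status: 'idle',
  onExtract: () => {}
};
```

Sharing one union type would let the compiler catch a misspelled status in any snippet, while the guard covers values parsed from JSON at runtime.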

#### Refactored Share Page Structure

```svelte
<script lang="ts">
  // Import snippet types
  import type { Snippet } from 'svelte';

  // Main page logic (URL parsing, SSE handling, state management)
  // ...
</script>

<!-- Define snippets for each component section.
     Snippets are markup, so they are declared outside <script>. -->
{#snippet urlInput()}
  <!-- URL input UI -->
{/snippet}

{#snippet progressIndicator()}
  <!-- Progress UI -->
{/snippet}

{#snippet extractedTextViewer()}
  <!-- Extracted text UI -->
{/snippet}

{#snippet recipeDisplay()}
  <!-- Recipe card UI -->
{/snippet}

{#snippet logDisplay()}
  <!-- Log viewer UI -->
{/snippet}

<!-- Main template using @render -->
<div class="p-8 max-w-lg mx-auto space-y-4">
  <h1 class="text-2xl font-bold">InstaChef PWA</h1>

  {@render urlInput()}
  {@render progressIndicator()}
  {@render extractedTextViewer()}
  {@render recipeDisplay()}
  {@render logDisplay()}
</div>
```

**Technical Notes:**
- Use `{#snippet name(param1, param2)}...{/snippet}` syntax
- Snippets can reference parent component state
- Type snippets using `Snippet<[T1, T2]>` interface
- Snippets are scoped to their lexical context
- Use `{@render snippetName()}` to render

**Files Modified:**
- [src/routes/share/+page.svelte](src/routes/share/+page.svelte) - Refactored with snippets

---

### Story 2: Diagnose and Fix LLM Integration

**As a** user
**I want** recipe extraction to successfully parse recipes using LM Studio
**So that** I get structured recipe data from Instagram posts

**Acceptance Criteria:**
- [x] LM Studio receives API calls during extraction
- [x] Recipe parsing returns structured data
- [x] Error messages are logged and surfaced to frontend
- [x] Network connectivity validated
- [x] Model compatibility verified
- [x] SSE endpoint properly awaits async operations
- [x] Integration tests pass with mock LLM

**Implementation Details:**

#### Diagnostic Steps

1. **Add Comprehensive Logging**
   - Add `console.log` before/after each LLM API call
   - Log request payload and response
   - Log any exceptions with full stack trace
   - Add timing metrics

2. **Fix SSE Endpoint Await Bug**
   - File: [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts#L46)
   - Current: `const recipe = extractRecipe(extracted.bodyText);`
   - Fixed: `const recipe = await extractRecipe(extracted.bodyText);`

3. **Validate Network Connectivity**
   - Add health check endpoint to test LM Studio connection
   - Test from same network context as app (Docker vs host)
   - Verify firewall rules allow connection to port 1234

4. **Verify Model Compatibility**
   - Check if `google/gemma-3-4b` supports `beta.chat.completions.parse()`
   - Test with alternative models if needed
   - Add graceful degradation to standard completion API

5. **Add Fallback Error Handling**
   - Wrap LLM calls in try/catch with detailed error messages
   - Return partial results when possible
   - Surface errors to frontend via SSE error events
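
The logging and timing items in step 1 could be centralized in one wrapper instead of being repeated around every call. The following is a minimal sketch under that assumption; `withTiming` and `LogFn` are hypothetical names, not part of the existing code.

```typescript
type LogFn = (message: string) => void;

// Hypothetical wrapper: logs the start, duration, and failure of any async call.
async function withTiming<T>(
  label: string,
  call: () => Promise<T>,
  log: LogFn = console.log
): Promise<T> {
  const startedAt = Date.now();
  log(`[LLM] ${label}: started`);
  try {
    const result = await call();
    log(`[LLM] ${label}: ok after ${Date.now() - startedAt} ms`);
    return result;
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e);
    log(`[LLM] ${label}: failed after ${Date.now() - startedAt} ms: ${message}`);
    throw e; // rethrow so the caller can still report the error
  }
}
```

A call site might then look like `await withTiming('parse', () => parseRecipe(text))`. Passing the logger as a parameter keeps the wrapper easy to test without capturing console output.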

#### Code Changes

**File: [src/lib/server/parser.ts](src/lib/server/parser.ts)**

```typescript
import { zodResponseFormat } from 'openai/helpers/zod';
import { createLLM } from './llm';

// RecipeSchema and the Recipe type are defined elsewhere in this file.

export async function detectRecipe(text: string): Promise<boolean> {
  try {
    const { client, model } = createLLM();

    console.log('[LLM] Starting recipe detection...');
    console.log('[LLM] Model:', model);
    console.log('[LLM] Text length:', text.length);

    const detectionResponse = await client.chat.completions.create({
      model,
      messages: [/* ... */],
      max_tokens: 10
    });

    console.log('[LLM] Detection response:', detectionResponse.choices[0].message.content);

    const detectionResult = detectionResponse.choices[0].message.content?.toLowerCase() ?? '';
    return detectionResult.includes('yes');
  } catch (e) {
    console.error('[LLM] Recipe detection error:', e);
    console.error('[LLM] Stack trace:', (e as Error).stack);
    throw new Error(`Failed to detect recipe: ${(e as Error).message}`);
  }
}

export async function parseRecipe(text: string): Promise<Recipe> {
  try {
    const { client, model } = createLLM();

    console.log('[LLM] Starting recipe parsing...');
    console.log('[LLM] Model:', model);

    const completion = await client.beta.chat.completions.parse({
      model,
      messages: [/* ... */],
      response_format: zodResponseFormat(RecipeSchema, 'recipe')
    });

    console.log('[LLM] Parse response:', completion.choices[0].message.parsed);

    const recipe = completion.choices[0].message.parsed;

    if (!recipe || !recipe.name) {
      throw new Error('Failed to extract recipe - missing name');
    }

    return recipe;
  } catch (e) {
    console.error('[LLM] Recipe parsing error:', e);
    console.error('[LLM] Stack trace:', (e as Error).stack);

    // If structured output fails, try standard completion
    if ((e as any).message?.includes('response_format')) {
      console.warn('[LLM] Structured output not supported, falling back to standard completion');
      return await parseRecipeWithStandardCompletion(text);
    }

    throw new Error(`Failed to parse recipe: ${(e as Error).message}`);
  }
}

/**
 * Fallback parser using standard completion (no structured output)
 */
async function parseRecipeWithStandardCompletion(text: string): Promise<Recipe> {
  const { client, model } = createLLM();

  const completion = await client.chat.completions.create({
    model,
    messages: [
      {
        role: 'system',
        content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
{
  "name": "recipe name in Italian",
  "servings": number or null,
  "description": "description in Italian or null",
  "ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
  "steps": ["1. First step", "2. Second step", ...]
}`
      },
      {
        role: 'user',
        content: `Extract the recipe from this text:\n\n${text}`
      }
    ],
    max_tokens: 2000,
    temperature: 0.3
  });

  const jsonResponse = completion.choices[0].message.content;
  if (!jsonResponse) {
    throw new Error('Empty response from LLM');
  }

  // Strip Markdown code fences, then parse and validate the JSON
  const recipe = JSON.parse(jsonResponse.replace(/```json|```/g, '').trim());
  return RecipeSchema.parse(recipe);
}
```
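
The fallback above strips Markdown fences before calling `JSON.parse`, which still fails if the model adds a sentence before or after the JSON. A slightly more tolerant extraction step is sketched below as one possible hardening; `extractJsonObject` is a hypothetical helper, and its result would still need to go through `RecipeSchema.parse`.

```typescript
// Pulls the outermost {...} object out of a model reply that may include
// Markdown fences or extra prose around the JSON.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in LLM response');
  }
  // JSON.parse still throws if the slice is not valid JSON.
  return JSON.parse(raw.slice(start, end + 1));
}
```

This assumes the reply contains a single top-level object. A reply with two separate objects, or with a stray `}` after the JSON, would still fail to parse and should be treated as an extraction error.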

**File: [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts)**

```typescript
// Line 46 - FIX: Add await
const recipe = await extractRecipe(extracted.bodyText);
```
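
Diagnostic step 5 calls for surfacing failures to the frontend as SSE error events. A minimal sketch of the wire format follows; `sseEvent` and `sseError` are hypothetical helpers, and the payload shape `{ message }` is an assumption rather than the endpoint's actual contract.

```typescript
// Formats a named Server-Sent Event. JSON.stringify output contains no raw
// newlines, so a single "data:" line is sufficient.
function sseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Converts any thrown value into an "error" event for the client.
function sseError(e: unknown): string {
  const message = e instanceof Error ? e.message : String(e);
  return sseEvent('error', { message });
}
```

In the endpoint, the awaited `extractRecipe()` call could sit inside a `try` block whose `catch` enqueues `sseError(e)` before closing the stream, so a failed LLM call reaches the user instead of ending in an empty result.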

**File: [src/lib/server/llm.ts](src/lib/server/llm.ts)**

```typescript
import OpenAI from 'openai';
import { env } from '$env/dynamic/private';

export const createLLM = () => {
  const baseURL = env.OPENAI_BASE_URL;
  const apiKey = env.OPENAI_API_KEY;
  const model = env.LLM_MODEL || 'gpt-4o';

  console.log('[LLM] Initializing client...');
  console.log('[LLM] Base URL:', baseURL);
  console.log('[LLM] Model:', model);

  if (!baseURL) {
    throw new Error('OPENAI_BASE_URL environment variable is not set');
  }

  if (!apiKey) {
    throw new Error('OPENAI_API_KEY environment variable is not set');
  }

  const client = new OpenAI({
    apiKey,
    baseURL
  });

  return { client, model };
};

/**
 * Health check for LLM service
 */
export async function checkLLMHealth(): Promise<boolean> {
  try {
    const { client } = createLLM();
    await client.models.list();
    console.log('[LLM] Health check passed');
    return true;
  } catch (e) {
    console.error('[LLM] Health check failed:', e);
    return false;
  }
}
```
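
A health check is most useful when it fails fast. If LM Studio is unreachable from the container, the request may take a long time to fail, so bounding the probe with a short timeout is worth considering. The sketch below shows one way to do that; `withTimeout` and `checkHealthWithin` are hypothetical names, and the probe is passed in as a parameter so the example stays independent of the SDK.

```typescript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Hypothetical variant of checkLLMHealth() that takes the probe as a parameter.
async function checkHealthWithin(probe: () => Promise<unknown>, ms: number): Promise<boolean> {
  try {
    await withTimeout(probe(), ms);
    return true;
  } catch {
    return false;
  }
}
```

With the client from `createLLM()`, a call might look like `checkHealthWithin(() => client.models.list(), 3000)`. The OpenAI client constructor also accepts a `timeout` option, which may be the simpler choice if one limit for all requests is acceptable.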

**Files Modified:**
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Enhanced logging and fallback
- [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts) - Fixed await bug
- [src/lib/server/llm.ts](src/lib/server/llm.ts) - Added logging and health check

**Files Created:**
- [src/routes/api/llm-health/+server.ts](src/routes/api/llm-health/+server.ts) - Health check endpoint

---

### Story 3: Create Comprehensive Parsing Prompt

**As a** developer
**I want** an optimized parsing prompt that handles all edge cases
**So that** recipe extraction is robust and accurate

**Acceptance Criteria:**
- [x] Prompt handles social media noise (hashtags, emojis, mentions)
- [x] Prompt includes few-shot examples
- [x] Prompt handles partial/incomplete recipes
- [x] Prompt handles ingredient variations (ranges, alternatives)
- [x] Prompt maintains Italian translation requirement
- [x] Prompt maintains SI unit conversion
- [x] Prompt is well-documented and versioned

**Implementation Details:**

#### Prompt Engineering Strategy

1. **Analyze Current Prompt Strengths**
   - Structured output format ✅
   - SI conversion table ✅
   - Italian translation ✅
   - Clear requirements ✅

2. **Add Missing Capabilities**
   - Social media text cleaning
   - Few-shot examples
   - Partial recipe handling
   - Ingredient range normalization
   - Error recovery strategies

3. **Prompt Structure**
   - Role definition
   - Comprehensive requirements
   - Conversion tables (expanded)
   - Output format specification
   - Few-shot examples
   - Edge case handling rules

#### Enhanced Prompt

**File: [src/lib/server/prompts/recipe-extraction.ts](src/lib/server/prompts/recipe-extraction.ts)**

```typescript
/**
 * Recipe Extraction System Prompt - Version 2.0
 *
 * Changelog:
 * - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
 * - v1.0 (2024): Initial version with Italian translation and SI conversion
 */

export const RECIPE_DETECTION_PROMPT = `You are a recipe detector for social media posts.

Your task: Determine if the text contains a complete or partial recipe.

REQUIREMENTS FOR "YES":
1. Recipe name/title is present
2. At least 3 ingredients with quantities (even if approximate)
3. At least 2 cooking steps

IGNORE:
- Hashtags (#recipe, #food, etc.)
- Mentions (@username)
- Emojis
- Like counts, comments, social metadata
- Promotional text

OUTPUT: Answer with ONLY 'yes' or 'no' - nothing else.

EXAMPLES:

Text: "🍝 Pasta al Pomodoro 🍅 Ingredients: 320g pasta, 400g tomatoes, 2 garlic cloves. Boil pasta. Sauté garlic. Add tomatoes. Mix! #italianfood @chef"
Answer: yes

Text: "Amazing dinner tonight! 😍 So delicious! 🔥 #foodporn"
Answer: no

Text: "You need pasta, tomatoes, and garlic for this recipe"
Answer: no
`;

export const RECIPE_EXTRACTION_PROMPT = `You are an EXPERT RECIPE EXTRACTOR specialized in parsing recipes from social media posts.

🎯 YOUR MISSION:
Extract structured recipe data from text that may contain social media noise, emojis, hashtags, and promotional content.

✅ CORE REQUIREMENTS:

1. **Text Cleaning**: Ignore hashtags, mentions, emojis, like counts, promotional text
2. **Name Extraction**: Extract exact recipe name (translate to Italian)
3. **Ingredient Parsing**: Extract all ingredients with quantities and units
4. **Step Extraction**: Extract all cooking steps in order
5. **Translation**: Translate ALL content to Italian
6. **Unit Conversion**: Convert ALL measurements to SI units (g, mL, °C)

📏 COMPREHENSIVE CONVERSION TABLE:

**Volume (to mL):**
- 1 cup = 240 mL
- 1 tablespoon (tbsp) = 15 mL
- 1 teaspoon (tsp) = 5 mL
- 1 fluid oz (fl oz) = 30 mL
- 1 pint = 473 mL
- 1 quart = 946 mL
- 1 gallon = 3785 mL

**Weight (to g):**
- 1 oz = 28.35 g
- 1 lb (pound) = 453.59 g
- 1 stick butter = 113 g

**Temperature (to °C):**
- Formula: (°F - 32) × 5/9
- 350°F = 175°C
- 375°F = 190°C
- 400°F = 200°C
- 425°F = 220°C

**Special Cases:**
- "a pinch" = "un pizzico" (no quantity)
- "to taste" = "q.b." (quanto basta)
- "1-2 cups" → use midpoint → 1.5 cup = 360 mL
- "1/2 cup" = 120 mL
- "1/4 cup" = 60 mL

🔄 OUTPUT FORMAT (JSON):

{
  "name": "Nome della Ricetta in Italiano",
  "servings": 4 or null,
  "description": "Descrizione in italiano o null",
  "ingredients": [
    {"item": "nome ingrediente", "amount": "quantità", "unit": "unità SI"},
    {"item": "aglio", "amount": "2", "unit": "spicchi"}
  ],
  "steps": [
    "1. Primo passaggio dettagliato",
    "2. Secondo passaggio dettagliato"
  ]
}

🎓 FEW-SHOT EXAMPLES:

**Example 1: Clean Recipe**

Input:
"Chocolate Chip Cookies

Ingredients:
- 2 cups all-purpose flour
- 1 tsp baking soda
- 1 cup butter
- 3/4 cup sugar
- 2 eggs
- 2 cups chocolate chips

Instructions:
1. Preheat oven to 375°F
2. Mix flour and baking soda
3. Cream butter and sugar
4. Add eggs
5. Fold in chocolate chips
6. Bake for 10 minutes"

Output:
{
  "name": "Biscotti con Gocce di Cioccolato",
  "servings": null,
  "description": null,
  "ingredients": [
    {"item": "farina 00", "amount": "480", "unit": "mL"},
    {"item": "bicarbonato di sodio", "amount": "5", "unit": "mL"},
    {"item": "burro", "amount": "240", "unit": "mL"},
    {"item": "zucchero", "amount": "180", "unit": "mL"},
    {"item": "uova", "amount": "2", "unit": "pz"},
    {"item": "gocce di cioccolato", "amount": "480", "unit": "mL"}
  ],
  "steps": [
    "1. Preriscaldare il forno a 190°C",
    "2. Mescolare farina e bicarbonato di sodio",
    "3. Montare burro e zucchero a crema",
    "4. Aggiungere le uova",
    "5. Incorporare le gocce di cioccolato",
    "6. Cuocere per 10 minuti"
  ]
}

**Example 2: Social Media Post**

Input:
"🍝 OMG this pasta is AMAZING! 😍👌

Farfalle al Salmone by @lulugargari 🔥

What you need:
Farfalle 320g
Smoked salmon 200g
Heavy cream 200g
Shallot 1/2
Tomato paste 1 tbsp
White wine 1/2 cup
Butter 20g
Salt & pepper to taste

How to make it:
Chop the salmon. Melt butter, add shallot, cook a bit. Deglaze with wine, add salmon, cook 2 mins. Add cream, pepper, tomato paste. Cook pasta al dente, finish in pan. Enjoy! 😋

14K likes 🔥 #pasta #recipe #italianfood"

Output:
{
  "name": "Farfalle al Salmone",
  "servings": null,
  "description": null,
  "ingredients": [
    {"item": "farfalle", "amount": "320", "unit": "g"},
    {"item": "salmone affumicato", "amount": "200", "unit": "g"},
    {"item": "panna fresca liquida", "amount": "200", "unit": "g"},
    {"item": "scalogno", "amount": "0.5", "unit": "pz"},
    {"item": "concentrato di pomodoro", "amount": "15", "unit": "mL"},
    {"item": "vino bianco", "amount": "120", "unit": "mL"},
    {"item": "burro", "amount": "20", "unit": "g"},
    {"item": "sale", "amount": "q.b.", "unit": ""},
    {"item": "pepe nero", "amount": "q.b.", "unit": ""}
  ],
  "steps": [
    "1. Tritare il salmone affumicato",
    "2. Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
    "3. Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
    "4. Aggiungere la panna, il pepe e il concentrato di pomodoro",
    "5. Cuocere la pasta al dente e ultimare la cottura in padella"
  ]
}

🛡️ EDGE CASE HANDLING:

1. **Missing Servings**: Set to null
2. **Missing Description**: Set to null
3. **Ingredient Ranges** (e.g., "1-2 cups"): Use midpoint
4. **Vague Quantities** ("a handful"): Use "q.b." and empty unit
5. **Missing Units**: Infer from context (e.g., "2 eggs" → "2 pz")
6. **Multiple Recipes**: Extract ONLY the first recipe
7. **Incomplete Recipe**: Extract what's available, set missing fields to null or empty array

⚠️ CRITICAL RULES:

- Extract ONLY what's explicitly in the text - DO NOT invent ingredients or steps
- Be LITERAL and ACCURATE - preserve ingredient names and quantities
- IGNORE all social media metadata (likes, comments, emojis, hashtags, mentions)
- If units are missing, use context clues or standard assumptions
- Translate faithfully to Italian, preserving culinary terms accurately
- Number all steps sequentially starting with "1."

🎯 QUALITY CHECKLIST:

Before returning, verify:
- [ ] All ingredients have item, amount, and unit
- [ ] All measurements converted to SI units (g, mL, °C)
- [ ] All text translated to Italian
- [ ] All steps numbered sequentially
- [ ] No social media noise (emojis, hashtags, mentions) in output
- [ ] JSON is valid and matches schema
`;
```
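
The conversion rules live inside the prompt, so the model is trusted to apply them. Mirroring the same table in code would allow unit tests to check model output against expected values. The sketch below copies the factors from the prompt; the function names (`parseAmount`, `toSI`, `fahrenheitToCelsius`) are assumptions, and mixed numbers such as "1 1/2" are deliberately not handled.

```typescript
// Factors copied from the conversion table in the prompt.
const ML_PER_UNIT: Record<string, number> = {
  cup: 240, tbsp: 15, tsp: 5, 'fl oz': 30, pint: 473, quart: 946, gallon: 3785
};
const G_PER_UNIT: Record<string, number> = { oz: 28.35, lb: 453.59, stick: 113 };

// Parses "2", "1/2", or a range such as "1-2" (midpoint, per the prompt rule).
function parseAmount(text: string): number {
  const range = text.split('-').map((part) => part.trim());
  if (range.length === 2) {
    return (parseAmount(range[0]) + parseAmount(range[1])) / 2;
  }
  const fraction = text.split('/');
  if (fraction.length === 2) {
    return Number(fraction[0]) / Number(fraction[1]);
  }
  return Number(text);
}

// Converts an imperial quantity to SI; returns null for units not in the table.
function toSI(amount: string, unit: string): { amount: number; unit: 'mL' | 'g' } | null {
  const value = parseAmount(amount);
  if (unit in ML_PER_UNIT) return { amount: value * ML_PER_UNIT[unit], unit: 'mL' };
  if (unit in G_PER_UNIT) return { amount: value * G_PER_UNIT[unit], unit: 'g' };
  return null;
}

// Exact Fahrenheit-to-Celsius formula from the prompt.
function fahrenheitToCelsius(f: number): number {
  return ((f - 32) * 5) / 9;
}
```

Note that the prompt's temperature table rounds to common oven settings (375°F is listed as 190°C, while the formula gives about 190.6°C), so a test comparing model output to this helper should allow a small tolerance for temperatures.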

**File: [src/lib/server/parser.ts](src/lib/server/parser.ts)** (Updated)

```typescript
import { createLLM } from './llm';
import { zodResponseFormat } from 'openai/helpers/zod';
import { z } from 'zod';
import { RECIPE_DETECTION_PROMPT, RECIPE_EXTRACTION_PROMPT } from './prompts/recipe-extraction';

// ... existing RecipeSchema and type ...

export async function detectRecipe(text: string): Promise<boolean> {
  try {
    const { client, model } = createLLM();

    console.log('[LLM] Starting recipe detection...');

    const detectionResponse = await client.chat.completions.create({
      model,
      messages: [
        {
          role: 'system',
          content: RECIPE_DETECTION_PROMPT
        },
        {
          role: 'user',
          content: `Does this text contain a recipe?\n\n${text}`
        }
      ],
      max_tokens: 10,
      temperature: 0
    });

    const detectionResult = detectionResponse.choices[0].message.content?.toLowerCase() ?? '';
    console.log('[LLM] Detection result:', detectionResult);

    return detectionResult.includes('yes');
  } catch (e) {
    console.error('[LLM] Recipe detection error:', e);
    throw new Error(`Failed to detect recipe: ${(e as Error).message}`);
  }
}

export async function parseRecipe(text: string): Promise<Recipe> {
  try {
    const { client, model } = createLLM();

    console.log('[LLM] Starting recipe parsing...');

    const completion = await client.beta.chat.completions.parse({
      model,
      messages: [
        {
          role: 'system',
          content: RECIPE_EXTRACTION_PROMPT
        },
        {
          role: 'user',
          content: `Extract the recipe from this text:\n\n${text}`
        }
      ],
      response_format: zodResponseFormat(RecipeSchema, 'recipe'),
      temperature: 0.3
    });

    const recipe = completion.choices[0].message.parsed;
    console.log('[LLM] Parsed recipe:', recipe?.name);

    if (!recipe || !recipe.name) {
      throw new Error('Failed to extract recipe - missing name');
    }

    return recipe;
  } catch (e) {
    console.error('[LLM] Recipe parsing error:', e);

    // Fallback to standard completion if structured output fails
    if ((e as any).message?.includes('response_format') ||
        (e as any).message?.includes('structured output')) {
      console.warn('[LLM] Falling back to standard completion');
      return await parseRecipeWithStandardCompletion(text);
    }

    throw new Error(`Failed to parse recipe: ${(e as Error).message}`);
  }
}

// ... parseRecipeWithStandardCompletion implementation ...
```
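
`detectRecipe()` treats any reply containing the substring "yes" as positive, which would also match a word like "yesterday" or a hedged reply that begins with "no". A stricter interpretation of the reply is sketched below as a possible refinement; `parseYesNo` is a hypothetical helper, and returning `null` for unclear replies is a design assumption, not existing behavior.

```typescript
// Interprets the detector's reply by its first word. Returns null when the
// reply is neither a clear "yes" nor a clear "no", so the caller can decide
// whether to retry, log, or default to false.
function parseYesNo(reply: string | null | undefined): boolean | null {
  const firstWord = (reply ?? '').trim().toLowerCase().match(/^[a-z]+/);
  if (!firstWord) return null;
  if (firstWord[0] === 'yes') return true;
  if (firstWord[0] === 'no') return false;
  return null;
}
```

Keeping the "unclear" case separate from "no" makes it possible to log replies where a small model ignored the output instruction, which helps when comparing models for compatibility.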

**Files Created:**
- [src/lib/server/prompts/recipe-extraction.ts](src/lib/server/prompts/recipe-extraction.ts) - Versioned prompts

**Files Modified:**
- [src/lib/server/parser.ts](src/lib/server/parser.ts) - Use new prompts

---

## Technical Architecture

### Hexagonal Architecture Compliance

**Domain Layer** (Core Business Logic)
- `Recipe` type definition
- Extraction and parsing interfaces
- No changes needed - already well-separated

**Application Layer** (Use Cases)
- `extractTextAndThumbnail()` - Extraction orchestration
- `extractRecipe()` - Recipe detection and parsing workflow
- Enhanced with better error handling and logging

**Adapter Layer** (External Interfaces)

**Primary Adapters** (Driving - UI):
- `/share/+page.svelte` - Refactored with snippets (Presentation)
- `/api/extract-stream/+server.ts` - SSE endpoint (HTTP Adapter)

**Secondary Adapters** (Driven - Infrastructure):
- `llm.ts` - OpenAI/LM Studio client (LLM Adapter)
- `browser.ts` - Playwright browser (Browser Adapter)
- `extraction.ts` - Instagram scraping (Web Scraping Adapter)

**Dependency Flow:**
```
UI (Svelte) → API Endpoint → Use Case → Domain ← LLM Adapter
                                               ← Browser Adapter
```

All dependencies point inward toward the domain. External systems (LLM, Browser) are accessed via ports (interfaces).
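
To make the port idea concrete, the sketch below shows a use case that depends only on an interface, with a fake adapter standing in for the LLM. All names here (`RecipeParserPort`, `extractRecipeUseCase`, `fakeLlm`) are hypothetical, and the simplified `Recipe` type is an assumption rather than the project's actual schema.

```typescript
type Recipe = { name: string; steps: string[] };

// Port: what the use case needs from any LLM adapter (hypothetical name).
interface RecipeParserPort {
  detect(text: string): Promise<boolean>;
  parse(text: string): Promise<Recipe>;
}

// Use case: depends only on the port, never on the OpenAI SDK directly.
async function extractRecipeUseCase(
  text: string,
  llm: RecipeParserPort
): Promise<Recipe | null> {
  if (!(await llm.detect(text))) return null;
  return llm.parse(text);
}

// Fake adapter for tests; a real adapter would wrap createLLM().
const fakeLlm: RecipeParserPort = {
  detect: async (text) => text.indexOf('Ingredients') !== -1,
  parse: async () => ({ name: 'Pasta al Pomodoro', steps: ['1. Cuocere la pasta'] })
};
```

The same fake adapter can serve the "mock LLM" integration tests, since the use case cannot tell it apart from the LM Studio adapter.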

---

## Dependencies & Prerequisites

### Required Tools
- Node.js 18+ (current: using Svelte 5)
- LM Studio running at `http://192.168.1.10:1234` (current config)
- Playwright browsers installed

### Environment Variables
```bash
OPENAI_BASE_URL=http://192.168.1.10:1234/v1
OPENAI_API_KEY=ollama
LLM_MODEL=google/gemma-3-4b # or compatible alternative
```

### Package Dependencies
- `svelte@^5.43.8` - Snippets support ✅
- `openai@^4.20.0` - LLM client ✅
- `playwright@^1.56.1` - Browser automation ✅
- `zod@^3.23.0` - Schema validation ✅

---

## Risk Assessment

### High Risk
1. **LLM Model Compatibility**
   - `google/gemma-3-4b` may not support structured output
   - **Mitigation:** Implement fallback to standard completion API
   - **Testing:** Verify with multiple models

2. **Network Connectivity**
   - LM Studio may not be accessible from Docker container
   - **Mitigation:** Add health check endpoint, document network requirements
   - **Testing:** Test both Docker and local environments

### Medium Risk
1. **Svelte 5 Snippets Learning Curve**
   - Developers may be unfamiliar with new syntax
   - **Mitigation:** Comprehensive documentation in code
   - **Testing:** Peer review of refactored components

2. **Prompt Regression**
   - New prompt may perform worse on edge cases
   - **Mitigation:** A/B test with sample Instagram posts
   - **Testing:** Unit tests with diverse recipe samples

### Low Risk
1. **SSE Stream Breaking Changes**
   - Adding await might change timing
   - **Mitigation:** Thorough manual testing
   - **Testing:** E2E tests with real Instagram URLs

---

## Testing Strategy

### Unit Tests
- [ ] Test each Svelte snippet in isolation
- [ ] Mock LLM responses for parser tests
- [ ] Test prompt with diverse social media samples
- [ ] Test unit conversion logic
- [ ] Test Italian translation accuracy

### Integration Tests
- [ ] Test full extraction pipeline with mock LLM
- [ ] Test SSE stream with progress events
- [ ] Test error handling and fallbacks
- [ ] Test Tandoor integration with recipe card

### Manual Testing Checklist
- [ ] Extract recipe from clean Instagram post
- [ ] Extract recipe from noisy social media post (emojis, hashtags)
- [ ] Extract recipe with imperial units (cups, °F)
- [ ] Extract recipe with partial data (missing servings)
- [ ] Test with LM Studio down (error handling)
- [ ] Test with incompatible model (fallback)
- [ ] Verify Italian translation quality
- [ ] Verify SI unit conversions
- [ ] Test responsive design on mobile

### Performance Testing
- [ ] Measure LLM response time
- [ ] Measure SSE stream latency
- [ ] Test with slow network conditions

---

## Documentation Updates

### Code Documentation
- [x] JSDoc comments for all new functions
- [x] Inline comments explaining complex logic
- [x] Prompt versioning with changelog
- [x] TypeScript types for all interfaces

### User Documentation
- [ ] Update README with LM Studio setup instructions
- [ ] Document troubleshooting steps for LLM errors
- [ ] Add example Instagram URLs for testing

### Developer Documentation
- [ ] Document Svelte 5 snippets pattern
- [ ] Document prompt engineering decisions
- [ ] Document fallback strategies

---

## Rollout Plan

### Phase 1: Backend Fixes (Critical)
1. Fix SSE await bug
2. Add comprehensive logging
3. Implement fallback completion API
4. Test with LM Studio

**Success Criteria:** Recipe extraction works end-to-end

### Phase 2: Prompt Enhancement
1. Implement new prompt in prompts/ directory
2. A/B test with sample posts
3. Iterate based on results
4. Deploy to production

**Success Criteria:** Recipe extraction handles social media noise

### Phase 3: Frontend Refactor
1. Create snippets for each component section
2. Refactor share page
3. Test UI functionality
4. Deploy

**Success Criteria:** All features work, code is maintainable

---

## Success Metrics

### Functional Metrics
- ✅ LLM receives API calls (verified in logs)
- ✅ Recipe extraction success rate > 90%
- ✅ All unit tests pass
- ✅ Zero regression in existing functionality

### Code Quality Metrics
- ✅ Share page component < 150 lines
- ✅ Each snippet < 50 lines
- ✅ All functions have type annotations
- ✅ Code coverage > 80%

### User Experience Metrics
- ✅ Extraction completes in < 15 seconds
- ✅ Progress updates appear in < 1 second
- ✅ Error messages are clear and actionable

---

## Open Questions

1. **LLM Model Selection**
   - Q: Should we test alternative models beyond google/gemma-3-4b?
   - A: Yes, document tested models and compatibility

2. **Snippet vs Full Components**
   - Q: Should snippets become separate .svelte files?
   - A: No, keep as snippets for simplicity. Migrate later if reused elsewhere.

3. **Prompt Versioning**
   - Q: How should we version and test prompts over time?
   - A: Use semantic versioning in file, track performance metrics

4. **Docker Networking**
   - Q: How to make LM Studio accessible from Docker?
   - A: Document host network mode or use host.docker.internal

---

## Next Steps

1. **Review this plan** with stakeholders
2. **Prioritize stories** based on impact
3. **Assign to @dev agent** for implementation
4. **Set up monitoring** for LLM calls and success rates

---

## References

- [Svelte 5 Snippets Documentation](https://svelte.dev/docs/svelte/snippet)
- [OpenAI SDK Documentation](https://platform.openai.com/docs/api-reference)
- [Hexagonal Architecture Guide](.system/abstract_architecture.md)
- [LM Studio API Compatibility](https://lmstudio.ai/docs/api)

---

**Plan Status:** Ready for Implementation
**Estimated Effort:** 8-12 hours
**Priority:** High (Blocking user functionality)