feat(parser): remove step number prefixes from recipe extraction

- Update RECIPE_EXTRACTION_PROMPT to v2.1
- Remove instruction to number steps sequentially
- Update OUTPUT FORMAT and both few-shot examples
- Remove 'All steps numbered sequentially' from quality checklist
- Update fallback parser system prompt in parseRecipeWithStandardCompletion
- Frontend <ol> element already handles auto-numbering
- Tandoor integration unaffected (uses array index for step numbers)

Fixes double-numbering bug where steps appeared as '1. 1. Step text'

All 34 tests passing

Implementation follows execution plan in docs/plans/RemoveStepNumberPrefixes.md
Documented in docs/outcomes/RemoveStepNumberPrefixes.md

docs/plans/RemoveStepNumberPrefixes.md

# Execution Plan: Remove Step Number Prefixes from Recipe Parsing

**Outcome Name:** RemoveStepNumberPrefixes
**Created:** 2025-12-21
**Analyst:** Vi (Analyst Agent)

---

## Problem Statement

The current recipe parsing system instructs the LLM to number all steps sequentially (e.g., "1. First step", "2. Second step"). However, the frontend displays steps using an HTML ordered list (`<ol class="list-decimal">`), which automatically adds numbering. This creates **redundant double-numbering** in the UI:

```
1. 1. Preriscaldare il forno a 190°C
2. 2. Mescolare farina e bicarbonato di sodio
3. 3. Montare burro e zucchero a crema
```

The LLM should provide clean step text without number prefixes, allowing the frontend to handle numbering presentation.

---

## Current State Analysis

### Affected Components

1. **LLM Prompt System** ([src/lib/server/prompts/recipe-extraction.ts](../../../src/lib/server/prompts/recipe-extraction.ts))
   - `RECIPE_EXTRACTION_PROMPT` explicitly instructs: "Number all steps sequentially starting with '1.'"
   - Few-shot examples show numbered steps
   - Quality checklist validates numbered steps

2. **Fallback Parser** ([src/lib/server/parser.ts](../../../src/lib/server/parser.ts))
   - `parseRecipeWithStandardCompletion()` system prompt includes: `"steps": ["1. First step", "2. Second step", ...]`
   - Used when structured output fails

3. **Frontend Display** ([src/routes/share/components/RecipeCard.svelte](../../../src/routes/share/components/RecipeCard.svelte))
   - Uses `<ol class="list-decimal">` which auto-numbers list items
   - No changes needed - already correct

4. **Tandoor Integration** ([src/lib/server/tandoor.ts](../../../src/lib/server/tandoor.ts))
   - Maps steps to `{ instruction, order: index, ingredients: [...] }`
   - Step number derived from array index, not instruction text
   - No changes needed

### Root Cause

The LLM prompt was designed before the frontend was refactored to use Svelte 5 with proper semantic HTML (`<ol>`). The prompt instructions were never updated to reflect that numbering is now a presentation concern, not a data concern.

---

## Desired State

### LLM Output (After)

```json
{
  "steps": [
    "Preriscaldare il forno a 190°C",
    "Mescolare farina e bicarbonato di sodio",
    "Montare burro e zucchero a crema",
    "Aggiungere le uova",
    "Incorporare le gocce di cioccolato",
    "Cuocere per 10 minuti"
  ]
}
```

### Frontend Rendering

```html
<ol class="list-decimal pl-5 text-sm">
  <li>Preriscaldare il forno a 190°C</li>
  <li>Mescolare farina e bicarbonato di sodio</li>
  <li>Montare burro e zucchero a crema</li>
  <li>Aggiungere le uova</li>
  <li>Incorporare le gocce di cioccolato</li>
  <li>Cuocere per 10 minuti</li>
</ol>
```

**Result:** Clean, single numbering (1., 2., 3., etc.) without redundancy.

---

## Architecture Alignment

This change aligns with **Hexagonal Architecture** principles:

- **Separation of Concerns:** Data extraction (core domain) is separated from presentation logic (UI adapter)
- **Domain Purity:** The LLM extracts semantic content (steps), not formatted text
- **Adapter Independence:** Frontend can change numbering style (decimal, roman, etc.) without touching the core

---

## User Stories

### Story 1: Update Main LLM Extraction Prompt

**As a** system architect
**I want** the LLM extraction prompt to produce clean, unnumbered step instructions
**So that** the frontend can control step numbering presentation

#### Acceptance Criteria

- [ ] `RECIPE_EXTRACTION_PROMPT` removes instruction: "Number all steps sequentially starting with '1.'"
- [ ] Few-shot examples updated to show clean steps without number prefixes
- [ ] Quality checklist removes: "All steps numbered sequentially"
- [ ] Version updated to v2.1 with changelog entry

#### Technical Specification

**File:** [src/lib/server/prompts/recipe-extraction.ts](../../../src/lib/server/prompts/recipe-extraction.ts)

**Changes Required:**

1. **Remove Numbering Instruction** (Line ~206)
   - **Before:** `- Number all steps sequentially starting with "1."`
   - **After:** *(remove this line)*

2. **Update Quality Checklist** (Line ~211)
   - **Before:** `- [ ] All steps numbered sequentially`
   - **After:** *(remove this line)*

3. **Update Few-Shot Example 1** (Lines ~131-143)
   - **Before:**
     ```json
     "steps": [
       "1. Preriscaldare il forno a 190°C",
       "2. Mescolare farina e bicarbonato di sodio",
       "3. Montare burro e zucchero a crema",
       "4. Aggiungere le uova",
       "5. Incorporare le gocce di cioccolato",
       "6. Cuocere per 10 minuti"
     ]
     ```
   - **After:**
     ```json
     "steps": [
       "Preriscaldare il forno a 190°C",
       "Mescolare farina e bicarbonato di sodio",
       "Montare burro e zucchero a crema",
       "Aggiungere le uova",
       "Incorporare le gocce di cioccolato",
       "Cuocere per 10 minuti"
     ]
     ```

4. **Update Few-Shot Example 2** (Lines ~176-183)
   - **Before:**
     ```json
     "steps": [
       "1. Tritare il salmone affumicato",
       "2. Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
       "3. Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
       "4. Aggiungere la panna, il pepe e il concentrato di pomodoro",
       "5. Cuocere la pasta al dente e ultimare la cottura in padella"
     ]
     ```
   - **After:**
     ```json
     "steps": [
       "Tritare il salmone affumicato",
       "Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
       "Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
       "Aggiungere la panna, il pepe e il concentrato di pomodoro",
       "Cuocere la pasta al dente e ultimare la cottura in padella"
     ]
     ```

5. **Update OUTPUT FORMAT Section** (Lines ~99-109)
   - **Before:**
     ```json
     "steps": [
       "1. Primo passaggio dettagliato",
       "2. Secondo passaggio dettagliato"
     ]
     ```
   - **After:**
     ```json
     "steps": [
       "Primo passaggio dettagliato",
       "Secondo passaggio dettagliato"
     ]
     ```

6. **Update Version Changelog** (Lines ~3-6)
   - **Before:**
     ```typescript
     /**
      * Recipe Extraction System Prompts - Version 2.0
      *
      * Changelog:
      * - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
      * - v1.0 (2024): Initial version with Italian translation and SI conversion
      */
     ```
   - **After:**
     ```typescript
     /**
      * Recipe Extraction System Prompts - Version 2.1
      *
      * Changelog:
      * - v2.1 (2025-12-21): Removed step number prefixes (now handled by frontend <ol>)
      * - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
      * - v1.0 (2024): Initial version with Italian translation and SI conversion
      */
     ```

#### Testing Strategy

1. **Manual Test:** Extract a recipe and verify steps don't have "1. ", "2. " prefixes
2. **Visual Verification:** Confirm frontend displays proper numbering (not double-numbered)
3. **LLM Response Check:** Inspect raw LLM JSON to confirm clean steps
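
The LLM response check could be automated with a small helper along these lines. This is a minimal sketch, not code from the repository: `findPrefixedSteps` and `STEP_PREFIX` are hypothetical names, and the pattern assumes a prefix looks like `1. ` or `2) `.

```typescript
// Hypothetical helper (not part of the codebase): matches a leading
// list-style number prefix such as "1. " or "2) " at the start of a step.
const STEP_PREFIX = /^\s*\d+[.)]\s+/;

// Returns the steps that still start with a number prefix.
// An empty array means the raw LLM output is clean.
function findPrefixedSteps(steps: string[]): string[] {
  return steps.filter((step) => STEP_PREFIX.test(step));
}
```

An empty result for a parsed recipe would suggest the prompt change is taking effect. Step text such as `10 minuti in forno` or `1.5 kg di farina` is not flagged, because the leading digits are not followed by `.` or `)` plus whitespace.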

---

### Story 2: Update Fallback Parser Prompt

**As a** system architect
**I want** the fallback parser to produce the same step format as the main parser
**So that** both structured and standard completions behave identically

#### Acceptance Criteria

- [ ] `parseRecipeWithStandardCompletion()` system prompt updated to remove step numbering instruction
- [ ] Fallback parser produces steps without number prefixes
- [ ] Behavior matches main parser

#### Technical Specification

**File:** [src/lib/server/parser.ts](../../../src/lib/server/parser.ts)

**Changes Required:**

1. **Update System Prompt** (Lines ~143-150)
   - **Before:**
     ```typescript
     content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
     {
       "name": "recipe name in Italian",
       "servings": number or null,
       "description": "description in Italian or null",
       "ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
       "steps": ["1. First step", "2. Second step", ...]
     }
     ```
   - **After:**
     ```typescript
     content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
     {
       "name": "recipe name in Italian",
       "servings": number or null,
       "description": "description in Italian or null",
       "ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
       "steps": ["First step", "Second step", ...]
     }
     ```

#### Testing Strategy

1. **Force Fallback:** Temporarily break `beta.chat.completions.parse()` to trigger fallback
2. **Verify Output:** Check that fallback produces clean steps
3. **Integration Test:** Ensure recipe extraction works end-to-end with fallback
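
One way to exercise the fallback path without editing the SDK call is to inject the two parsing paths and make the primary one fail. The sketch below assumes that shape; `extractWithFallback`, `Extractor`, and `RecipeSketch` are hypothetical names for illustration, and the real `parseRecipeWithStandardCompletion()` signature may differ.

```typescript
// Hypothetical shapes for illustration; the real recipe type has more fields.
interface RecipeSketch {
  name: string;
  steps: string[];
}

type Extractor = (text: string) => Promise<RecipeSketch>;

// Tries the structured-output path first and falls back to the standard
// completion path if it throws, reporting which path produced the result.
async function extractWithFallback(
  text: string,
  structured: Extractor,
  fallback: Extractor
): Promise<{ recipe: RecipeSketch; usedFallback: boolean }> {
  try {
    return { recipe: await structured(text), usedFallback: false };
  } catch {
    return { recipe: await fallback(text), usedFallback: true };
  }
}

// Stubs standing in for the two parsers in a test.
const failingStructured: Extractor = async () => {
  throw new Error("structured output unavailable");
};
const stubFallback: Extractor = async () => ({
  name: "Biscotti",
  steps: ["Preriscaldare il forno a 190°C", "Cuocere per 10 minuti"],
});
```

Calling `extractWithFallback(text, failingStructured, stubFallback)` resolves with `usedFallback: true`, so a test can then check that the returned steps carry no number prefixes.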

---

### Story 3: Verify Frontend and Tandoor Integration

**As a** QA engineer
**I want** to confirm existing components work correctly with clean step data
**So that** no regressions are introduced

#### Acceptance Criteria

- [ ] RecipeCard displays steps with single numbering (1., 2., 3.)
- [ ] Tandoor import successfully creates recipe with correct step numbering
- [ ] No visual regressions in step display

#### Technical Specification

**Files to Verify (No Changes Needed):**

1. **[src/routes/share/components/RecipeCard.svelte](../../../src/routes/share/components/RecipeCard.svelte)** (Lines 40-45)
   ```svelte
   <ol class="list-decimal pl-5 text-sm">
     {#each recipe.steps as step}
       <li>{step}</li>
     {/each}
   </ol>
   ```
   - `<ol>` numbers its items automatically; the `list-decimal` class only sets the marker style
   - No code changes needed

2. **[src/lib/server/tandoor.ts](../../../src/lib/server/tandoor.ts)** (Lines 231-260)
   ```typescript
   const steps: TandoorRecipeDTO['steps'] = (recipe.steps || []).map((instruction, index) => {
     return {
       instruction,
       order: index,
       ingredients: mappedIngredients
     };
   });
   ```
   - Step number comes from `index`, not instruction text
   - No code changes needed

#### Testing Strategy

1. **Frontend Test:**
   - Extract a recipe
   - Verify steps display as "1. Step text", "2. Step text" (not "1. 1. Step text")
   - Check that step numbering is visually correct

2. **Tandoor Test:**
   - Extract a recipe
   - Import to Tandoor
   - Verify Tandoor recipe shows correctly numbered steps
   - Confirm no parsing errors

3. **Visual Regression:**
   - Compare before/after screenshots
   - Ensure no layout changes except removal of duplicate numbers

---

### Story 4: Update Tests and Documentation

**As a** developer
**I want** tests and docs to reflect the new step format
**So that** future contributors understand the expected behavior

#### Acceptance Criteria

- [ ] No tests fail due to expecting numbered steps
- [ ] If test fixtures exist, they are updated to the clean format
- [ ] Changelog documents the change

#### Technical Specification

**Files to Check:**

1. **[src/tests/sse-extraction.spec.ts](../../../src/tests/sse-extraction.spec.ts)**
   - Line 78 has `steps: []` (empty, no impact)
   - No changes needed

2. **Other Test Files:**
   - [src/tests/scheduler.integration.spec.ts](../../../src/tests/scheduler.integration.spec.ts)
   - [src/tests/scheduler.spec.ts](../../../src/tests/scheduler.spec.ts)
   - [src/routes/page.svelte.spec.ts](../../../src/routes/page.svelte.spec.ts)
   - [src/demo.spec.ts](../../../src/demo.spec.ts)
   - Grep search shows no hardcoded step expectations

#### Testing Strategy

1. **Run Test Suite:**
   ```bash
   npm test
   ```
   - Verify all tests pass
   - No test changes are expected

2. **Update Documentation:**
   - Prompt changelog already updated in Story 1
   - No other documentation changes needed

---

## Dependencies and Risks

### Dependencies

- **None:** This change is self-contained within the LLM prompt layer

### Risks

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| LLM still adds numbers despite prompt change | Low | Medium | Test with multiple recipes; adjust prompt wording if needed |
| Existing recipes in DB have numbered steps | N/A | None | Recipes are extracted fresh each time, not stored |
| Tandoor integration breaks | Very Low | Medium | Tandoor uses array index for numbering, not text parsing |
| Frontend numbering breaks | Very Low | High | `<ol>` is standard HTML; CSS controls numbering style |
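
If the first risk in the table materializes, one possible safeguard beyond rewording the prompt is to normalize steps after parsing. The following is only a sketch of that idea and is not something this plan calls for; `stripStepPrefixes` and `LEADING_NUMBER` are hypothetical names.

```typescript
// Hypothetical post-processing helper: matches one leading list-style
// number prefix ("1. " or "2) ") at the start of a step.
const LEADING_NUMBER = /^\s*\d+[.)]\s+/;

// Removes a single leading number prefix from each step, if present,
// and trims surrounding whitespace. Steps without a prefix pass through.
function stripStepPrefixes(steps: string[]): string[] {
  return steps.map((step) => step.replace(LEADING_NUMBER, "").trim());
}
```

Because the pattern requires `.` or `)` followed by whitespace, a step that begins with a quantity such as `200 g di farina` is left unchanged.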

### Rollback Plan

If issues arise:
1. Revert prompt changes (all in one file: `recipe-extraction.ts`)
2. Revert parser.ts system prompt change
3. No database or infrastructure changes to revert

---

## Implementation Checklist

- [ ] **Story 1:** Update `RECIPE_EXTRACTION_PROMPT` in [recipe-extraction.ts](../../../src/lib/server/prompts/recipe-extraction.ts)
  - [ ] Remove numbering instruction
  - [ ] Update few-shot examples (2 instances)
  - [ ] Update OUTPUT FORMAT template
  - [ ] Remove quality checklist item
  - [ ] Update version to v2.1 with changelog
- [ ] **Story 2:** Update `parseRecipeWithStandardCompletion` in [parser.ts](../../../src/lib/server/parser.ts)
  - [ ] Modify system prompt schema example
- [ ] **Story 3:** Verify integrations
  - [ ] Test RecipeCard rendering
  - [ ] Test Tandoor import
  - [ ] Visual regression check
- [ ] **Story 4:** Validate tests
  - [ ] Run `npm test`
  - [ ] Confirm no regressions

---

## Success Metrics

1. **Visual Correctness:** Steps display with single numbering (1., 2., 3.) in RecipeCard
2. **LLM Compliance:** Raw LLM output contains steps without "1. ", "2. " prefixes
3. **Tandoor Integration:** Recipe imports successfully with correct step ordering
4. **Test Pass Rate:** 100% of existing tests pass without modification

---

## Outcome File

Upon completion, create `docs/outcomes/RemoveStepNumberPrefixes.md` documenting:
- Implementation details
- Test results
- Before/after screenshots
- Any deviations from plan