insta-recipe/docs/plans/RemoveStepNumberPrefixes.md
Giancarmine Salucci f5a1089936 feat(parser): remove step number prefixes from recipe extraction
- Update RECIPE_EXTRACTION_PROMPT to v2.1
- Remove instruction to number steps sequentially
- Update OUTPUT FORMAT and both few-shot examples
- Remove 'All steps numbered sequentially' from quality checklist
- Update fallback parser system prompt in parseRecipeWithStandardCompletion
- Frontend <ol> element already handles auto-numbering
- Tandoor integration unaffected (uses array index for step numbers)

Fixes double-numbering bug where steps appeared as '1. 1. Step text'
All 34 tests passing

Implementation follows execution plan in docs/plans/RemoveStepNumberPrefixes.md
Documented in docs/outcomes/RemoveStepNumberPrefixes.md
2025-12-21 04:46:38 +01:00

Execution Plan: Remove Step Number Prefixes from Recipe Parsing

Outcome Name: RemoveStepNumberPrefixes
Created: 2025-12-21
Analyst: Vi (Analyst Agent)


Problem Statement

The current recipe parsing system instructs the LLM to number all steps sequentially (e.g., "1. First step", "2. Second step"). However, the frontend displays steps using an HTML ordered list (<ol class="list-decimal">), which automatically adds numbering. This creates redundant double-numbering in the UI:

1. 1. Preriscaldare il forno a 190°C
2. 2. Mescolare farina e bicarbonato di sodio
3. 3. Montare burro e zucchero a crema

The LLM should provide clean step text without number prefixes, allowing the frontend to handle numbering presentation.
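
The mechanism behind the double numbering can be reproduced in a few lines. This is a minimal sketch, not code from the repository: `renderOrderedList` is a hypothetical stand-in for what the browser does when it renders an ordered list, namely prepending the item position to each entry.

```typescript
// Hypothetical stand-in for the browser's <ol> rendering:
// every list item gets "<position>. " prepended.
function renderOrderedList(steps: string[]): string[] {
  return steps.map((step, index) => `${index + 1}. ${step}`);
}

// Steps as the old prompt asked the LLM to produce them.
const numberedSteps = [
  "1. Preriscaldare il forno a 190°C",
  "2. Mescolare farina e bicarbonato di sodio",
];

// Steps as the updated prompt should produce them.
const cleanSteps = [
  "Preriscaldare il forno a 190°C",
  "Mescolare farina e bicarbonato di sodio",
];

const renderedBefore = renderOrderedList(numberedSteps);
const renderedAfter = renderOrderedList(cleanSteps);

console.log(renderedBefore[0]); // "1. 1. Preriscaldare il forno a 190°C"
console.log(renderedAfter[0]); // "1. Preriscaldare il forno a 190°C"
```

The data and the presentation layer each add one number, so the fix only has to remove one of the two sources.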


Current State Analysis

Affected Components

  1. LLM Prompt System (src/lib/server/prompts/recipe-extraction.ts)

    • RECIPE_EXTRACTION_PROMPT explicitly instructs: "Number all steps sequentially starting with '1.'"
    • Few-shot examples show numbered steps
    • Quality checklist validates numbered steps
  2. Fallback Parser (src/lib/server/parser.ts)

    • parseRecipeWithStandardCompletion() system prompt includes: "steps": ["1. First step", "2. Second step", ...]
    • Used when structured output fails
  3. Frontend Display (src/routes/share/components/RecipeCard.svelte)

    • Uses <ol class="list-decimal"> which auto-numbers list items
    • No changes needed - already correct
  4. Tandoor Integration (src/lib/server/tandoor.ts)

    • Maps steps to { instruction, order: index, ingredients: [...] }
    • Step number derived from array index, not instruction text
    • No changes needed

Root Cause

The LLM prompt was designed before the frontend was refactored to use Svelte 5 with proper semantic HTML (<ol>). The prompt instructions were never updated to reflect that numbering is now a presentation concern, not a data concern.


Desired State

LLM Output (After)

{
  "steps": [
    "Preriscaldare il forno a 190°C",
    "Mescolare farina e bicarbonato di sodio",
    "Montare burro e zucchero a crema",
    "Aggiungere le uova",
    "Incorporare le gocce di cioccolato",
    "Cuocere per 10 minuti"
  ]
}

Frontend Rendering

<ol class="list-decimal pl-5 text-sm">
  <li>Preriscaldare il forno a 190°C</li>
  <li>Mescolare farina e bicarbonato di sodio</li>
  <li>Montare burro e zucchero a crema</li>
  <li>Aggiungere le uova</li>
  <li>Incorporare le gocce di cioccolato</li>
  <li>Cuocere per 10 minuti</li>
</ol>

Result: Clean, single numbering (1., 2., 3., etc.) without redundancy.


Architecture Alignment

This change aligns with Hexagonal Architecture principles:

  • Separation of Concerns: Data extraction (core domain) is separated from presentation logic (UI adapter)
  • Domain Purity: The LLM extracts semantic content (steps), not formatted text
  • Adapter Independence: Frontend can change numbering style (decimal, roman, etc.) without touching the core
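
The adapter-independence point can be illustrated with a small sketch. Everything here is hypothetical (`NumberingStyle`, `toRoman`, and `formatSteps` do not exist in the codebase); the real frontend delegates numbering to the browser through `<ol>`. The sketch only shows that once steps are stored without prefixes, the numbering style becomes a presentation-side parameter.

```typescript
type NumberingStyle = "decimal" | "upper-roman";

// Convert a positive integer to an upper-case Roman numeral.
function toRoman(n: number): string {
  const table: Array<[number, string]> = [
    [1000, "M"], [900, "CM"], [500, "D"], [400, "CD"],
    [100, "C"], [90, "XC"], [50, "L"], [40, "XL"],
    [10, "X"], [9, "IX"], [5, "V"], [4, "IV"], [1, "I"],
  ];
  let remaining = n;
  let result = "";
  for (const [value, symbol] of table) {
    while (remaining >= value) {
      result += symbol;
      remaining -= value;
    }
  }
  return result;
}

// Hypothetical presentation adapter: numbering is applied here, never in the data.
function formatSteps(steps: string[], style: NumberingStyle): string[] {
  return steps.map((step, index) => {
    const position = index + 1;
    const label = style === "decimal" ? String(position) : toRoman(position);
    return `${label}. ${step}`;
  });
}

const coreSteps = ["Aggiungere le uova", "Cuocere per 10 minuti"];
console.log(formatSteps(coreSteps, "decimal")); // ["1. Aggiungere le uova", "2. Cuocere per 10 minuti"]
console.log(formatSteps(coreSteps, "upper-roman")); // ["I. Aggiungere le uova", "II. Cuocere per 10 minuti"]
```

With numbered step text, switching to Roman numerals would have produced "I. 1. Aggiungere le uova", which is why the prefix has to leave the data.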

User Stories

Story 1: Update Main LLM Extraction Prompt

As a system architect
I want the LLM extraction prompt to produce clean, unnumbered step instructions
So that the frontend can control step numbering presentation

Acceptance Criteria

  • RECIPE_EXTRACTION_PROMPT removes instruction: "Number all steps sequentially starting with '1.'"
  • Few-shot examples updated to show clean steps without number prefixes
  • Quality checklist removes: "All steps numbered sequentially"
  • Version updated to v2.1 with changelog entry

Technical Specification

File: src/lib/server/prompts/recipe-extraction.ts

Changes Required:

  1. Remove Numbering Instruction (Line ~206)

    • Before: - Number all steps sequentially starting with "1."
    • After: (remove this line)
  2. Update Quality Checklist (Line ~211)

    • Before: - [ ] All steps numbered sequentially
    • After: (remove this line)
  3. Update Few-Shot Example 1 (Lines ~131-143)

    • Before:
      "steps": [
        "1. Preriscaldare il forno a 190°C",
        "2. Mescolare farina e bicarbonato di sodio",
        "3. Montare burro e zucchero a crema",
        "4. Aggiungere le uova",
        "5. Incorporare le gocce di cioccolato",
        "6. Cuocere per 10 minuti"
      ]
      
    • After:
      "steps": [
        "Preriscaldare il forno a 190°C",
        "Mescolare farina e bicarbonato di sodio",
        "Montare burro e zucchero a crema",
        "Aggiungere le uova",
        "Incorporare le gocce di cioccolato",
        "Cuocere per 10 minuti"
      ]
      
  4. Update Few-Shot Example 2 (Lines ~176-183)

    • Before:
      "steps": [
        "1. Tritare il salmone affumicato",
        "2. Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
        "3. Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
        "4. Aggiungere la panna, il pepe e il concentrato di pomodoro",
        "5. Cuocere la pasta al dente e ultimare la cottura in padella"
      ]
      
    • After:
      "steps": [
        "Tritare il salmone affumicato",
        "Sciogliere il burro e aggiungere lo scalogno tritato, far andare per qualche minuto",
        "Sfumare con il vino e aggiungere il salmone, cuocere un paio di minuti",
        "Aggiungere la panna, il pepe e il concentrato di pomodoro",
        "Cuocere la pasta al dente e ultimare la cottura in padella"
      ]
      
  5. Update OUTPUT FORMAT Section (Lines ~99-109)

    • Before:
      "steps": [
        "1. Primo passaggio dettagliato",
        "2. Secondo passaggio dettagliato"
      ]
      
    • After:
      "steps": [
        "Primo passaggio dettagliato",
        "Secondo passaggio dettagliato"
      ]
      
  6. Update Version Changelog (Lines ~3-6)

    • Before:
      /**
       * Recipe Extraction System Prompts - Version 2.0
       * 
       * Changelog:
       * - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
       * - v1.0 (2024): Initial version with Italian translation and SI conversion
       */
      
    • After:
      /**
       * Recipe Extraction System Prompts - Version 2.1
       * 
       * Changelog:
       * - v2.1 (2025-12-21): Removed step number prefixes (now handled by frontend <ol>)
       * - v2.0 (2025-12-21): Added social media handling, few-shot examples, partial recipe support
       * - v1.0 (2024): Initial version with Italian translation and SI conversion
       */
      

Testing Strategy

  1. Manual Test: Extract a recipe and verify steps don't have "1. ", "2. " prefixes
  2. Visual Verification: Confirm frontend displays proper numbering (not double-numbered)
  3. LLM Response Check: Inspect raw LLM JSON to confirm clean steps
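
The LLM response check could be partly automated. The following is a sketch under assumptions: `findPrefixedSteps` and the `STEP_PREFIX` pattern are hypothetical helpers, and the set of markers an LLM might emit ("1. ", "2) ") is a guess rather than something observed in this project.

```typescript
// Matches a leading list marker such as "1. " or "12) ".
// Which markers an LLM might emit is an assumption.
const STEP_PREFIX = /^\s*\d+[.)]\s+/;

// Returns the steps that still start with a number prefix.
function findPrefixedSteps(steps: string[]): string[] {
  return steps.filter((step) => STEP_PREFIX.test(step));
}

// Example raw response in which the second step still carries a prefix.
const rawLlmJson =
  '{"steps": ["Preriscaldare il forno a 190°C", "2. Mescolare farina e bicarbonato di sodio"]}';
const parsedResponse = JSON.parse(rawLlmJson) as { steps: string[] };

const offenders = findPrefixedSteps(parsedResponse.steps);
console.log(offenders); // ["2. Mescolare farina e bicarbonato di sodio"]
```

The pattern requires whitespace after the marker, so a step that legitimately starts with a quantity such as "1.5 kg di farina" is not flagged.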

Story 2: Update Fallback Parser Prompt

As a system architect
I want the fallback parser to produce consistent step format
So that both structured and standard completions behave identically

Acceptance Criteria

  • parseRecipeWithStandardCompletion() system prompt updated to remove step numbering instruction
  • Fallback parser produces steps without number prefixes
  • Behavior matches main parser

Technical Specification

File: src/lib/server/parser.ts

Changes Required:

  1. Update System Prompt (Lines ~143-150)
    • Before:
      content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
      {
        "name": "recipe name in Italian",
        "servings": number or null,
        "description": "description in Italian or null",
        "ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
        "steps": ["1. First step", "2. Second step", ...]
      }
      
    • After:
      content: `You are a recipe extractor. Return ONLY valid JSON matching this schema:
      {
        "name": "recipe name in Italian",
        "servings": number or null,
        "description": "description in Italian or null",
        "ingredients": [{"item": "ingredient name", "amount": "quantity", "unit": "SI unit"}],
        "steps": ["First step", "Second step", ...]
      }
      

Testing Strategy

  1. Force Fallback: Temporarily break beta.chat.completions.parse() to trigger fallback
  2. Verify Output: Check that fallback produces clean steps
  3. Integration Test: Ensure recipe extraction works end-to-end with fallback
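
Alongside the manual fallback run, the prompt text itself can be scanned for leftover numbered examples. This is a hypothetical helper, not part of the plan: `findNumberedExamples` looks for quoted strings that begin with a number prefix, and the two schema lines are copied from the Before and After snippets above.

```typescript
// Finds quoted strings in a prompt that begin with a number prefix, e.g. "1. First step".
function findNumberedExamples(prompt: string): string[] {
  const pattern = /"(\d+\.\s[^"]*)"/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prompt)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Schema lines taken from the Before/After snippets of the fallback system prompt.
const oldFallbackSchema = '"steps": ["1. First step", "2. Second step", ...]';
const newFallbackSchema = '"steps": ["First step", "Second step", ...]';

console.log(findNumberedExamples(oldFallbackSchema)); // ["1. First step", "2. Second step"]
console.log(findNumberedExamples(newFallbackSchema)); // []
```

The scan only catches numbered examples inside quotes; a free-text instruction to number the steps would still need a manual read of the prompt.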

Story 3: Verify Frontend and Tandoor Integration

As a QA engineer
I want to confirm existing components work correctly with clean step data
So that no regressions are introduced

Acceptance Criteria

  • RecipeCard displays steps with single numbering (1., 2., 3.)
  • Tandoor import successfully creates recipe with correct step numbering
  • No visual regressions in step display

Technical Specification

Files to Verify (No Changes Needed):

  1. src/routes/share/components/RecipeCard.svelte (Lines 40-45)

    <ol class="list-decimal pl-5 text-sm">
      {#each recipe.steps as step}
        <li>{step}</li>
      {/each}
    </ol>
    
    • Uses <ol> with the list-decimal class, so the browser numbers each <li> automatically
    • No code changes needed
  2. src/lib/server/tandoor.ts (Lines 231-260)

    const steps: TandoorRecipeDTO['steps'] = (recipe.steps || []).map((instruction, index) => {
      return {
        instruction,
        order: index,
        ingredients: mappedIngredients
      };
    });
    
    • Step number comes from index, not instruction text
    • No code changes needed
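
A self-contained version of the mapping shows why the Tandoor step order is unaffected. The `StepDTO` shape and `mapSteps` name are hypothetical and trimmed down for illustration; the real `TandoorRecipeDTO` in tandoor.ts carries more fields, including the ingredients.

```typescript
// Hypothetical, trimmed-down shape; the real DTO has more fields.
interface StepDTO {
  instruction: string;
  order: number;
}

// The order comes from the array index; the instruction text is passed through unchanged.
function mapSteps(steps: string[] | undefined): StepDTO[] {
  return (steps ?? []).map((instruction, index) => ({ instruction, order: index }));
}

const mappedClean = mapSteps(["Tritare il salmone affumicato", "Aggiungere la panna"]);
const mappedNumbered = mapSteps(["1. Tritare il salmone affumicato", "2. Aggiungere la panna"]);

console.log(mappedClean.map((s) => s.order)); // [0, 1]
console.log(mappedNumbered.map((s) => s.order)); // [0, 1]
console.log(mappedClean[0].instruction); // "Tritare il salmone affumicato"
```

Because the instruction string is forwarded verbatim, the old numbered text would also have reached Tandoor with its prefix; the clean format removes it there as well without any change to the mapping.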

Testing Strategy

  1. Frontend Test:

    • Extract a recipe
    • Verify steps display as "1. Step text", "2. Step text" (not "1. 1. Step text")
    • Check that step numbering is visually correct
  2. Tandoor Test:

    • Extract a recipe
    • Import to Tandoor
    • Verify Tandoor recipe shows correctly numbered steps
    • Confirm no parsing errors
  3. Visual Regression:

    • Compare before/after screenshots
    • Ensure no layout changes except removal of duplicate numbers

Story 4: Update Tests and Documentation

As a developer
I want tests and docs to reflect the new step format
So that future contributors understand the expected behavior

Acceptance Criteria

  • No tests fail due to expecting numbered steps
  • If test fixtures exist, they're updated to clean format
  • Changelog documents the change

Technical Specification

Files to Check:

  1. src/tests/sse-extraction.spec.ts

    • Line 78 has steps: [] (empty, no impact)
    • No changes needed
  2. Other Test Files: check for fixtures that expect numbered steps and update them to the clean format

Testing Strategy

  1. Run Test Suite:

    npm test
    
    • Verify all tests pass
    • No test changes are expected
  2. Update Documentation:

    • Prompt changelog already updated in Story 1
    • No other documentation changes needed

Dependencies and Risks

Dependencies

  • None: This change is self-contained within the LLM prompt layer

Risks

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| LLM still adds numbers despite prompt change | Low | Medium | Test with multiple recipes; adjust prompt wording if needed |
| Existing recipes in DB have numbered steps | N/A | None | Recipes are extracted fresh each time, not stored |
| Tandoor integration breaks | Very Low | Medium | Tandoor uses array index for numbering, not text parsing |
| Frontend numbering breaks | Very Low | High | <ol> is standard HTML; CSS controls numbering style |

Rollback Plan

If issues arise:

  1. Revert prompt changes (all in one file: recipe-extraction.ts)
  2. Revert parser.ts system prompt change
  3. No database or infrastructure changes to revert

Implementation Checklist

  • Story 1: Update RECIPE_EXTRACTION_PROMPT in recipe-extraction.ts
    • Remove numbering instruction
    • Update few-shot examples (2 instances)
    • Update OUTPUT FORMAT template
    • Remove quality checklist item
    • Update version to v2.1 with changelog
  • Story 2: Update parseRecipeWithStandardCompletion in parser.ts
    • Modify system prompt schema example
  • Story 3: Verify integrations
    • Test RecipeCard rendering
    • Test Tandoor import
    • Visual regression check
  • Story 4: Validate tests
    • Run npm test
    • Confirm no regressions

Success Metrics

  1. Visual Correctness: Steps display with single numbering (1., 2., 3.) in RecipeCard
  2. LLM Compliance: Raw LLM output contains steps without "1. ", "2. " prefixes
  3. Tandoor Integration: Recipe imports successfully with correct step ordering
  4. Test Pass Rate: 100% of existing tests pass without modification

Outcome File

Upon completion, create docs/outcomes/RemoveStepNumberPrefixes.md documenting:

  • Implementation details
  • Test results
  • Before/after screenshots
  • Any deviations from plan