Documents hard-won discoveries from active debugging sessions:
- Instagram GraphQL/mobile API silent caption truncation (no marker)
- DOM extraction (html-section strategy) as the only reliable approach
- creator-written '….' vs API truncation — cannot use as signal
- cookies.txt vs auth.json session management and sessionid loss
- Playwright browser session expiry independent of API cookies
- phi4-mini too strict for Italian recipe posts → gemma4 switch
- gemma4 thinking model behavior with max_tokens: 1024
- Tandoor requires Step for ingredients to be saved
- SvelteKit SSE: 3 bugs that caused phase updates to never reach UI
- Gitea CI gotchas: Alpine Chromium, $env/dynamic/private, secrets
- yt-dlp + Playwright split architecture rationale
- Infrastructure reference table
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Exported cleanText() and extractFromDOM() for unit testing
- Fixed metadata prefix regex to handle optional quotes
- Created comprehensive unit tests with mocked Playwright Page (15 tests, 12ms)
- All 275 tests passing
- Fixed NodeJS.Timer → NodeJS.Timeout in scheduler.ts line 13
- Fixed NodeJS.Timer[] → NodeJS.Timeout[] in fixtures.ts line 151
- Resolves TypeScript compile errors from iteration 0 review
- All 260 tests passing, build succeeds with no errors