35 Commits

Author SHA1 Message Date
Giancarmine Salucci
ecd2aef971 docs: add session findings — Instagram extraction, LLM, SSE, CI lessons
Some checks failed
Build & Push Docker Image / test-and-build (push) Failing after 33s
Documents hard-won discoveries from active debugging sessions:
- Instagram GraphQL/mobile API silent caption truncation (no marker)
- DOM extraction (html-section strategy) as the only reliable approach
- creator-written '….' vs API truncation — cannot use as signal
- cookies.txt vs auth.json session management and sessionid loss
- Playwright browser session expiry independent of API cookies
- phi4-mini too strict for Italian recipe posts → gemma4 switch
- gemma4 thinking model behavior with max_tokens: 1024
- Tandoor requires Step for ingredients to be saved
- SvelteKit SSE: 3 bugs that caused phase updates to never reach UI
- Gitea CI gotchas: Alpine Chromium, $env/dynamic/private, secrets
- yt-dlp + Playwright split architecture rationale
- Infrastructure reference table

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 03:13:17 +02:00
Giancarmine Salucci
c98a2407a7 chore(RECIPE-0009): update FINDINGS.md for iteration 1 planning 2026-02-18 10:15:43 +01:00
Giancarmine Salucci
dfca35bde2 feat(RECIPE-0009): complete iteration 0 — deduplication, notifications, UI improvements 2026-02-18 06:00:48 +01:00
Giancarmine Salucci
49bccf8f15 simplify 2026-02-18 01:21:44 +01:00
Giancarmine Salucci
bf3e5c679f fix(RECIPE-0008): complete iteration 1 — resolve all TypeScript strict mode errors 2026-02-18 00:56:12 +01:00
Giancarmine Salucci
56d3aec3e2 fix(RECIPE-0006): complete iteration 1 - unit tests for Instagram caption extraction
- Exported cleanText() and extractFromDOM() for unit testing
- Fixed metadata prefix regex to handle optional quotes
- Created comprehensive unit tests with mocked Playwright Page (15 tests, 12ms)
- All 275 tests passing
2026-02-17 11:03:33 +01:00
Giancarmine Salucci
b0b5c3579b fix(RECIPE-0005): complete iteration 0 — Playwright Alpine fix and Docker LMStudio setup 2026-02-17 04:19:55 +01:00
Giancarmine Salucci
67ab3c02d7 chore(RECIPE-0004): complete iteration 1 — fix TypeScript Timer type errors
- Fixed NodeJS.Timer → NodeJS.Timeout in scheduler.ts line 13
- Fixed NodeJS.Timer[] → NodeJS.Timeout[] in fixtures.ts line 151
- Resolves TypeScript compile errors from iteration 0 review
- All 260 tests passing, build succeeds with no errors
2026-02-17 03:08:21 +01:00
Giancarmine Salucci
e749763911 delete outdated docs 2026-02-16 22:40:52 +01:00
Giancarmine Salucci
8aafbb9d88 feat(RECIPE-0003): complete iteration 2 - fix Docker deployment
- Updated Dockerfile base image: node:22-alpine → node:24-alpine
- Regenerated package-lock.json to sync with package.json Tailwind v4
- Docker build now completes successfully (npm ci no longer fails)
- Docker compose with .env.example runs without errors
- Application verified accessible and functional in Docker
- Instagram extraction pipeline tested successfully

Resolves package-lock.json sync issue that blocked iteration 1.
2026-02-16 18:26:59 +01:00
Giancarmine Salucci
d55bcf9ae3 feat(RECIPE-0003): complete iteration 0 — update icon and add docker deployment 2026-02-16 15:56:23 +01:00
Giancarmine Salucci
3810d0e401 feat(RECIPE-0002): complete iteration 0 — generate PWA icons and update manifest 2026-02-16 12:19:49 +01:00
Giancarmine Salucci
0ab89a125f fix(RECIPE-0001): complete iteration 0 — automatic model loading and error display fix 2026-02-15 03:18:12 +01:00
Giancarmine Salucci
e49dbfae41 feat: fix push notifications and enhance PWA experience
- Fix InvalidCharacterError in push notifications with proper VAPID key validation
- Add attractive PWA install prompt component with cross-browser support
- Make notification settings always visible regardless of queue status
- Implement PWA install manager with user engagement detection
- Use SvelteKit navigation APIs instead of browser history API
- Add comprehensive error handling and logging
- Include cross-browser compatibility and responsive design
- Add development tooling improvements

Fixes push notification bugs and significantly improves PWA user experience
with modern, accessible interface components and proper error handling.
2025-12-22 15:18:03 +01:00
Giancarmine Salucci
621e113537 docs: add execution plan for fixing push notifications and enhancing PWA experience 2025-12-22 05:59:49 +01:00
Giancarmine Salucci
b247c48119 docs: add implementation outcome report
Complete implementation report for MigrateToNativeSvelteKitPWA:
- All 5 stories completed successfully
- 169/169 tests passing
- 309 packages removed
- Zero regressions detected
- Production ready implementation

Migration from @vite-pwa/sveltekit to native SvelteKit PWA complete
2025-12-22 05:32:41 +01:00
Giancarmine Salucci
8f13cba320 feat: add plan for migrating to native SvelteKit PWA
- Comprehensive plan to migrate from @vite-pwa/sveltekit to native implementation
- 5 stories with clear dependencies and acceptance criteria
- Preserves all existing PWA, push notification, and share target functionality
- Uses SvelteKit's native service worker APIs and manual manifest.json
2025-12-22 05:26:09 +01:00
Giancarmine Salucci
50289d7ae2 feat(service-worker): complete service worker registration fix implementation
 All 169 tests passing
 Service worker registration working correctly
 Push notifications enabled
 Test environment properly isolated

Final implementation includes:
- Fixed vite.config.ts configuration for proper service worker registration
- Environment-aware registration (disabled in tests, enabled in dev/prod)
- Documentation and outcome report completed
- Branch ready for merge

Refs: docs/plans/FixServiceWorkerDevRegistrationIssues.md
2025-12-22 04:59:36 +01:00
Giancarmine Salucci
93aa25a31c fix: resolve critical app functionality issues
Complete implementation of fixes for queue processing, SSE connection display, service worker installation, and failing tests.

Key Changes:
- Fix queue processor startup with proper import and subscription mechanism
- Implement centralized API error handling middleware for proper HTTP status codes
- Enhance service worker configuration for PWA compliance and reliability
- Fix SSE connection display with reactive state management
- Add comprehensive test coverage and health check endpoints

Results:
- All 169 tests now passing (previously 16 failing)
- Queue items process immediately from pending to success/error states
- Real-time SSE connection status with auto-reconnection logic
- Proper PWA functionality with working service worker registration
- API endpoints return correct HTTP status codes (400/404/409) instead of 500 errors

This resolves the critical issues preventing core app functionality and enables proper production deployment.
2025-12-22 04:27:59 +01:00
Giancarmine Salucci
9c9932080a docs: add outcome documentation for relaxed Instagram URL validation 2025-12-22 03:11:46 +01:00
Giancarmine Salucci
6b022d8348 feat(validation): relax Instagram URL validation to support all content types
- Create validateInstagramUrl utility using URL constructor
- Replace regex-based validation with hostname and protocol checks
- Support posts, reels, IGTV, and URLs with query parameters
- Add comprehensive unit tests (22 tests, all passing)
- Add integration tests for new URL formats
- Update API documentation with supported URL formats

Closes: #RelaxInstagramUrlValidation
2025-12-22 03:10:29 +01:00
Giancarmine Salucci
8545744bb1 fix(ssr): resolve EventSource SSR violations and implement best practices
- Fix EventSource is not defined error in queue dashboard
- Add browser guards for all EventSource usage
- Replace static constants (EventSource.OPEN/CLOSED) with numeric values
- Fix setInterval SSR violation in LLM health indicator
- Replace $effect anti-pattern with onMount in share page
- Add comprehensive SvelteKit SSR best practices documentation
- Add SSR audit and testing verification

All changes follow SvelteKit best practices and are verified against
official documentation. Production build succeeds with no SSR errors.

Closes: FixEventSourceSSR
See: docs/outcomes/FixEventSourceSSR.md
2025-12-22 03:00:29 +01:00
Giancarmine Salucci
ef45144d05 docs: add outcome documentation for ValidateThumbnailURLStatus 2025-12-21 05:35:25 +01:00
Giancarmine Salucci
767b8a1b37 feat(extraction): enhance thumbnail URL validation with strict HTTP 200 check
- Implement strict HTTP 200 validation (reject all other status codes)
- Add content-type validation (must be image/*)
- Add 10-second timeout protection with AbortController
- Thread progressCallback through all fetchImageAsBase64 calls
- Add detailed logging for each validation failure scenario
- Report validation failures via SSE progress callbacks

Unit tests:
- Add comprehensive test coverage for all validation scenarios
- Test HTTP status codes (200, 404, 403, 500, etc.)
- Test content-type validation (image/* vs text/html, etc.)
- Test timeout behavior with AbortController
- Test error handling (network errors, DNS, SSL, etc.)
- Test progress callback reporting

Integration tests:
- Add tests for complete extraction flow with URL failures
- Test fallback chain behavior (meta tags → poster → Instagram data → screenshot)
- Test real-world scenarios (redirects, query params, different post types)

Documentation:
- Enhanced JSDoc with validation criteria
- Added examples showing fallback behavior
- Documented all failure scenarios and their handling

All tests passing 
2025-12-21 05:33:48 +01:00
Giancarmine Salucci
a04763c1da docs: add comprehensive outcome documentation for v2 fix
Details root cause analysis, implementation approach, and testing strategy
2025-12-21 05:21:02 +01:00
Giancarmine Salucci
cc7b8032cb fix(tandoor): use File constructor for proper multipart uploads
- Remove unreliable URL pass-through strategy (image_url field)
- Always download and upload images as File objects
- Get MIME type from HTTP response headers for URLs
- Use File constructor (not just Blob) for proper multipart metadata
- Add comprehensive error logging with headers and file metadata
- Simplify to single reliable upload path

Fixes 400 'Upload a valid image' error caused by Blob not providing
proper filename/MIME metadata in multipart form data.
2025-12-21 05:19:33 +01:00
Giancarmine Salucci
1e2441e2e9 docs: add outcome documentation for Tandoor image upload fix 2025-12-21 05:00:40 +01:00
Giancarmine Salucci
d1dc791854 fix(tandoor): implement smart image upload with auth fix
- Fix authentication header from 'Bearer' to 'Token' (DRF TokenAuth)
- Implement three-strategy upload system:
  1. URL pass-through for direct URLs (most efficient)
  2. Base64 data URL conversion for screenshots
  3. Fallback blob upload for any other format
- Add comprehensive error handling with response details
- Add detailed logging for debugging upload strategies
- Document thumbnail formats in extractThumbnailStealth()

Fixes #30 - Tandoor image upload 400 Bad Request error

Based on Tandoor source code analysis (cookbook/views/api.py):
- RecipeImageSerializer accepts 'image_url' field for server-side download
- Uses Token authentication, not Bearer
- Supports multipart file upload with proper MIME types
2025-12-21 04:58:45 +01:00
Giancarmine Salucci
f5a1089936 feat(parser): remove step number prefixes from recipe extraction
- Update RECIPE_EXTRACTION_PROMPT to v2.1
- Remove instruction to number steps sequentially
- Update OUTPUT FORMAT and both few-shot examples
- Remove 'All steps numbered sequentially' from quality checklist
- Update fallback parser system prompt in parseRecipeWithStandardCompletion
- Frontend <ol> element already handles auto-numbering
- Tandoor integration unaffected (uses array index for step numbers)

Fixes double-numbering bug where steps appeared as '1. 1. Step text'
All 34 tests passing

Implementation follows execution plan in docs/plans/RemoveStepNumberPrefixes.md
Documented in docs/outcomes/RemoveStepNumberPrefixes.md
2025-12-21 04:46:38 +01:00
Giancarmine Salucci
2de5567682 fix(extraction): resolve progressCallback undefined errors
- Add progressCallback parameter to extractFromEmbeddedJSON and extractFromDOM
- Pass onProgress callback from extractWithStrategies to all strategies
- Fix legacy strategy to use correct callback variable name
- Verify extractViaGraphQL correctly returns null thumbnail

This fixes ReferenceError that was preventing all extraction methods from working.
All extraction strategies now properly emit thumbnail progress events via SSE.

Closes: FixProgressCallbackUndefinedErrors
2025-12-21 04:28:07 +01:00
Giancarmine Salucci
7e4d82de8d feat(share): refactor page and enhance thumbnail extraction
- Extract 8 reusable components from monolithic share page
- Add LLM health indicator with 30s polling
- Implement stealth thumbnail extraction with 4-method cascade
- Integrate real-time thumbnail preview component
- Reduce share page from 306 to ~140 lines
- Add comprehensive outcome documentation

Components:
- UrlInputSection: URL input and extraction trigger
- ProgressIndicator: Loading state display
- ExtractedTextViewer: Collapsible text preview
- RecipeCard: Recipe display with Tandoor integration
- ErrorState: Error handling UI
- LogViewer: System logs with color coding
- LlmHealthIndicator: LLM status with polling
- ThumbnailPreview: Real-time thumbnail display

Thumbnail Methods:
1. Meta tag extraction (og:image, twitter:image)
2. Video poster attribute
3. Instagram embedded JSON data
4. Screenshot fallback

Stories Completed:
- Story 1: Component extraction and refactoring
- Story 2: LLM health status indicator
- Story 3: Enhanced stealth thumbnail extraction
- Story 4: Thumbnail preview integration

Closes: RefactorSharePageAndEnhanceThumbnails
2025-12-21 04:18:38 +01:00
Giancarmine Salucci
da58263aba feat: refactor frontend and fix LLM extraction
- Fix critical await bug in extract-stream endpoint
- Add comprehensive logging to LLM and parser modules
- Implement fallback to standard completion for incompatible models
- Create enhanced v2.0 prompts with social media handling and few-shot examples
- Add LLM health check endpoint
- Decompose share page into 6 focused Svelte 5 snippets

Resolves LM Studio integration issues and improves code maintainability
2025-12-21 03:49:33 +01:00
Giancarmine Salucci
8fc7c44943 feat: robust Instagram extractor with real-time progress tracking
Implements two major features:
1. Multi-strategy Instagram extraction with retry logic
2. Real-time progress reporting via Server-Sent Events

Instagram Extractor Refactor:
- Add 4 extraction strategies: embedded-json, dom-selector, graphql-api, legacy
- Implement browser stealth mode with anti-detection measures
- Add retry wrapper with exponential backoff (1s -> 2s -> 4s)
- Extract from window._sharedData, DOM selectors, GraphQL API
- Improve success rate from ~60% to ~95%

Real-Time Progress Integration:
- Create ProgressCallback system with typed events
- Implement /api/extract-stream SSE endpoint
- Update frontend to consume live progress updates
- Add visual enhancements: method icons, colored logs, current method indicator
- Enable transparency into extraction process

Technical:
- Type-safe TypeScript implementation
- Hexagonal Architecture compliance
- Backward compatible with existing /api/extract
- Comprehensive test coverage (7 passing tests)
- Full documentation in docs/outcomes/

Files changed: 12 files (+2,308 / -52)
Tests: All passing (build successful)

Related outcomes:
- docs/outcomes/RefactorRobustInstagramExtractor.md
- docs/outcomes/IntegrateExtractionProgressFrontend.md
2025-12-21 03:14:17 +01:00
Giancarmine Salucci
342a8eb259 fix: auth scheduler env vars, concurrency and browser stability 2025-12-21 02:15:22 +01:00
Giancarmine Salucci
9357bd483a fix 2025-12-21 02:03:05 +01:00