# Execution Plan: Fix ProgressCallback Undefined Errors ## Overview **Outcome Name:** FixProgressCallbackUndefinedErrors **Created:** 2025-12-21 **Status:** Ready for Implementation ## Problem Analysis The extraction system is failing with `ReferenceError: progressCallback is not defined` at multiple locations in the extraction pipeline. The root cause is that several extraction methods are calling `extractThumbnailStealth(page, progressCallback)` with a `progressCallback` variable that doesn't exist in their scope. ### Affected Locations 1. **Line 224** - `extractFromEmbeddedJSON()` function 2. **Line 239** - `extractFromEmbeddedJSON()` function 3. **Line 346** - `extractFromDOM()` function 4. **Line 459** - Legacy extraction strategy in `extractWithStrategies()` ### Current Function Signatures ```typescript // These functions don't have progressCallback parameter async function extractFromEmbeddedJSON(page: Page): Promise async function extractFromDOM(page: Page): Promise // But they call this function with progressCallback async function extractThumbnailStealth( page: Page, progressCallback?: ProgressCallback ): Promise // extractWithStrategies has the callback but doesn't pass it down async function extractWithStrategies( url: string, page: Page, context: BrowserContext, onProgress?: ProgressCallback ): Promise ``` ## Root Cause The `extractWithStrategies()` function receives an `onProgress` callback parameter but does NOT pass it to the individual extraction method functions (`extractFromEmbeddedJSON`, `extractFromDOM`). These functions then try to use an undefined `progressCallback` variable when calling `extractThumbnailStealth()`. ## Stories ### Story 1: Add ProgressCallback Parameter to extractFromEmbeddedJSON **Priority:** High **Acceptance Criteria:** - Function signature updated to accept optional `progressCallback?: ProgressCallback` - All calls to `extractThumbnailStealth()` within the function use the parameter - No references to undefined variables **Implementation Steps:** 1. Update function signature: ```typescript async function extractFromEmbeddedJSON( page: Page, progressCallback?: ProgressCallback ): Promise ``` 2. Ensure both calls to `extractThumbnailStealth` (lines ~224 and ~239) use the parameter correctly (already doing this, just need to receive it) **Technical Notes:** - The function already correctly passes `progressCallback` to `extractThumbnailStealth` - Just needs to receive it as a parameter instead of referencing undefined variable --- ### Story 2: Add ProgressCallback Parameter to extractFromDOM **Priority:** High **Acceptance Criteria:** - Function signature updated to accept optional `progressCallback?: ProgressCallback` - Call to `extractThumbnailStealth()` at line ~346 uses the parameter - No references to undefined variables **Implementation Steps:** 1. Update function signature: ```typescript async function extractFromDOM( page: Page, progressCallback?: ProgressCallback ): Promise ``` 2. The call to `extractThumbnailStealth` at line ~346 already uses the parameter correctly **Technical Notes:** - Single call to `extractThumbnailStealth` in this function - Already written to use `progressCallback`, just needs to receive it --- ### Story 3: Update extractWithStrategies to Pass Callback to Strategy Functions **Priority:** High **Acceptance Criteria:** - All strategy functions receive the `onProgress` callback - Legacy strategy inline function updated to use parameter correctly - No undefined variable references in any strategy **Implementation Steps:** 1. Update the strategies array to pass `onProgress` to each function: ```typescript const strategies: Array<{ name: ExtractionMethod; fn: () => Promise; }> = [ { name: 'embedded-json', fn: () => extractFromEmbeddedJSON(page, onProgress) }, { name: 'dom-selector', fn: () => extractFromDOM(page, onProgress) }, { name: 'graphql-api', fn: () => extractViaGraphQL(url, context) }, { name: 'legacy', fn: async () => { const text = await extractCleanTextLegacy(page); const thumbnail = await extractThumbnailStealth(page, onProgress); return { bodyText: text, thumbnail }; } } ]; ``` 2. Verify the legacy strategy's inline function uses `onProgress` instead of `progressCallback` **Technical Notes:** - The legacy strategy is defined inline at line ~459 - Currently references undefined `progressCallback` - Should use `onProgress` from the parent function's scope - GraphQL strategy doesn't extract thumbnails so doesn't need the callback --- ### Story 4: Verify extractViaGraphQL Doesn't Need Callback **Priority:** Medium **Acceptance Criteria:** - Confirm `extractViaGraphQL` doesn't extract thumbnails - Ensure it doesn't have undefined variable references - Document if thumbnail extraction should be added in the future **Implementation Steps:** 1. Review `extractViaGraphQL` function (lines ~360-410) 2. Confirm it only extracts text, not thumbnails 3. Add code comment if thumbnail extraction could be beneficial **Technical Notes:** - Currently returns `ExtractedContent` but likely with `thumbnail: null` - May need enhancement in future to extract thumbnail via GraphQL data - Not urgent for this fix since it doesn't reference `progressCallback` --- ### Story 5: Verify All Other Functions Using extractThumbnailStealth **Priority:** Low **Acceptance Criteria:** - Search entire codebase for `extractThumbnailStealth` calls - Ensure all callers properly pass or omit the optional parameter - No other undefined variable references exist **Implementation Steps:** 1. Use grep/search to find all calls to `extractThumbnailStealth` 2. Verify each call either: - Passes a valid `ProgressCallback` parameter, or - Intentionally omits it (relying on optional parameter) 3. Document any findings **Technical Notes:** - The parameter is optional, so omitting it is valid - But passing an undefined variable is not valid - Found 5 calls total in extraction.ts (4 problematic, 1 direct call) --- ## Dependencies ```mermaid graph TD A[Story 1: Fix extractFromEmbeddedJSON] --> C[Story 3: Update extractWithStrategies] B[Story 2: Fix extractFromDOM] --> C C --> D[Story 5: Verify All Callers] E[Story 4: Verify GraphQL] --> D ``` **Critical Path:** Stories 1, 2, 3 must be completed together as they form a cohesive parameter-passing chain. ## Testing Strategy ### Unit Testing - Test each extraction function with and without `progressCallback` parameter - Verify callback is invoked when thumbnail is extracted - Verify no errors when callback is omitted ### Integration Testing - Test full extraction flow with SSE endpoint - Verify thumbnail progress events are emitted correctly - Test all extraction methods (embedded-json, dom-selector, graphql, legacy) ### Manual Testing 1. Start dev server 2. Attempt to extract from Instagram URL 3. Verify no `ReferenceError: progressCallback is not defined` errors 4. Verify thumbnail progress events appear in SSE stream 5. Test retry logic still works correctly ## Risk Assessment **Risk Level:** Low **Impact:** High (currently breaks all extractions) **Complexity:** Low (parameter passing fix) ### Potential Issues 1. **Breaking existing callers** - Mitigated by making parameter optional 2. **Missing other undefined references** - Mitigated by Story 5 verification 3. **Callback not propagating** - Mitigated by following the call chain ### Rollback Strategy - Changes are additive (adding optional parameters) - No breaking changes to function signatures - Easy to revert if issues arise ## Implementation Checklist - [ ] Story 1: Update `extractFromEmbeddedJSON` signature - [ ] Story 2: Update `extractFromDOM` signature - [ ] Story 3: Update `extractWithStrategies` to pass callbacks - [ ] Story 4: Review and document `extractViaGraphQL` - [ ] Story 5: Verify all `extractThumbnailStealth` callers - [ ] Run extraction tests - [ ] Manual testing with dev server - [ ] Commit with descriptive message ## Success Criteria 1. ✅ No `ReferenceError: progressCallback is not defined` errors 2. ✅ All extraction methods work correctly 3. ✅ Thumbnail progress events are emitted via SSE 4. ✅ Retry logic continues to function 5. ✅ No regression in existing functionality ## Technical References - **File:** [src/lib/server/extraction.ts](src/lib/server/extraction.ts) - **Type Definition:** `ProgressCallback = (event: ProgressEvent) => void` (line 22) - **SSE Integration:** [src/routes/api/extract-stream/+server.ts](src/routes/api/extract-stream/+server.ts) ## Notes This is a straightforward parameter-passing bug introduced during a recent refactor where thumbnail extraction was enhanced with progress callbacks. The extraction functions were updated to call `extractThumbnailStealth` with a callback, but weren't updated to receive that callback as a parameter. The fix maintains backward compatibility by making the parameter optional, allowing the functions to work with or without progress tracking.