- Add progressCallback parameter to extractFromEmbeddedJSON and extractFromDOM - Pass onProgress callback from extractWithStrategies to all strategies - Fix legacy strategy to use correct callback variable name - Verify extractViaGraphQL correctly returns null thumbnail This fixes ReferenceError that was preventing all extraction methods from working. All extraction strategies now properly emit thumbnail progress events via SSE. Closes: FixProgressCallbackUndefinedErrors
9.2 KiB
Execution Plan: Fix ProgressCallback Undefined Errors
Overview
Outcome Name: FixProgressCallbackUndefinedErrors
Created: 2025-12-21
Status: Ready for Implementation
Problem Analysis
The extraction system is failing with ReferenceError: progressCallback is not defined at multiple locations in the extraction pipeline. The root cause is that several extraction methods are calling extractThumbnailStealth(page, progressCallback) with a progressCallback variable that doesn't exist in their scope.
Affected Locations
- Line 224 -
extractFromEmbeddedJSON()function - Line 239 -
extractFromEmbeddedJSON()function - Line 346 -
extractFromDOM()function - Line 459 - Legacy extraction strategy in
extractWithStrategies()
Current Function Signatures
// These functions don't have progressCallback parameter
async function extractFromEmbeddedJSON(page: Page): Promise<ExtractedContent | null>
async function extractFromDOM(page: Page): Promise<ExtractedContent | null>
// But they call this function with progressCallback
async function extractThumbnailStealth(
page: Page,
progressCallback?: ProgressCallback
): Promise<string | null>
// extractWithStrategies has the callback but doesn't pass it down
async function extractWithStrategies(
url: string,
page: Page,
context: BrowserContext,
onProgress?: ProgressCallback
): Promise<ExtractionResult>
Root Cause
The extractWithStrategies() function receives an onProgress callback parameter but does NOT pass it to the individual extraction method functions (extractFromEmbeddedJSON, extractFromDOM). These functions then try to use an undefined progressCallback variable when calling extractThumbnailStealth().
Stories
Story 1: Add ProgressCallback Parameter to extractFromEmbeddedJSON
Priority: High
Acceptance Criteria:
- Function signature updated to accept optional
progressCallback?: ProgressCallback - All calls to
extractThumbnailStealth()within the function use the parameter - No references to undefined variables
Implementation Steps:
-
Update function signature:
async function extractFromEmbeddedJSON( page: Page, progressCallback?: ProgressCallback ): Promise<ExtractedContent | null> -
Ensure both calls to
extractThumbnailStealth(lines ~224 and ~239) use the parameter correctly (already doing this, just need to receive it)
Technical Notes:
- The function already correctly passes
progressCallbacktoextractThumbnailStealth - Just needs to receive it as a parameter instead of referencing undefined variable
Story 2: Add ProgressCallback Parameter to extractFromDOM
Priority: High
Acceptance Criteria:
- Function signature updated to accept optional
progressCallback?: ProgressCallback - Call to
extractThumbnailStealth()at line ~346 uses the parameter - No references to undefined variables
Implementation Steps:
-
Update function signature:
async function extractFromDOM( page: Page, progressCallback?: ProgressCallback ): Promise<ExtractedContent | null> -
The call to
extractThumbnailStealthat line ~346 already uses the parameter correctly
Technical Notes:
- Single call to
extractThumbnailStealthin this function - Already written to use
progressCallback, just needs to receive it
Story 3: Update extractWithStrategies to Pass Callback to Strategy Functions
Priority: High
Acceptance Criteria:
- All strategy functions receive the
onProgresscallback - Legacy strategy inline function updated to use parameter correctly
- No undefined variable references in any strategy
Implementation Steps:
-
Update the strategies array to pass
onProgressto each function:const strategies: Array<{ name: ExtractionMethod; fn: () => Promise<ExtractedContent | null>; }> = [ { name: 'embedded-json', fn: () => extractFromEmbeddedJSON(page, onProgress) }, { name: 'dom-selector', fn: () => extractFromDOM(page, onProgress) }, { name: 'graphql-api', fn: () => extractViaGraphQL(url, context) }, { name: 'legacy', fn: async () => { const text = await extractCleanTextLegacy(page); const thumbnail = await extractThumbnailStealth(page, onProgress); return { bodyText: text, thumbnail }; } } ]; -
Verify the legacy strategy's inline function uses
onProgressinstead ofprogressCallback
Technical Notes:
- The legacy strategy is defined inline at line ~459
- Currently references undefined
progressCallback - Should use
onProgressfrom the parent function's scope - GraphQL strategy doesn't extract thumbnails so doesn't need the callback
Story 4: Verify extractViaGraphQL Doesn't Need Callback
Priority: Medium
Acceptance Criteria:
- Confirm
extractViaGraphQLdoesn't extract thumbnails - Ensure it doesn't have undefined variable references
- Document if thumbnail extraction should be added in the future
Implementation Steps:
- Review
extractViaGraphQLfunction (lines ~360-410) - Confirm it only extracts text, not thumbnails
- Add code comment if thumbnail extraction could be beneficial
Technical Notes:
- Currently returns
ExtractedContentbut likely withthumbnail: null - May need enhancement in future to extract thumbnail via GraphQL data
- Not urgent for this fix since it doesn't reference
progressCallback
Story 5: Verify All Other Functions Using extractThumbnailStealth
Priority: Low
Acceptance Criteria:
- Search entire codebase for
extractThumbnailStealthcalls - Ensure all callers properly pass or omit the optional parameter
- No other undefined variable references exist
Implementation Steps:
- Use grep/search to find all calls to
extractThumbnailStealth - Verify each call either:
- Passes a valid
ProgressCallbackparameter, or - Intentionally omits it (relying on optional parameter)
- Passes a valid
- Document any findings
Technical Notes:
- The parameter is optional, so omitting it is valid
- But passing an undefined variable is not valid
- Found 5 calls total in extraction.ts (4 problematic, 1 direct call)
Dependencies
graph TD
A[Story 1: Fix extractFromEmbeddedJSON] --> C[Story 3: Update extractWithStrategies]
B[Story 2: Fix extractFromDOM] --> C
C --> D[Story 5: Verify All Callers]
E[Story 4: Verify GraphQL] --> D
Critical Path: Stories 1, 2, 3 must be completed together as they form a cohesive parameter-passing chain.
Testing Strategy
Unit Testing
- Test each extraction function with and without
progressCallbackparameter - Verify callback is invoked when thumbnail is extracted
- Verify no errors when callback is omitted
Integration Testing
- Test full extraction flow with SSE endpoint
- Verify thumbnail progress events are emitted correctly
- Test all extraction methods (embedded-json, dom-selector, graphql, legacy)
Manual Testing
- Start dev server
- Attempt to extract from Instagram URL
- Verify no
ReferenceError: progressCallback is not definederrors - Verify thumbnail progress events appear in SSE stream
- Test retry logic still works correctly
Risk Assessment
Risk Level: Low
Impact: High (currently breaks all extractions)
Complexity: Low (parameter passing fix)
Potential Issues
- Breaking existing callers - Mitigated by making parameter optional
- Missing other undefined references - Mitigated by Story 5 verification
- Callback not propagating - Mitigated by following the call chain
Rollback Strategy
- Changes are additive (adding optional parameters)
- No breaking changes to function signatures
- Easy to revert if issues arise
Implementation Checklist
- Story 1: Update
extractFromEmbeddedJSONsignature - Story 2: Update
extractFromDOMsignature - Story 3: Update
extractWithStrategiesto pass callbacks - Story 4: Review and document
extractViaGraphQL - Story 5: Verify all
extractThumbnailStealthcallers - Run extraction tests
- Manual testing with dev server
- Commit with descriptive message
Success Criteria
- ✅ No
ReferenceError: progressCallback is not definederrors - ✅ All extraction methods work correctly
- ✅ Thumbnail progress events are emitted via SSE
- ✅ Retry logic continues to function
- ✅ No regression in existing functionality
Technical References
- File: src/lib/server/extraction.ts
- Type Definition:
ProgressCallback = (event: ProgressEvent) => void(line 22) - SSE Integration: src/routes/api/extract-stream/+server.ts
Notes
This is a straightforward parameter-passing bug introduced during a recent refactor where thumbnail extraction was enhanced with progress callbacks. The extraction functions were updated to call extractThumbnailStealth with a callback, but weren't updated to receive that callback as a parameter.
The fix maintains backward compatibility by making the parameter optional, allowing the functions to work with or without progress tracking.