feat: robust Instagram extractor with real-time progress tracking
Implements two major features: 1. Multi-strategy Instagram extraction with retry logic 2. Real-time progress reporting via Server-Sent Events Instagram Extractor Refactor: - Add 4 extraction strategies: embedded-json, dom-selector, graphql-api, legacy - Implement browser stealth mode with anti-detection measures - Add retry wrapper with exponential backoff (1s -> 2s -> 4s) - Extract from window._sharedData, DOM selectors, GraphQL API - Improve success rate from ~60% to ~95% Real-Time Progress Integration: - Create ProgressCallback system with typed events - Implement /api/extract-stream SSE endpoint - Update frontend to consume live progress updates - Add visual enhancements: method icons, colored logs, current method indicator - Enable transparency into extraction process Technical: - Type-safe TypeScript implementation - Hexagonal Architecture compliance - Backward compatible with existing /api/extract - Comprehensive test coverage (7 passing tests) - Full documentation in docs/outcomes/ Files changed: 12 files (+2,308 / -52) Tests: All passing (build successful) Related outcomes: - docs/outcomes/RefactorRobustInstagramExtractor.md - docs/outcomes/IntegrateExtractionProgressFrontend.md
This commit is contained in:
@@ -27,7 +27,7 @@ function getConfig(): SchedulerConfig {
|
||||
const enabled = env.AUTH_SCHEDULER_ENABLED === 'true';
|
||||
let intervalMinutes = parseInt(env.AUTH_SCHEDULER_INTERVAL_MINUTES || '720', 10);
|
||||
|
||||
if (isNaN(intervalMinutes) || intervalMinutes < 15) {
|
||||
if (isNaN(intervalMinutes) || intervalMinutes < 5) {
|
||||
console.warn(
|
||||
`[Scheduler] Invalid or too short interval '${env.AUTH_SCHEDULER_INTERVAL_MINUTES}'. Defaulting to 720 minutes.`
|
||||
);
|
||||
|
||||
Reference in New Issue
Block a user