Files
insta-recipe/Dockerfile
Giancarmine Salucci 5b5bb947ef feat: replace Playwright extractor with yt-dlp subprocess
- Add instagram-extractor.ts: yt-dlp subprocess backend for Instagram
  caption extraction. No in-process browser state, maintained against
  Instagram frontend churn, supports cookies.txt for auth-walled reels.
- Add feature flag EXTRACTOR_BACKEND (ytdlp|playwright) in QueueProcessor
  so the old Playwright path remains available as fallback.
- Add 9 unit tests and 2 live-network integration tests for the new extractor.
- Dockerfile: install yt-dlp via pip3 alongside existing Chromium deps.
- docker-compose: expose EXTRACTOR_BACKEND env var (default: ytdlp).

Also in this commit:
- LLM: configurable per-request timeout via LLM_REQUEST_TIMEOUT_MS (default 120s);
  set maxRetries=0 to surface errors immediately; llama-swap /running health probe.
- QueueProcessor: thread progress callback through parser phase.
- LlmHealthIndicator: surface llama-swap loaded-model name.
- Logging: improve error serialization in queue-processor tests.
- .env.example: document llama-swap endpoint and model options.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 20:46:31 +02:00

27 lines
590 B
Docker

FROM node:24-alpine
WORKDIR /app
# Install yt-dlp (primary Instagram extractor) and Playwright system dependencies (fallback)
RUN apk add --no-cache \
python3 \
py3-pip \
chromium \
font-liberation \
font-noto \
font-noto-cjk && \
pip3 install --break-system-packages yt-dlp
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
ENV NODE_ENV=production
ENV CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium-browser
# Declare volume for Instagram authentication persistence
VOLUME ["/app/secrets"]
CMD ["node", "-e", "import('./build/index.js')"]