mozempk/whisper-rtx2080

mozempk fd8d4deefb

Build & Push Docker Image / build-and-push (push) Successful in 6m39s

fix: GPU warmup on startup + fix test_all.sh + document cold-GPU finding

GPU warmup (src/transcriber.rs):
  After creating WhisperState, load() runs one inference pass over 1s of silence.
  CUDA JIT-compiles device kernels on the first whisper_full_with_state call.
  On a cold GPU this compilation disrupts the decode pipeline mid-inference,
  returning 0 segments in ~0.5s. The warmup forces all kernel compilation at
  startup so the first real job runs on fully compiled kernels.
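The warmup described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/transcriber.rs: the `Inference` trait and the `silence` and `warm_up` helpers are hypothetical names that stand in for the real call on WhisperState, and the 16 kHz mono f32 input format is the one Whisper models expect.

```rust
/// Sample rate Whisper models expect (16 kHz mono f32 PCM).
const SAMPLE_RATE_HZ: usize = 16_000;

/// Hypothetical abstraction over one full inference pass. In the real
/// service this would wrap the whisper_full_with_state call on WhisperState.
trait Inference {
    /// Runs one pass over `samples` and returns the number of segments.
    fn run(&mut self, samples: &[f32]) -> Result<usize, String>;
}

/// Builds `seconds` of digital silence at the expected sample rate.
fn silence(seconds: usize) -> Vec<f32> {
    vec![0.0; SAMPLE_RATE_HZ * seconds]
}

/// Runs one throwaway pass over 1 s of silence so that first-call costs
/// (such as kernel compilation) are paid at startup, not on the first job.
fn warm_up<I: Inference>(engine: &mut I) -> Result<(), String> {
    // The segment count is discarded: only the side effect of having
    // executed the full pipeline once matters here.
    engine.run(&silence(1)).map(|_| ())
}
```

In this sketch `warm_up` would be called once at the end of load(), and an error from it would be returned to the caller, so a broken GPU setup fails at startup instead of on the first real job.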

test_all.sh:
  - Fix submit response field: 'id' → 'job_id' (was breaking all downstream steps)
  - Remove language=auto: not a valid ISO 639-1 code; omit field for auto-detect
  - Make BASE and AUDIO configurable via env vars (WHISPER_BASE_URL, TEST_AUDIO)
  - Fix DELETE assertion: completed jobs return 409 Conflict, not 204
  - Add explicit zero-segments failure check in quality inspection (step 9)
  - Add progress reporting to poll loop
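
The checks listed above can be illustrated with a small Rust sketch of the same logic (the script itself is shell). Only the env var names WHISPER_BASE_URL and TEST_AUDIO and the `language` field come from the commit message. The default values, function names, and the two-letter shape check are assumptions made for illustration.

```rust
use std::env;

/// Settings the test run reads from the environment.
#[derive(Debug, PartialEq)]
struct TestConfig {
    base_url: String,
    audio_path: String,
}

/// Returns the env value if present and non-empty, else the default.
fn resolve(value: Option<String>, default: &str) -> String {
    match value {
        Some(v) if !v.trim().is_empty() => v,
        _ => default.to_string(),
    }
}

/// Reads WHISPER_BASE_URL and TEST_AUDIO. Both defaults are placeholders,
/// not the values used by test_all.sh.
fn load_config() -> TestConfig {
    TestConfig {
        base_url: resolve(env::var("WHISPER_BASE_URL").ok(), "http://localhost:8080"),
        audio_path: resolve(env::var("TEST_AUDIO").ok(), "test.wav"),
    }
}

/// Builds the optional `language` form field. Anything that does not look
/// like a two-letter ISO 639-1 code (including "auto") yields no field, so
/// the server falls back to auto-detection. This checks shape only, not
/// membership in the ISO registry.
fn language_field(language: Option<&str>) -> Option<(&'static str, String)> {
    let code = language?.trim().to_ascii_lowercase();
    if code.len() == 2 && code.chars().all(|c| c.is_ascii_lowercase()) {
        Some(("language", code))
    } else {
        None
    }
}

/// Quality check: a transcript with zero segments counts as a failure.
fn check_segments(count: usize) -> Result<(), String> {
    if count == 0 {
        Err("transcription returned zero segments".to_string())
    } else {
        Ok(())
    }
}
```

Leaving the field out instead of sending a sentinel value keeps the request valid for a server that only accepts ISO 639-1 codes.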

docs/FINDINGS.md + KNOWLEDGE.md:
  Document cold GPU warmup issue, root cause, and fix.
  Document language=auto as invalid API usage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-05-06 11:57:30 +02:00

ARCHITECTURE.md

docs: add ARCHITECTURE, CODE_STYLE, FINDINGS, USAGE under docs/

2026-05-06 10:17:53 +02:00

CODE_STYLE.md

docs: add ARCHITECTURE, CODE_STYLE, FINDINGS, USAGE under docs/

2026-05-06 10:17:53 +02:00

FINDINGS.md

fix: GPU warmup on startup + fix test_all.sh + document cold-GPU finding

2026-05-06 11:57:30 +02:00

USAGE.md

docs: add ARCHITECTURE, CODE_STYLE, FINDINGS, USAGE under docs/

2026-05-06 10:17:53 +02:00