Step 9 used 'echo $RESULT | python3 - << HEREDOC' which is a bash gotcha:
the heredoc takes over stdin (as the script source), so the pipe is
silently ignored and sys.stdin.read() returns empty string → JSONDecodeError.
Fix: write RESULT to a temp file and pass it as sys.argv[1] to the script.
Also removed the old buggy test suite that was accidentally left appended
at lines 181-327 (had language=auto, ['id'] field, wrong DELETE assertion).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GPU warmup (src/transcriber.rs):
After creating WhisperState, run a 1s silent inference pass in load().
CUDA JIT-compiles device kernels on the first whisper_full_with_state call.
On a cold GPU this compilation disrupts the decode pipeline mid-inference,
returning 0 segments in ~0.5s. The warmup forces all kernel compilation at
startup so the first real job runs on fully compiled kernels.
test_all.sh:
- Fix submit response field: 'id' → 'job_id' (was breaking all downstream steps)
- Remove language=auto: not a valid ISO 639-1 code; omit field for auto-detect
- Make BASE and AUDIO configurable via env vars (WHISPER_BASE_URL, TEST_AUDIO)
- Fix DELETE assertion: completed jobs return 409 Conflict, not 204
- Add explicit zero-segments failure check in quality inspection (step 9)
- Add progress reporting to poll loop
docs/FINDINGS.md + KNOWLEDGE.md:
Document cold GPU warmup issue, root cause, and fix.
Document language=auto as invalid API usage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>