fix: disable previous-text conditioning to prevent end-of-file loops

set_no_context(true) stops whisper from feeding its own output back as
context for the next segment. Without this, at the end of the audio the
model hallucinates a phrase ('All right.', 'So I think we're going to
wrap up.') and repeats it hundreds of times in a tight loop.

Observed: 759x 'All right.' + 750x 'So I think we're going to wrap up.'
in the final 8 seconds of a 101-minute YouTube conference recording.
After the fix: clean termination with no repetition loops.
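
A count like the one above can be reproduced with a small check over the transcribed segments. The following is a minimal sketch, not part of this commit: it assumes the segments are already available as plain strings and that the loop shows up as back-to-back identical segments. The names `longest_repeat_run` and `has_repetition_loop` and the `max_run` threshold are hypothetical.

```rust
/// Returns the text and length of the longest run of consecutive segments
/// that are identical after trimming whitespace. Returns `None` for empty input.
fn longest_repeat_run<'a>(segments: &[&'a str]) -> Option<(&'a str, usize)> {
    let mut best: Option<(&'a str, usize)> = None;
    let mut current: Option<(&'a str, usize)> = None;

    for &raw in segments {
        let text = raw.trim();
        // Extend the current run if the text matches, otherwise start a new one.
        current = match current {
            Some((prev, n)) if prev == text => Some((prev, n + 1)),
            _ => Some((text, 1)),
        };
        if let Some((t, n)) = current {
            // Keep the first run that reaches a given length.
            if best.map_or(true, |(_, b)| n > b) {
                best = Some((t, n));
            }
        }
    }
    best
}

/// Hypothetical guard: true when any segment repeats back to back
/// more than `max_run` times.
fn has_repetition_loop(segments: &[&str], max_run: usize) -> bool {
    longest_repeat_run(segments).map_or(false, |(_, n)| n > max_run)
}
```

Run over the segment list of a finished transcription, a check of this kind could flag a regression if previous-text conditioning were ever re-enabled; the threshold would need tuning, since speakers do legitimately repeat short phrases.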

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mozempk
2026-05-06 00:14:56 +02:00
parent 2176206afe
commit 9a36000062

@@ -68,6 +68,9 @@ impl Transcriber {
 fp.set_logprob_thold(-1.0);
 fp.set_suppress_blank(true);
 fp.set_suppress_non_speech_tokens(true);
+// Prevent repetition loops on long audio: do not feed the previous
+// segment's text back as context for the next segment.
+fp.set_no_context(true);
 fp.set_print_progress(false);
 fp.set_print_realtime(false);