fix: disable previous-text conditioning to prevent end-of-file loops
All checks were successful
Build & Push Docker Image / build-and-push (push) Successful in 6m41s
set_no_context(true) stops whisper from feeding its own output back as
context for the next segment. Without this, at audio end the model
hallucinates a phrase ('All right.', 'So I think we're going to wrap up.')
and repeats it hundreds of times in a tight loop.
Observed: 759x 'All right.' + 750x 'So I think we're going to wrap up.'
in the final 8 seconds of a 101-minute YouTube conference recording.
After fix: clean termination with no repetition loops.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@@ -68,6 +68,9 @@ impl Transcriber {
 fp.set_logprob_thold(-1.0);
 fp.set_suppress_blank(true);
 fp.set_suppress_non_speech_tokens(true);
+// Prevent repetition loops on long audio: do not feed the previous
+// segment's text back as context for the next segment.
+fp.set_no_context(true);
 
 fp.set_print_progress(false);
 fp.set_print_realtime(false);
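One way to check a transcript for the loop described in the commit message is to count the longest run of identical consecutive segments. The sketch below is not part of this commit. `longest_repeat_run` and `has_repetition_loop` are hypothetical helper names, and it assumes the segment texts have already been collected into a slice of strings. It only catches a phrase repeated back to back, so two phrases alternating with each other would not be flagged.

```rust
/// Returns the longest run of consecutive identical segment texts as
/// (text, run_length), or None if `segments` is empty.
/// Hypothetical helper: not part of the commit or of any whisper binding.
fn longest_repeat_run(segments: &[&str]) -> Option<(String, usize)> {
    let mut best: Option<(&str, usize)> = None;
    let mut current: Option<(&str, usize)> = None;

    for &seg in segments {
        // Ignore leading/trailing whitespace when comparing segments.
        let text = seg.trim();

        // Extend the current run if the text repeats, otherwise start a new run.
        current = match current {
            Some((prev, n)) if prev == text => Some((prev, n + 1)),
            _ => Some((text, 1)),
        };

        // Keep the first run that reaches the maximum length.
        if let Some((t, n)) = current {
            if best.map_or(true, |(_, b)| n > b) {
                best = Some((t, n));
            }
        }
    }

    best.map(|(t, n)| (t.to_string(), n))
}

/// True if any segment text repeats at least `threshold` times in a row.
/// The threshold is an assumption to tune per use case.
fn has_repetition_loop(segments: &[&str], threshold: usize) -> bool {
    longest_repeat_run(segments).map_or(false, |(_, n)| n >= threshold)
}
```

A run in the hundreds, like the 759x 'All right.' reported above, would clear any reasonable threshold. A clean transcript should normally report runs of length 1 or 2.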