
Avoid computing higher temperatures on no_speech segments (#1279)

* Avoid computing higher temperatures on no_speech

In decode_with_fallback, we retry decoding at higher temperatures when compression_ratio is too high or avg_logprob is too low.
But since the computation of no_speech_prob doesn't depend on sampling, we can skip the higher temperatures if the first pass already shows that the no_speech condition is fulfilled.

* Update transcribe.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Théo BOYER 1 year ago
parent
commit
e334ff141d
1 changed file with 5 additions and 1 deletion

+ 5 - 1
whisper/transcribe.py

@@ -174,7 +174,11 @@ def transcribe(
                 and decode_result.avg_logprob < logprob_threshold
             ):
                 needs_fallback = True  # average log probability is too low
-
+            if (
+                no_speech_threshold is not None
+                and decode_result.no_speech_prob > no_speech_threshold
+            ):
+                needs_fallback = False  # silence
             if not needs_fallback:
                 break
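
The control flow of the patched loop can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the actual code in whisper/transcribe.py: the DecodeResult dataclass, the decode callable, and the default threshold values are assumptions made for the example, and only the variable and threshold names are taken from the diff above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class DecodeResult:
    # Hypothetical stand-in holding only the fields the fallback check reads.
    avg_logprob: float
    no_speech_prob: float
    compression_ratio: float
    temperature: float


def decode_with_fallback(
    decode: Callable[[float], DecodeResult],  # hypothetical: decodes one segment at a temperature
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),  # assumed defaults
    compression_ratio_threshold: Optional[float] = 2.4,  # assumed default
    logprob_threshold: Optional[float] = -1.0,  # assumed default
    no_speech_threshold: Optional[float] = 0.6,  # assumed default
) -> DecodeResult:
    if not temperatures:
        raise ValueError("at least one temperature is required")

    for t in temperatures:
        decode_result = decode(t)
        needs_fallback = False

        if (
            compression_ratio_threshold is not None
            and decode_result.compression_ratio > compression_ratio_threshold
        ):
            needs_fallback = True  # output is too repetitive
        if (
            logprob_threshold is not None
            and decode_result.avg_logprob < logprob_threshold
        ):
            needs_fallback = True  # average log probability is too low
        if (
            no_speech_threshold is not None
            and decode_result.no_speech_prob > no_speech_threshold
        ):
            # no_speech_prob does not depend on sampling, so retrying at a
            # higher temperature cannot change it: stop here.
            needs_fallback = False  # silence
        if not needs_fallback:
            break

    return decode_result


def make_fake_decoder(no_speech_prob: float, calls: List[float]):
    # Hypothetical decoder that always returns a low-confidence result
    # and records the temperatures it was asked to decode at.
    def fake_decode(t: float) -> DecodeResult:
        calls.append(t)
        return DecodeResult(
            avg_logprob=-2.0,
            no_speech_prob=no_speech_prob,
            compression_ratio=1.0,
            temperature=t,
        )

    return fake_decode


silent_calls: List[float] = []
decode_with_fallback(make_fake_decoder(0.9, silent_calls))

speech_calls: List[float] = []
decode_with_fallback(make_fake_decoder(0.1, speech_calls))
```

With the fake decoder, the silent segment is decoded only once even though its avg_logprob is below the threshold, while the segment with a low no_speech_prob still walks through every temperature. Placing the no_speech check last lets it override the two earlier conditions, which matches the order used in the diff.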