Update decoding.py (#1155)

* Update decoding.py

Following the suggestions of @Jeronymous in https://github.com/openai/whisper/pull/914 and https://github.com/openai/whisper/discussions/924, this change fixes the endless-loop problem.

* Removed blank line and whitespaces in empty lines.

* Suggested changes according to the linter

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Fernando O. Gallego 2 years ago
commit b0022b3283
1 changed file with 7 additions and 0 deletions

whisper/decoding.py

@@ -471,6 +471,13 @@ class ApplyTimestampRules(LogitFilter):
                 # timestamps shouldn't decrease; forbid timestamp tokens smaller than the last
                 logits[k, self.tokenizer.timestamp_begin : timestamps[-1]] = -np.inf
 
+                # to force that timestamps are strictly increasing
+                if last_was_timestamp and not penultimate_was_timestamp:
+                    timestamp_last = timestamps[-1]
+                else:
+                    timestamp_last = timestamps[-1] + 1
+                logits[k, self.tokenizer.timestamp_begin : timestamp_last] = -np.inf
+
         if tokens.shape[1] == self.sample_begin:
             # suppress generating non-timestamp tokens at the beginning
             logits[:, : self.tokenizer.timestamp_begin] = -np.inf
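
The added lines in the hunk above can be tried in isolation with the following minimal sketch. It is not the code in whisper/decoding.py: `mask_timestamps` is a hypothetical helper, it uses NumPy arrays for a single sequence instead of the batched tensors in the diff, and the way `last_was_timestamp`, `penultimate_was_timestamp`, and `timestamps` are derived is an assumption, since those definitions fall outside the hunk.

```python
import numpy as np


def mask_timestamps(logits, sampled_tokens, timestamp_begin):
    """Hypothetical single-sequence version of the rule added in the diff.

    Token ids >= timestamp_begin are treated as timestamp tokens
    (an assumption; the definitions are not shown in the hunk).
    """
    logits = np.array(logits, dtype=float)  # work on a copy
    seq = list(sampled_tokens)

    # Assumed definitions of the two flags used by the added lines.
    last_was_timestamp = len(seq) >= 1 and seq[-1] >= timestamp_begin
    penultimate_was_timestamp = len(seq) < 2 or seq[-2] >= timestamp_begin

    timestamps = [t for t in seq if t >= timestamp_begin]
    if timestamps:
        if last_was_timestamp and not penultimate_was_timestamp:
            # A segment was just closed: the same timestamp may be
            # repeated once to open the next segment.
            timestamp_last = timestamps[-1]
        else:
            # Otherwise the next timestamp must be strictly larger,
            # so a segment cannot have zero length.
            timestamp_last = timestamps[-1] + 1
        logits[timestamp_begin:timestamp_last] = -np.inf
    return logits


# Toy vocabulary: ids 0-9 are text tokens, ids 10-15 are timestamps.
closed = mask_timestamps(np.zeros(16), [10, 3, 4, 12], timestamp_begin=10)
in_text = mask_timestamps(np.zeros(16), [12, 3], timestamp_begin=10)
```

In this toy setup, `closed` still permits token 12 (the segment just ended, so the same timestamp can open the next one), while `in_text` forbids it, so the closing timestamp has to be later than the opening one.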