Use UTF-8 encoding to save the txt and vtt files (#37)

Explicitly set the text encoding to UTF-8 in order to avoid UnicodeEncodeErrors

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
hanacchi 2 years ago
parent
commit c85eaaae29
1 changed file with 2 additions and 2 deletions
+ 2 - 2
whisper/transcribe.py

@@ -289,11 +289,11 @@ def cli():
         audio_basename = os.path.basename(audio_path)
 
         # save TXT
-        with open(os.path.join(output_dir, audio_basename + ".txt"), "w") as txt:
+        with open(os.path.join(output_dir, audio_basename + ".txt"), "w", encoding="utf-8") as txt:
             print(result["text"], file=txt)
 
         # save VTT
-        with open(os.path.join(output_dir, audio_basename + ".vtt"), "w") as vtt:
+        with open(os.path.join(output_dir, audio_basename + ".vtt"), "w", encoding="utf-8") as vtt:
             write_vtt(result["segments"], file=vtt)
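
The failure this change avoids can be reproduced in isolation. The following is a minimal sketch, not code from the repository: it assumes a platform whose default text encoding cannot represent the transcript and simulates that by passing `encoding="ascii"` explicitly. The sample text and the file name `sample.txt` are illustrative.

```python
import os
import tempfile

# Illustrative non-ASCII transcript text (not taken from the repository).
text = "こんにちは、世界"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # Assumption: "ascii" stands in for a platform default encoding that
    # cannot represent the text (open() uses the locale's preferred
    # encoding when no encoding argument is given).
    try:
        with open(path, "w", encoding="ascii") as txt:
            print(text, file=txt)
        failed = False
    except UnicodeEncodeError:
        failed = True

    # With an explicit UTF-8 encoding, the same write succeeds everywhere.
    with open(path, "w", encoding="utf-8") as txt:
        print(text, file=txt)

    # Read the file back with the same encoding to confirm the round trip.
    with open(path, encoding="utf-8") as txt:
        round_trip = txt.read()
```

Passing `encoding="utf-8"` makes the output independent of the machine's locale settings, which is presumably why the commit sets it on both the TXT and VTT writers.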