Realtime mode will display intermediate transcriptions and translations as they are being processed.
It is recommended to run the AI model on a GPU (CUDA) for realtime mode.
To improve the speed of realtime mode, you can adjust the following advanced settings (under the Advanced -> Settings tab):

Setting | Description |
---|---|
`model` | Defines the Whisper model for processing. Can improve speed in exchange for additional RAM usage and less precision. (If no `realtime_whisper_model` is set, this model is used for realtime processing as well.) |
`whisper_precision` | Defines the precision of the Whisper model. (`float16` should be faster than `float32` with CUDA.) |
`stt_type` | `faster_whisper` is faster than `original_whisper` with the same accuracy. |
`temperature_fallback` | Enables or disables the temperature fallback when processing audio. Disabling this can speed up processing and reduce stuck processing. |
`beam_size` | Defines the beam size of the Whisper model. Can improve speed in exchange for less precision. (lower = faster) |
`condition_on_previous_text` | Enables or disables conditioning on previous text. Disabling this can speed up processing and reduce stuck processing, but later results will relate less to previous transcripts. (Used for both realtime and regular processing.) |
`whisper_num_workers` | Increasing this can possibly speed up processing. |
`realtime_frame_multiply` | Defines the minimum frame size of audio snippets. The default value should be fine for most cases. |
`realtime_frequency_time` | Defines the interval (in seconds, fractions allowed) at which the recorded audio snippets are processed. Setting this too low can have a negative effect on realtime speed, as the GPU might not be able to keep up. Setting this to 0.5 (every 500 milliseconds) can already work nicely with a good GPU and a medium model. |
`realtime_whisper_beam_size` | Defines the beam size of the Whisper model for realtime processing. Can improve speed in exchange for less precision. (lower = faster) |
`realtime_temperature_fallback` | Enables or disables the temperature fallback when processing realtime audio. Disabling this can speed up processing and reduce stuck processing. |

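
The trade-off described for `realtime_frequency_time` can be illustrated with a small sketch. It is a simplified model, not the application's actual scheduling code: the function names and the idea of a single measured per-pass duration are assumptions made for illustration only.

```python
def passes_per_second(realtime_frequency_time: float) -> float:
    """Number of realtime processing passes triggered per second."""
    if realtime_frequency_time <= 0:
        raise ValueError("realtime_frequency_time must be positive")
    return 1.0 / realtime_frequency_time


def keeps_up(measured_pass_seconds: float, realtime_frequency_time: float) -> bool:
    """True if one pass finishes before the next one is due.

    measured_pass_seconds is a hypothetical value you would measure yourself
    (how long one transcription pass takes on your hardware).
    """
    return measured_pass_seconds <= realtime_frequency_time


if __name__ == "__main__":
    # With 0.5, a pass is triggered every 500 milliseconds.
    print(passes_per_second(0.5))
    # A pass taking 0.3 s fits into the interval; one taking 0.8 s does not.
    print(keeps_up(0.3, 0.5))
    print(keeps_up(0.8, 0.5))
```

If a pass regularly takes longer than the configured interval, raising `realtime_frequency_time` or lowering `realtime_whisper_beam_size` are the settings from the table above that are most likely to help.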
Note:
Optionally, you can also define a separate Whisper model for realtime processing with the following settings. (This is not really recommended, though, as it increases memory usage, and using a smaller model only for realtime processing will increase the differences between the realtime results and the final result.)

Setting | Description |
---|---|
`realtime_whisper_model` | Optionally defines a separate Whisper model for realtime processing. Can improve speed in exchange for additional RAM usage and less precision. |
`realtime_whisper_precision` | Only used when `realtime_whisper_model` is set. Defines the precision of the Whisper model. (`float16` should be faster than `float32` with CUDA.) |

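
The fallback rule (the main `model` is used for realtime processing unless `realtime_whisper_model` is set) could be sketched as follows. This is an illustration under stated assumptions, not the application's code: the settings are modeled as a plain dict, an unset value is assumed to be a missing key or an empty string, and falling back to `whisper_precision` when no realtime precision is given is also an assumption.

```python
from typing import Tuple


def resolve_realtime_model(settings: dict) -> Tuple[str, str]:
    """Return the (model, precision) pair assumed to be used for realtime processing."""
    realtime_model = settings.get("realtime_whisper_model")
    if realtime_model:
        # realtime_whisper_precision only matters when a realtime model is set.
        # Assumption: fall back to the main precision if it is missing or empty.
        precision = settings.get("realtime_whisper_precision") or settings["whisper_precision"]
        return realtime_model, precision
    # No separate realtime model: the main model handles realtime processing too.
    return settings["model"], settings["whisper_precision"]


if __name__ == "__main__":
    shared = {"model": "medium", "whisper_precision": "float16"}
    separate = dict(shared, realtime_whisper_model="tiny",
                    realtime_whisper_precision="float32")
    print(resolve_realtime_model(shared))
    print(resolve_realtime_model(separate))
```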
Save the Settings by scrolling down and clicking on Save.
All realtime settings are applied immediately without requiring a restart, except `realtime_whisper_model` and `realtime_whisper_precision`, which require a restart.
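
As a rough sketch of the restart rule, the check below compares two settings snapshots and reports whether a restart would be needed. The helper names and the dict representation of the settings are hypothetical; only the two setting names come from the text above.

```python
# The only two settings named above as requiring a restart.
RESTART_REQUIRED = frozenset({"realtime_whisper_model", "realtime_whisper_precision"})


def changed_settings(old: dict, new: dict) -> set:
    """Names of settings whose value differs between the two snapshots."""
    keys = old.keys() | new.keys()
    return {key for key in keys if old.get(key) != new.get(key)}


def needs_restart(old: dict, new: dict) -> bool:
    """True if any changed setting is one that only takes effect after a restart."""
    return bool(changed_settings(old, new) & RESTART_REQUIRED)


if __name__ == "__main__":
    before = {"realtime_frequency_time": 1.0, "realtime_whisper_model": ""}
    after_live = dict(before, realtime_frequency_time=0.5)
    after_model = dict(before, realtime_whisper_model="tiny")
    print(needs_restart(before, after_live))
    print(needs_restart(before, after_model))
```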