
Hallucination for fully silent audio input #133

Open

yijiegao2006 opened this issue Oct 28, 2024 · 1 comment

yijiegao2006 commented Oct 28, 2024

If the audio input is completely silent, the output is hallucinated text such as "MBC 뉴스 이재경입니다." (roughly, "This is Lee Jae-kyung, MBC News.").

It seems VAD (voice activity detection) has been applied in SYSTRAN/faster-whisper, since a sample with a long silence followed by speech at the end returns the correct text.

Contributor

thiswillbeyourgithub commented Oct 29, 2024

This is inherent to all Whisper-based models.

This is also related to #108: cleaning up the audio returns a very short result if the input is silence, so it would be ignored for falling under the duration threshold.
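To illustrate the idea, here is a minimal sketch of that kind of check: trim the silence, then skip transcription when what remains is shorter than a minimum duration. This is not this project's actual implementation. The function names, the amplitude threshold, and the 0.5 s minimum are all assumptions chosen for illustration, and the samples are assumed to be mono floats in the range -1.0 to 1.0.

```python
import math
from typing import List, Sequence


def trim_silence(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing samples quieter than `threshold` (hypothetical helper)."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        # Nothing above the threshold: the whole clip counts as silence.
        return []
    return list(samples[loud[0]:loud[-1] + 1])


def should_transcribe(
    samples: Sequence[float],
    sample_rate: int = 16000,
    min_duration_s: float = 0.5,  # assumed duration threshold
    threshold: float = 0.01,      # assumed amplitude threshold
) -> bool:
    """Return True only if enough non-silent audio remains after trimming."""
    trimmed = trim_silence(samples, threshold)
    return len(trimmed) / sample_rate >= min_duration_s


# One second of silence, and one second of a 440 Hz tone at half amplitude.
silent_clip = [0.0] * 16000
tone_clip = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
# Long silence followed by sound at the end, as in the case described above.
late_speech_clip = [0.0] * 32000 + tone_clip

print(should_transcribe(silent_clip))
print(should_transcribe(tone_clip))
print(should_transcribe(late_speech_clip))
```

With a gate like this, a fully silent clip never reaches the model, so there is nothing for it to hallucinate on. A plain amplitude threshold is much cruder than a real VAD and will be fooled by background noise.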
