Open
Description
It would be nice to start measuring the word error rate (WER) of whisper.cpp
across a representative set of datasets:
- short audio
- long audio
- English
- non-English
- etc.
This will help us catch regressions in the future. I'm not familiar with what is typically used for speech-to-text (ASR) WER benchmarks, so I'm looking for help from the community.
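
For reference, WER is usually defined as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words: (substitutions + deletions + insertions) / N. Below is a rough Python sketch of that calculation, not a proposed implementation. The function name `word_error_rate` is made up for illustration, and the normalization (lowercasing and splitting on whitespace) is an assumption; a real benchmark would likely need more careful text normalization (punctuation, numbers, etc.).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only one previous row.
    # Before processing reference word i, prev[j] is the distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr

    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the output "the cat sit on mat" gives one substitution and one deletion over six reference words, so a WER of 2/6. Note that WER can exceed 1.0 when the output contains many inserted words.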