Labels: help wanted, high priority, research🔬, roadmap
Description
It would be nice to start measuring the word error rate (WER) of whisper.cpp across a representative set of data:
- short audio
- long audio
- English
- non-English
- etc.
This will help us catch regressions in the future. I'm not familiar with what is typically used for speech-to-text (ASR) WER benchmarks, so I'm looking for help from the community.
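For reference, here is a minimal sketch of how WER could be computed, assuming the usual definition: word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function names (`edit_distance`, `word_error_rate`, `corpus_wer`) are hypothetical and not part of whisper.cpp. The normalization here (lowercasing and whitespace splitting) is a simplifying assumption; a real benchmark would likely need a proper text normalizer for punctuation, numbers, and non-English scripts.

```python
from typing import Iterable, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and hyp[:j]; start with the empty reference prefix.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1]


def tokenize(text: str) -> List[str]:
    # Assumption: naive normalization, lowercase + whitespace split.
    return text.lower().split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for a single utterance."""
    ref = tokenize(reference)
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, tokenize(hypothesis)) / len(ref)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) pairs: total edits / total reference words."""
    total_edits = 0
    total_words = 0
    for reference, hypothesis in pairs:
        ref = tokenize(reference)
        total_edits += edit_distance(ref, tokenize(hypothesis))
        total_words += len(ref)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    return total_edits / total_words
```

Aggregating edits over the whole corpus, rather than averaging per-utterance WER, keeps short clips from dominating the score, which seems relevant when mixing short and long audio.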