kaldi-asr/kaldi

WSJ setup is a mess

Open

#1846, created on August 28, 2017

Labels: help wanted, stale-exclude, stopped development

Description

[I hit send too soon on this; I'm updating the comment.]

I think the time might have come to create an 's5b' version of the WSJ setup. WSJ is the oldest setup and the local scripts are not up to the standard of clarity that we usually expect. Some specific issues:

  • The dictionary preparation scripts (for the larger dictionary) use some special-purpose scripts that I created a long time ago; Phonetisaurus would probably be a better choice.
  • Preparation and cleaning of the language modeling data is unnecessarily mixed up with the dictionary preparation; it's better to keep separate things separate.
  • The scripts in local/ need to have much clearer and cleaner interfaces; it needs to be clear what the inputs and outputs are.
  • I'm not convinced that I like the way the text is normalized. Lower-case is more common than upper-case in ASR setups these days, so let's make the text lower-case, use <unk> (lower-case, and more standard) instead of <SPOKEN_NOISE>, and make the phone names lower-case instead of upper-case.
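
To make the normalization in the last point concrete, here is a minimal sketch in Python. It is only an illustration, not a proposed script for the recipe: the function names are hypothetical, and it assumes transcript lines have the form "utt-id word1 word2 ..." and lexicon lines the form "WORD PH1 PH2 ...".

```python
# Hypothetical helpers; the names and the line formats are assumptions
# made for illustration, not part of the existing WSJ scripts.

OLD_UNK = "<SPOKEN_NOISE>"
NEW_UNK = "<unk>"


def normalize_word(word):
    """Map the old unknown-word symbol to <unk>; lower-case everything else."""
    return NEW_UNK if word == OLD_UNK else word.lower()


def normalize_transcript_line(line):
    """Normalize one 'utt-id word1 word2 ...' line, leaving the utt-id as-is."""
    parts = line.split()
    if not parts:
        return ""
    utt_id, words = parts[0], parts[1:]
    return " ".join([utt_id] + [normalize_word(w) for w in words])


def normalize_lexicon_line(line):
    """Normalize one 'WORD PH1 PH2 ...' line: lower-case the word and phones."""
    parts = line.split()
    if not parts:
        return ""
    word, phones = parts[0], parts[1:]
    return " ".join([normalize_word(word)] + [p.lower() for p in phones])


if __name__ == "__main__":
    print(normalize_transcript_line("utt1 THE SALE <SPOKEN_NOISE>"))
    print(normalize_lexicon_line("ABOUT AH0 B AW1 T"))
```

The utterance ID is left untouched on purpose, since other data files refer to it. Other bracketed symbols would simply be lower-cased by this sketch, which may or may not be what we want.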

Part of my motivation is that we'll be doing some RNNLM stuff with this setup (since we have example scripts for older setups) and the scripts need to be cleaner. I don't know if anyone has the time and inclination to work on this?
