[TTS] wavernn synthesis error · PaddlePaddle/PaddleSpeech#2653

Repository metrics

Stars: 9,453
PR merge metrics: average merge time 45d 19h; 6 merged PRs in 30d

Description

I am synthesizing with fastspeech2 and wavernn, using the following command:

python3 /home/aistudio/PaddleSpeech/paddlespeech/t2s/exps/fastspeech2/../synthesize_e2e.py \
--am=fastspeech2_mix \
--am_config=/home/aistudio/checkpoint/fastspeech2_mix_ckpt_1.2.0/default.yaml \
--am_ckpt=/home/aistudio/work/exp_spkxtdt/output/checkpoints/snapshot_iter_113958.pdz \
--am_stat=/home/aistudio/checkpoint/fastspeech2_mix_ckpt_1.2.0/speech_stats.npy \
--voc=wavernn_csmsc \
--voc_config=/home/aistudio/checkpoint/wavernn_csmsc_ckpt_0.2.0/default.yaml \
--voc_ckpt=/home/aistudio/checkpoint/wavernn_csmsc_ckpt_0.2.0/snapshot_iter_400000.pdz \
--voc_stat=/home/aistudio/checkpoint/wavernn_csmsc_ckpt_0.2.0/feats_stats.npy \
--lang=mix \
--text=/home/aistudio/work/exp_spkxtdt/sentence.txt \
--output_dir=/home/aistudio/work/exp_spkxtdt/wav_out_rnn \
--phones_dict=/home/aistudio/checkpoint/fastspeech2_mix_ckpt_1.2.0/phone_id_map.txt \
--speaker_dict=/home/aistudio/checkpoint/fastspeech2_mix_ckpt_1.2.0/speaker_id_map.txt \
--spk_id=0 \
--ngpu=1

In my tests, fs2+pwg runs normally, and fs2+wavernn synthesizes successfully when called as shown below. Only synthesis through synthesize_e2e.py fails.

import os

from paddlespeech.cli.tts.infer import TTSExecutor

# am_ckpt, out_am_path and out_vocoder_path are not defined in this snippet;
# they are set elsewhere in the reporter's environment.
tts = TTSExecutor()
tts(text="欢迎使用飞桨！",
    output="output.wav",
    am="fastspeech2_mix",
    am_ckpt=am_ckpt,  # path to the fine-tuned model
    am_config=os.path.join(out_am_path, "default.yaml"),
    am_stat=os.path.join(out_am_path, "speech_stats.npy"),

    phones_dict=os.path.join(out_am_path, "phone_id_map.txt"),
    speaker_dict=os.path.join(out_am_path, "speaker_id_map.txt"),
    spk_id=0,

    voc="wavernn_csmsc",
    voc_config=os.path.join(out_vocoder_path, "default.yaml"),
    voc_ckpt=os.path.join(out_vocoder_path, "snapshot_iter_400000.pdz"),
    voc_stat=os.path.join(out_vocoder_path, "feats_stats.npy"),
    lang="mix")

Contributor guide

Research direction: Compare the synthesize_e2e.py script execution path with the TTSExecutor API to identify differences in model loading or inference logic that cause the failure.
Tech stack: python
Domain: backend
Issue type: Bug
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Mostly clear
Prerequisites: Python, PaddleSpeech
Newbie friendliness: 65
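
Before digging into the model loading code, it may help to confirm that the failing command and the working TTSExecutor call really receive the same settings. The following is a minimal sketch of such a check in plain Python. The helper names `parse_flags` and `diff_settings` are hypothetical and not part of PaddleSpeech, and the sketch only compares arguments; it does not reproduce or explain the error itself.

```python
import shlex


def parse_flags(command: str) -> dict:
    """Collect '--key=value' tokens from a shell command into a dict.

    Hypothetical helper for illustration; not a PaddleSpeech API.
    """
    # A backslash followed by a newline is a shell line continuation;
    # replace it with a space so shlex sees one logical line.
    command = command.replace("\\\n", " ")
    flags = {}
    for token in shlex.split(command):
        if token.startswith("--") and "=" in token:
            key, value = token[2:].split("=", 1)
            flags[key] = value
    return flags


def diff_settings(cli_flags: dict, api_kwargs: dict) -> dict:
    """Return {key: (cli_value, api_value)} for keys that differ.

    Values are compared as strings, because CLI flags are always text
    while keyword arguments may be ints. A key missing on one side is
    reported with None on that side.
    """
    diff = {}
    for key in sorted(set(cli_flags) | set(api_kwargs)):
        cli_value = cli_flags.get(key)
        api_value = api_kwargs.get(key)
        cli_text = None if cli_value is None else str(cli_value)
        api_text = None if api_value is None else str(api_value)
        if cli_text != api_text:
            diff[key] = (cli_text, api_text)
    return diff


if __name__ == "__main__":
    # Shortened version of the command from the description.
    command = (
        "python3 synthesize_e2e.py \\\n"
        "--am=fastspeech2_mix \\\n"
        "--voc=wavernn_csmsc \\\n"
        "--lang=mix \\\n"
        "--spk_id=0 \\\n"
        "--ngpu=1"
    )
    # Matching subset of the keyword arguments passed to TTSExecutor.
    api_kwargs = {
        "am": "fastspeech2_mix",
        "voc": "wavernn_csmsc",
        "lang": "mix",
        "spk_id": 0,
    }
    for key, (cli_value, api_value) in diff_settings(
            parse_flags(command), api_kwargs).items():
        print(f"{key}: cli={cli_value!r} api={api_value!r}")
```

Any key reported here, such as a flag present only on the command line, is a candidate for closer inspection when stepping through the two code paths.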
