Reached max decoder steps 5000

gaoguoyao · May 24, 2022, 02:57

Problem location: B22
The earlier steps all went fine, and tts_melgan.nemo also loads without problems.

[attach]19880[/attach]

[attach]19879[/attach]

During TTS training, the spectrogram is not output correctly. What could be causing this?
Many thanks!

YWHW · May 24, 2022, 03:37

How many epochs did you train for? Please post the code for the Tacotron2 acoustic model training part.

gaoguoyao · May 24, 2022, 11:47

decoder:
  _target_: nemo.collections.tts.modules.tacotron2.Decoder
  decoder_rnn_dim: 1024
  encoder_embedding_dim: ${model.encoder.encoder_embedding_dim}
  gate_threshold: 0.4
  max_decoder_steps: 10000
  n_frames_per_step: 1  # currently only 1 is supported
  n_mel_channels: ${n_mels}
  p_attention_dropout: 0.1
  p_decoder_dropout: 0.1
  prenet_dim: 256
  prenet_p_dropout: 0.5
  # Attention parameters
  attention_dim: 128
  attention_rnn_dim: 1024
  # AttentionLocation Layer parameters
  attention_location_kernel_size: 31
  attention_location_n_filters: 32
  early_stopping: true

################################################################
! HYDRA_FULL_ERROR=1 python tacotron2.py \
    train_dataset=/home/x/2022q2/conv_nemo/manifest/train_tts_6th.json \
    validation_datasets=/home/x/2022q2/conv_nemo/manifest/test_tts_6th.json \
    trainer.max_epochs=1600 \
    trainer.accelerator=null \
    trainer.check_val_every_n_epoch=1

Thanks, and sorry for the trouble.

YWHW · May 24, 2022, 12:11

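As the number of sentences grows, the model becomes harder to fit, so it needs more training. Try setting the number of training epochs to 5000.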
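
For anyone wondering why too little training shows up as "Reached max decoder steps": a Tacotron2-style decoder keeps emitting mel frames until its stop ("gate") probability exceeds gate_threshold, and gives up once max_decoder_steps is reached. The sketch below is only an illustration of that stopping rule in plain Python, not NeMo's actual code; `decode_until_stop` and the two toy gate functions are hypothetical names, and the defaults are taken from the config posted above.

```python
import math


def decode_until_stop(gate_logit_at, gate_threshold=0.4, max_decoder_steps=10000):
    """Simulate the stopping rule of a Tacotron2-style decoder.

    gate_logit_at(step) is a stand-in (hypothetical) for the model's raw
    stop-token output at a given decoder step.
    Returns (frames_emitted, hit_step_limit).
    """
    for step in range(1, max_decoder_steps + 1):
        stop_prob = 1.0 / (1.0 + math.exp(-gate_logit_at(step)))  # sigmoid
        if stop_prob > gate_threshold:
            return step, False  # the model decided the utterance is finished
    return max_decoder_steps, True  # never stopped: "Reached max decoder steps"


# Toy model that has learned where the utterance ends (step 120 here).
def trained(step):
    return 6.0 if step >= 120 else -6.0


# Toy model whose gate never becomes confident, as with an under-fitted model.
def undertrained(step):
    return -6.0


print(decode_until_stop(trained))       # (120, False)
print(decode_until_stop(undertrained))  # (10000, True)
```

Under this reading, raising max_decoder_steps only delays the warning; the gate output has to be learned, which is why more training is the first thing to try.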