Commit 43b2a870 by libei

revise transformer_dla hyperparameter set: relu_dropout=0.1

parent fb957510
@@ -442,7 +442,7 @@ def transformer_dla_base():
     hparams.decoder_layers = 6
     hparams.normalize_before = True
     hparams.attention_dropout = 0.1
-    hparams.residual_dropout = 0.1
+    hparams.relu_dropout = 0.1
     hparams.learning_rate = 0.4
     hparams.learning_rate_warmup_steps = 8000
     hparams.batch_size = 2048
...
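For context, the hunk above edits a tensor2tensor-style hparams definition. Below is a minimal, self-contained sketch of how transformer_dla_base() reads after this commit; only the assignments visible in the diff hunk come from the commit, while the HParams container and the transformer_base() parent set are hypothetical stand-ins for the real module.

# Sketch of the revised hparams set, assuming a tensor2tensor-style setup.
# HParams and transformer_base() below are hypothetical stand-ins; only the
# field assignments shown in the diff are taken from this commit.

class HParams:
    """Simple attribute container standing in for the real HParams class."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def transformer_base():
    # Hypothetical parent hparams set; real defaults live elsewhere in the repo.
    return HParams()

def transformer_dla_base():
    hparams = transformer_base()
    hparams.decoder_layers = 6
    hparams.normalize_before = True
    hparams.attention_dropout = 0.1
    hparams.relu_dropout = 0.1  # was hparams.residual_dropout before this commit
    hparams.learning_rate = 0.4
    hparams.learning_rate_warmup_steps = 8000
    hparams.batch_size = 2048
    return hparams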