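\item When translation requests arrive with high concurrency, batch translation is also an effective way to use GPU devices. However, machine translation processes variable-length sequences, and the lengths of input sentences differ considerably. Moreover, because the length of the translation cannot be known in advance, the computational cost of sentences of different lengths becomes even harder to predict. In this situation, sentences of similar length can be processed in the same batch, which reduces the excessive padding and low device utilization caused by non-uniform sentence lengths. For example, sentences can be grouped by ranges of input length, as shown in Figure XXX. More fine-grained grouping methods can also be designed to maximize device utilization in batch translation ({\color{red} Reference: TurboTransformers: An Efficient GPU Serving System For Transformer Models}).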
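The length-range grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheduling method of TurboTransformers or of any particular system: the function names and the parameters \texttt{bucket\_width} and \texttt{max\_batch\_size} are hypothetical, sentences are assumed to be already tokenized, and only source-side length is considered because the translation length is unknown in advance.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def bucket_by_length(sentences: Sequence[Sequence[str]],
                     bucket_width: int = 10) -> Dict[int, List[int]]:
    """Group sentence indices by input-length range.

    bucket_width is a hypothetical parameter: bucket 0 holds lengths
    1..bucket_width, bucket 1 holds the next range, and so on.
    """
    buckets: Dict[int, List[int]] = defaultdict(list)
    for idx, tokens in enumerate(sentences):
        key = (max(len(tokens), 1) - 1) // bucket_width
        buckets[key].append(idx)
    return dict(buckets)


def make_batches(sentences: Sequence[Sequence[str]],
                 bucket_width: int = 10,
                 max_batch_size: int = 32) -> List[List[int]]:
    """Build batches so that each batch only mixes sentences from one bucket."""
    batches: List[List[int]] = []
    buckets = bucket_by_length(sentences, bucket_width)
    for key in sorted(buckets):
        ids = buckets[key]
        # Split a large bucket into chunks of at most max_batch_size (assumed limit).
        for start in range(0, len(ids), max_batch_size):
            batches.append(ids[start:start + max_batch_size])
    return batches


def padding_ratio(sentences: Sequence[Sequence[str]],
                  batches: Sequence[Sequence[int]]) -> float:
    """Fraction of positions that are padding when every batch is padded
    to the length of its longest sentence."""
    padded = 0
    real = 0
    for batch in batches:
        longest = max(len(sentences[i]) for i in batch)
        padded += longest * len(batch)
        real += sum(len(sentences[i]) for i in batch)
    return 1.0 - real / padded if padded else 0.0


if __name__ == "__main__":
    # Toy input: short and long sentences arriving interleaved.
    lengths = [3, 25, 4, 27, 5, 26]
    sents = [["tok"] * n for n in lengths]

    arrival_order = [[0, 1, 2], [3, 4, 5]]  # batches in arrival order
    bucketed = make_batches(sents, bucket_width=10, max_batch_size=4)

    print("arrival-order padding ratio:", padding_ratio(sents, arrival_order))
    print("bucketed padding ratio:", padding_ratio(sents, bucketed))
```

On this toy input, grouping by length range leaves far fewer padded positions than batching in arrival order. A real serving system would additionally have to bound how long a request waits for its bucket to fill, which this sketch does not address.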
title = {Relational inductive biases, deep learning, and graph networks},
journal = {CoRR},
volume = {abs/1806.01261},
year = {2018}
author = {Peter Shaw and
Jakob Uszkoreit and
Ashish Vaswani},
title = {Self-Attention with Relative Position Representations},
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
pages = {464--468},
year = {2018},
author = {Zihang Dai and
Zhilin Yang and
Yiming Yang and
Jaime G. Carbonell and
Quoc V. Le and
Ruslan Salakhutdinov},
title = {Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context},
publisher = {Annual Meeting of the Association for Computational Linguistics},
pages = {2978--2988},
year = {2019}
title={Attention is All You Need},
author={Ashish {Vaswani} and Noam {Shazeer} and Niki {Parmar} and Jakob {Uszkoreit} and Llion {Jones} and Aidan N. {Gomez} and Lukasz {Kaiser} and Illia {Polosukhin}},
publisher={Conference on Neural Information Processing Systems},
author = {Junhui Li and
Deyi Xiong and
Zhaopeng Tu and
Muhua Zhu and
Min Zhang and
Guodong Zhou},
title = {Modeling Source Syntax for Neural Machine Translation},
pages = {688--697},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
author = {Akiko Eriguchi and
Kazuma Hashimoto and
Yoshimasa Tsuruoka},
title = {Tree-to-Sequence Attentional Neural Machine Translation},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
author = {Baosong Yang and
Derek F. Wong and
Tong Xiao and
Lidia S. Chao and
Jingbo Zhu},
title = {Towards Bidirectional Hierarchical Representations for Attention-based
Neural Machine Translation},
publisher = {Conference on Empirical Methods in Natural Language Processing},
pages = {1432--1441},
year = {2017}
author = {Huadong Chen and
Shujian Huang and
David Chiang and
Jiajun Chen},
title = {Improved Neural Machine Translation with a Syntax-Aware Encoder and
Decoder},
pages = {1936--1945},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
author = {Zhaopeng Tu and
Zhengdong Lu and
Yang Liu and
Xiaohua Liu and
Hang Li},
title = {Modeling Coverage for Neural Machine Translation},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
author = {Rico Sennrich and
Barry Haddow},
title = {Linguistic Input Features Improve Neural Machine Translation},
pages = {83--91},
publisher = {Conference on Machine Translation},
year = {2016}
author = {Xing Shi and
Inkit Padhi and
Kevin Knight},
title = {Does String-Based Neural {MT} Learn Source Syntax?},
pages = {1526--1534},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
author = {Emanuele Bugliarello and
Naoaki Okazaki},
title = {Enhancing Machine Translation with Dependency-Aware Self-Attention},
pages = {1618--1627},
publisher = {Annual Meeting of the Association for Computational Linguistics},