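\item OpenSeq2Seq. Developed by the NVIDIA team\upcite{DBLP:journals/corr/abs-1805-10387}, OpenSeq2Seq is a modular, TensorFlow-based architecture for sequence-to-sequence models. It allows new models to be assembled from available components, supports mixed-precision training that uses the Tensor Cores in NVIDIA Volta and Turing GPUs, and provides fast Horovod-based distributed training with multi-GPU and multi-node modes. URL: \url{https://nvidia.github.io/OpenSeq2Seq/html/index.html}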
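\parinterval Machine translation evaluation campaigns are organized in two main ways. The first kind is run by governments and national institutions and carries strong authority. Examples include the NIST evaluation organized by the U.S. National Institute of Standards and Technology; the NACSIS Test Collections for IR (NTCIR) PatentMT, hosted by Japan's National Center for Science Information Systems; the Workshop on Asian Translation (WAT), jointly organized by the Japan Science and Technology Agency (JST) and other organizations; and, in China, the China Conference on Machine Translation (CCMT), hosted by the Chinese Information Processing Society of China. The second kind is run by academic bodies and targets specific domains, such as the Conference on Machine Translation (WMT), which leans toward the news domain, and the International Workshop on Spoken Language Translation (IWSLT), which focuses on spoken language. These evaluations are briefly introduced below.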
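\item WMT is hosted by the Special Interest Group for Machine Translation (SIGMT) and has been held annually since 2006. It is a comprehensive conference covering many machine translation tasks, including multi-domain translation evaluation tasks, translation quality evaluation tasks, and other tasks related to machine translation (such as document alignment). WMT has become the flagship evaluation conference in the field, and many studies use WMT results as their benchmark. The evaluation covers a wide range of languages, more than ten in total, including English, German, Finnish, Czech, and Romanian. Translation directions generally center on English, measuring translation performance between English and other languages, and the domains include news, information technology, and biomedicine. Recently, popular topics such as unsupervised machine translation have also been added. Like CCMT, WMT combines human evaluation with automatic evaluation, and the automatic metrics are typically BLEU and TER. In addition, WMT releases all of its evaluation data, so the data is widely used by machine translation practitioners. More information on the WMT evaluations is available on the SIGMT website: \url{http://www.sigmt.org/}.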
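\parinterval Most of the evaluation data above can be downloaded from the evaluation websites. Some of it can also be requested from the LDC (Linguistic Data Consortium) at \url{https://www.ldc.upenn.edu/}. The ELRA (European Language Resources Association) also offers some free corpora for research use; its website is \url{http://www.elra.info/}. From the perspective of how machine translation has developed, these evaluation tasks provide benchmark datasets for related research, so that different systems can be compared and analyzed under the same conditions, which in turn establishes the experimental foundation that machine translation research needs. Public evaluations also let researchers learn about the latest results in machine translation research right away. For example, several ACL best papers were inspired by systems that took part in that year's machine translation evaluation tasks.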
@@ -6045,120 +6022,141 @@ author = {Yoshua Bengio and
@inproceedings{ElMaghraby2018EnhancingTF,
title={Enhancing Translation from English to Arabic Using Two-Phase Decoder Translation},
author={Ayah ElMaghraby and Ahmed Rafea},
booktitle={IntelliSys},
year={2018}
pages = {539--549},
publisher = {Intelligent Systems and Applications},
year = {2018}
}
@inproceedings{Geng2018AdaptiveMD,
title={Adaptive Multi-pass Decoder for Neural Machine Translation},
author={X. Geng and X. Feng and B. Qin and T. Liu},
booktitle={EMNLP},
author={Xinwei Geng and
Xiaocheng Feng and
Bing Qin and
Ting Liu},
publisher ={Conference on Empirical Methods in Natural Language Processing},
pages={523--532},
year={2018}
}
@article{Lee2018DeterministicNN,
title={Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement},
author={Jason Lee and Elman Mansimov and Kyunghyun Cho},
journal={ArXiv},
year={2018},
volume={abs/1802.06901}
pages = {1173--1182},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
}
@inproceedings{Gu2019LevenshteinT,
title={Levenshtein Transformer},
author={Jiatao Gu and Changhan Wang and Jake Zhao},
booktitle={NeurIPS},
year={2019}
publisher = {Conference and Workshop on Neural Information Processing Systems},
pages = {11179--11189},
year = {2019},
}
@inproceedings{Guo2020JointlyMS,
title={Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation},
author={Junliang Guo and Linli Xu and E. Chen},
booktitle={ACL},
year={2020}
author={Junliang Guo and Linli Xu and Enhong Chen},
pages = {376--385},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@article{Stahlberg2018AnOS,
title={An Operation Sequence Model for Explainable Neural Machine Translation},
author={Felix Stahlberg and Danielle Saunders and B. Byrne},
journal={ArXiv},
year={2018},
volume={abs/1808.09688}
author={Felix Stahlberg and Danielle Saunders and Bill Byrne},
pages = {175--186},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
}
@inproceedings{Stern2019InsertionTF,
title={Insertion Transformer: Flexible Sequence Generation via Insertion Operations},
author={Mitchell Stern and William Chan and J. Kiros and Jakob Uszkoreit},
booktitle={ICML},
author={Mitchell Stern and William Chan and Jamie Kiros and Jakob Uszkoreit},
publisher={International Conference on Machine Learning},
pages={5976--5985},
year={2019}
}
@article{stling2017NeuralMT,
title={Neural machine translation for low-resource languages},
author={Robert {\"O}stling and J. Tiedemann},
journal={ArXiv},
author={Robert {\"O}stling and J{\"{o}}rg Tiedemann},
journal={CoRR},
year={2017},
volume={abs/1708.05729}
}
@article{Kikuchi2016ControllingOL,
title={Controlling Output Length in Neural Encoder-Decoders},
author={Yuta Kikuchi and Graham Neubig and Ryohei Sasano and H. Takamura and M. Okumura},
journal={ArXiv},
year={2016},
volume={abs/1609.09552}
author={Yuta Kikuchi and
Graham Neubig and
Ryohei Sasano and
Hiroya Takamura and
Manabu Okumura},
pages = {1328--1338},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
}
@inproceedings{Takase2019PositionalET,
title={Positional Encoding to Control Output Sequence Length},
author={S. Takase and N. Okazaki},
booktitle={NAACL-HLT},
author={Sho Takase and
Naoaki Okazaki},
publisher={Annual Conference of the North American Chapter of the Association for Computational Linguistics},
pages={3999--4004},
year={2019}
}
@inproceedings{Murray2018CorrectingLB,
title={Correcting Length Bias in Neural Machine Translation},
author={Kenton Murray and David Chiang},
booktitle={WMT},
year={2018}
pages = {212--223},
publisher = {Conference on Machine Translation},
year = {2018}
}
@article{Sountsov2016LengthBI,
title={Length bias in Encoder Decoder Models and a Case for Global Conditioning},
author={Pavel Sountsov and Sunita Sarawagi},
journal={ArXiv},
year={2016},
volume={abs/1606.03402}
pages = {1516--1525},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
}
@inproceedings{Jean2015MontrealNM,
title={Montreal Neural Machine Translation Systems for WMT'15},
author={S. Jean and Orhan Firat and Kyunghyun Cho and R. Memisevic and Yoshua Bengio},
booktitle={WMT@EMNLP},
author={S{\'{e}}bastien Jean and
Orhan Firat and
Kyunghyun Cho and
Roland Memisevic and
Yoshua Bengio},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={134--140},
year={2015}
}
@inproceedings{Yang2018OtemUtemOA,
title={Otem{\&}Utem: Over- and Under-Translation Evaluation Metric for NMT},
author={J. Yang and Biao Zhang and Yue Qin and Xiangwen Zhang and Q. Lin and Jinsong Su},
booktitle={NLPCC},
author={Jing Yang and
Biao Zhang and
Yue Qin and
Xiangwen Zhang and
Qian Lin and
Jinsong Su},
publisher={CCF International Conference on Natural Language Processing and Chinese Computing},
pages={291--302},
year={2018}
}
@inproceedings{Mi2016CoverageEM,
title={Coverage Embedding Models for Neural Machine Translation},
author={Haitao Mi and B. Sankaran and Z. Wang and Abe Ittycheriah},
booktitle={EMNLP},
year={2016}
}
@article{Kazimi2017CoverageFC,
title={Coverage for Character Based Neural Machine Translation},
author={M. Kazimi and Marta R. Costa-juss{\`a}},
journal={Proces. del Leng. Natural},
year={2017},
volume={59},
pages={99-106}
author={Haitao Mi and
Baskaran Sankaran and
Zhiguo Wang and
Abe Ittycheriah},
pages = {955--960},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
}
@inproceedings{DBLP:conf/emnlp/HuangZM17,
...
...
@@ -6175,7 +6173,8 @@ author = {Yoshua Bengio and
@inproceedings{Wiseman2016SequencetoSequenceLA,
title={Sequence-to-Sequence Learning as Beam-Search Optimization},
author={Sam Wiseman and Alexander M. Rush},
booktitle={EMNLP},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={1296--1306},
year={2016}
}
...
...
@@ -6192,10 +6191,12 @@ author = {Yoshua Bengio and
@article{Ma2019LearningTS,
title={Learning to Stop in Structured Prediction for Neural Machine Translation},
author={M. Ma and Renjie Zheng and Liang Huang},
journal={ArXiv},
year={2019},
volume={abs/1904.01032}
author={Mingbo Ma and
Renjie Zheng and
Liang Huang},
pages = {1884--1889},
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{KleinOpenNMT,
...
...
@@ -6219,119 +6220,153 @@ author = {Yoshua Bengio and
year = {2015}
}
@inproceedings{Eisner2011LearningST,
title={Learning Speed-Accuracy Tradeoffs in Nondeterministic Inference Algorithms},
author={J. Eisner and Hal Daum{\'e}},
year={2011}
}
@inproceedings{Jiang2012LearnedPF,
title={Learned Prioritization for Trading Off Accuracy and Speed},
author={J. Jiang and Adam R. Teichert and Hal Daum{\'e} and J. Eisner},
booktitle={NIPS},
year={2012}
author={Jiarong Jiang and Adam R. Teichert and Hal Daum{\'e} and Jason Eisner},
publisher={Conference and Workshop on Neural Information Processing Systems},
pages={1340--1348},
year= {2012}
}
@inproceedings{Zheng2020OpportunisticDW,
title={Opportunistic Decoding with Timely Correction for Simultaneous Translation},
author={Renjie Zheng and M. Ma and Baigong Zheng and Kaibo Liu and Liang Huang},
booktitle={ACL},
author={Renjie Zheng and
Mingbo Ma and
Baigong Zheng and
Kaibo Liu and
Liang Huang},
publisher={Annual Meeting of the Association for Computational Linguistics},
pages={437--442},
year={2020}
}
@inproceedings{Ma2019STACLST,
title={STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework},
author={M. Ma and L. Huang and Hao Xiong and Renjie Zheng and Kaibo Liu and Baigong Zheng and Chuanqiang Zhang and Zhongjun He and Hairong Liu and X. Li and H. Wu and Haifeng Wang},
booktitle={ACL},
author={Mingbo Ma and
Liang Huang and
Hao Xiong and
Renjie Zheng and
Kaibo Liu and
Baigong Zheng and
Chuanqiang Zhang and
Zhongjun He and
Hairong Liu and
Xing Li and
Hua Wu and
Haifeng Wang},
publisher={Annual Meeting of the Association for Computational Linguistics},
pages={3025--3036},
year={2019}
}
@inproceedings{Gimpel2013ASE,
title={A Systematic Exploration of Diversity in Machine Translation},
author={Kevin Gimpel and Dhruv Batra and Chris Dyer and Gregory Shakhnarovich},
booktitle={EMNLP},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={1100--1111},
year={2013}
}
@article{Li2016MutualIA,
title={Mutual Information and Diverse Decoding Improve Neural Machine Translation},
author={J. Li and Dan Jurafsky},
journal={ArXiv},
author={Jiwei Li and Dan Jurafsky},
journal={CoRR},
year={2016},
volume={abs/1601.00372}
}
@inproceedings{Li2016ADO,
title={A Diversity-Promoting Objective Function for Neural Conversation Models},
author={J. Li and Michel Galley and Chris Brockett and Jianfeng Gao and W. Dolan},
booktitle={HLT-NAACL},
author={Jiwei Li and
Michel Galley and
Chris Brockett and
Jianfeng Gao and
Bill Dolan},
publisher={Annual Conference of the North American Chapter of the Association for Computational Linguistics},
pages={110--119},
year={2016}
}
@inproceedings{He2018SequenceTS,
title={Sequence to Sequence Mixture Model for Diverse Machine Translation},
author={Xuanli He and Gholamreza Haffari and Mohammad Norouzi},
booktitle={CoNLL},
year={2018}
pages = {583--592},
publisher = {Conference on Computational Natural Language Learning},
year = {2018}
}
@article{Shen2019MixtureMF,
title={Mixture Models for Diverse Machine Translation: Tricks of the Trade},
author={Tianxiao Shen and Myle Ott and M. Auli and Marc'Aurelio Ranzato},
journal={ArXiv},
year={2019},
volume={abs/1902.07816}
author={Tianxiao Shen and Myle Ott and Michael Auli and Marc'Aurelio Ranzato},
pages = {5719--5728},
publisher = {International Conference on Machine Learning},
year = {2019},
}
@article{Wu2020GeneratingDT,
title={Generating Diverse Translation from Model Distribution with Dropout},
author={Xuanfu Wu and Yang Feng and Chenze Shao},
journal={ArXiv},
year={2020},
volume={abs/2010.08178}
pages={1088--1097},
publisher={Annual Meeting of the Association for Computational Linguistics},
year={2020}
}
@inproceedings{Sun2020GeneratingDT,
title={Generating Diverse Translation by Manipulating Multi-Head Attention},
author={Zewei Sun and Shujian Huang and Hao-Ran Wei and Xin-Yu Dai and Jiajun Chen},
booktitle={AAAI},
author={Zewei Sun and Shujian Huang and Hao Ran Wei and Xin Yu Dai and Jiajun Chen},
publisher={AAAI Conference on Artificial Intelligence},
pages={8976--8983},
year={2020}
}
@article{Vijayakumar2016DiverseBS,
title={Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models},
author={Ashwin K. Vijayakumar and Michael Cogswell and R. R. Selvaraju and Q. Sun and Stefan Lee and David J. Crandall and Dhruv Batra},
journal={ArXiv},
author={Ashwin K. Vijayakumar and
Michael Cogswell and
Ramprasaath R. Selvaraju and
Qing Sun and
Stefan Lee and
David J. Crandall and
Dhruv Batra},
journal={CoRR},
year={2016},
volume={abs/1610.02424}
}
@inproceedings{Liu2014SearchAwareTF,
title={Search-Aware Tuning for Machine Translation},
author={L. Liu and Liang Huang},
booktitle={EMNLP},
author={Lemao Liu and
Liang Huang},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={1942--1952},
year={2014}
}
@inproceedings{Yu2013MaxViolationPA,
title={Max-Violation Perceptron and Forced Decoding for Scalable MT Training},
author={Heng Yu and Liang Huang and Haitao Mi and Kai Zhao},
booktitle={EMNLP},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={1112--1123},
year={2013}
}
@inproceedings{Stahlberg2019OnNS,
title={On NMT Search Errors and Model Errors: Cat Got Your Tongue?},
author={Felix Stahlberg and
B. Byrne},
booktitle={EMNLP/IJCNLP},
Bill Byrne},
publisher={Conference on Empirical Methods in Natural Language Processing},
pages={3354--3360},
year={2019}
}
@inproceedings{Niehues2017AnalyzingNM,
title={Analyzing Neural MT Search and Model Performance},
author={J. Niehues and Eunah Cho and Thanh-Le Ha and Alexander H. Waibel},
booktitle={NMT@ACL},
author={Jan Niehues and
Eunah Cho and
Thanh-Le Ha and
Alex Waibel},
pages={11--17},
publisher={Annual Meeting of the Association for Computational Linguistics},
year={2017}
}
...
...
@@ -6346,26 +6381,31 @@ author = {Yoshua Bengio and
@article{Ranzato2016SequenceLT,
title={Sequence Level Training with Recurrent Neural Networks},
author={Marc'Aurelio Ranzato and S. Chopra and M. Auli and W. Zaremba},
journal={CoRR},
year={2016},
volume={abs/1511.06732}
author={Marc'Aurelio Ranzato and
Sumit Chopra and
Michael Auli and
Wojciech Zaremba},
publisher={International Conference on Learning Representations},
year={2016}
}
@article{Bengio2015ScheduledSF,
title={Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks},
author={S. Bengio and Oriol Vinyals and Navdeep Jaitly and Noam Shazeer},
journal={ArXiv},
year={2015},
volume={abs/1506.03099}
author={Samy Bengio and
Oriol Vinyals and
Navdeep Jaitly and
Noam Shazeer},
booktitle = {Conference and Workshop on Neural Information Processing Systems},
pages = {1171--1179},
year = {2015}
}
@article{Zhang2019BridgingTG,
title={Bridging the Gap between Training and Inference for Neural Machine Translation},
author={Wen Zhang and Y. Feng and Fandong Meng and Di You and Qun Liu},
journal={ArXiv},
year={2019},
volume={abs/1906.02448}
author={Wen Zhang and Yang Feng and Fandong Meng and Di You and Qun Liu},
pages = {4334--4343},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/acl/ShenCHHWSL16,
...
...
@@ -6381,15 +6421,6 @@ author = {Yoshua Bengio and
year = {2016},
}
@article{Gage1994ANA,
title={A new algorithm for data compression},
author={P. Gage},
journal={The C Users Journal archive},
year={1994},
volume={12},
pages={23-38}
}
@inproceedings{DBLP:conf/acl/SennrichHB16a,
author = {Rico Sennrich and
Barry Haddow and
...
...
@@ -6433,26 +6464,31 @@ author = {Yoshua Bengio and
@article{Narang2017BlockSparseRN,
title={Block-Sparse Recurrent Neural Networks},
author={Sharan Narang and Eric Undersander and G. Diamos},
journal={ArXiv},
author={Sharan Narang and Eric Undersander and Gregory Diamos},
journal={CoRR},
year={2017},
volume={abs/1711.02782}
}
@article{Gale2019TheSO,
title={The State of Sparsity in Deep Neural Networks},
author={T. Gale and E. Elsen and Sara Hooker},
journal={ArXiv},
author={Trevor Gale and
Erich Elsen and
Sara Hooker},
journal={CoRR},
year={2019},
volume={abs/1902.09574}
}
@article{Michel2019AreSH,
title={Are Sixteen Heads Really Better than One?},
author={Paul Michel and Omer Levy and Graham Neubig},
journal={ArXiv},
year={2019},
volume={abs/1905.10650}
author = {Paul Michel and
Omer Levy and
Graham Neubig},
title = {Are Sixteen Heads Really Better than One?},
publisher = {Conference and Workshop on Neural Information Processing Systems},
pages = {14014--14024},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-1905-09418,
...
...
@@ -6480,17 +6516,11 @@ author = {Yoshua Bengio and
@article{Katharopoulos2020TransformersAR,
title={Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention},
author={Angelos Katharopoulos and Apoorv Vyas and Nikolaos Pappas and Fran{\c{c}}ois Fleuret},
journal={ArXiv},
journal={CoRR},
year={2020},
volume={abs/2006.16236}
}
@inproceedings{Beal2003VariationalAF,
title={Variational algorithms for approximate Bayesian inference},
author={M. Beal},
year={2003}
}
@article{xiao2011language,
title ={Language Modeling for Syntax-Based Machine Translation Using Tree Substitution Grammars: A Case Study on Chinese-English Translation},
author ={Xiao, Tong and Zhu, Jingbo and Zhu, Muhua},
...
...
@@ -6503,33 +6533,40 @@ author = {Yoshua Bengio and
@inproceedings{Li2009VariationalDF,
title={Variational Decoding for Statistical Machine Translation},
author={Zhifei Li and J. Eisner and S. Khudanpur},
booktitle={ACL/IJCNLP},
author={Zhifei Li and
Jason Eisner and
Sanjeev Khudanpur},
publisher={Annual Meeting of the Association for Computational Linguistics},
pages={593--601},
year={2009}
}
@article{Bastings2019ModelingLS,
title={Modeling Latent Sentence Structure in Neural Machine Translation},
author={Jasmijn Bastings and W. Aziz and Ivan Titov and K. Sima'an},
journal={ArXiv},
year={2019},
volume={abs/1901.06436}
author={Jasmijn Bastings and
Wilker Aziz and
Ivan Titov and
Khalil Sima'an},
journal = {CoRR},
volume = {abs/1901.06436},
year = {2019}
}
@article{Shah2018GenerativeNM,
title={Generative Neural Machine Translation},
author={Harshil Shah and D. Barber},
journal={ArXiv},
year={2018},
volume={abs/1806.05138}
author={Harshil Shah and
David Barber},
publisher={Conference and Workshop on Neural Information Processing Systems},
pages={4565--4573},
year={2018}
}
@article{Duan2017OneShotIL,
title={One-Shot Imitation Learning},
author={Yan Duan and Marcin Andrychowicz and Bradly C. Stadie and Jonathan Ho and J. Schneider and Ilya Sutskever and P. Abbeel and W. Zaremba},
journal={ArXiv},
author={Yan Duan and Marcin Andrychowicz and Bradly C. Stadie and Jonathan Ho and Jonas Schneider and Ilya Sutskever and Pieter Abbeel and Wojciech Zaremba},