Commit da1be3c3 by 孟霞

Merge branch 'master' into 'mengxia'

Master

See merge request !357
parents d8b052bd b49d3bf3
@@ -58,7 +58,7 @@
 \parinterval The self-attention mechanism can also be viewed as a sequence representation model. For example, for each target position $j$, a corresponding representation of the source sentence is generated, taking the form:
 \begin{eqnarray}
-\mathbi{C}}_j = \sum_i \alpha_{i,j}\vectorn{\emph{h}}_i
+\mathbi{C}_j & = & \sum_i \alpha_{i,j}\vectorn{\emph{h}}_i
 \label{eq:12-4201}
 \end{eqnarray}
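The corrected line is worth a quick sanity check: the equation is just an attention-weighted sum of the encoder states $\vectorn{\emph{h}}_i$. A minimal NumPy sketch of the computation (array names `h` and `alpha` are illustrative, not taken from the book's code):

```python
import numpy as np

# h[i]       : encoder state for source position i         -> shape (src_len, d)
# alpha[i, j]: attention weight of source i for target j   -> shape (src_len, tgt_len)
# Each column of alpha is a softmax over source positions, so it sums to 1.
def context_vectors(h: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    # C_j = sum_i alpha[i, j] * h[i]; stacking every j gives a (tgt_len, d) matrix
    return alpha.T @ h

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))               # 5 source positions, dimension 8
scores = rng.normal(size=(5, 3))          # raw attention scores for 3 target positions
alpha = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)  # softmax over i
C = context_vectors(h, alpha)             # C[j] is the source representation for target position j
```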
@@ -561,7 +561,7 @@ Transformer Deep (48 layers) & 30.2 & 43.1 & 194$\times 10^
 \section{Inference}
-\parinterval The Transformer decoder generates the target word sequence in much the same way as other neural machine translation systems: generation proceeds from left to right, and the prediction of the next word depends on the words already generated. The concrete inference process is shown in Figure~\ref{fig:12-56}, where $\mathbi{C}}_i$ is the result of encoder-decoder attention. The decoder first generates the word ``how'' from ``<eos>'' and $\mathbi{C}}_1$, then generates the second word ``are'' from ``how'' and $\mathbi{C}}_2$, and so on; inference ends when the decoder generates ``<eos>''.
+\parinterval The Transformer decoder generates the target word sequence in much the same way as other neural machine translation systems: generation proceeds from left to right, and the prediction of the next word depends on the words already generated. The concrete inference process is shown in Figure~\ref{fig:12-56}, where $\mathbi{C}_i$ is the result of encoder-decoder attention. The decoder first generates the word ``how'' from ``<eos>'' and $\mathbi{C}_1$, then generates the second word ``are'' from ``how'' and $\mathbi{C}_2$, and so on; inference ends when the decoder generates ``<eos>''.
 \parinterval However, the Transformer cannot parallelize over all positions at inference time, because every target-language word requires an attention operation over all preceding words, which makes inference very slow. Available speedups include low-precision computation\upcite{DBLP:journals/corr/CourbariauxB16}, a cache (storing variables that would otherwise be recomputed)\upcite{DBLP:journals/corr/abs-1805-00631}, and shared attention networks\upcite{Xiao2019SharingAW}. Inference acceleration for the Transformer model is discussed in more depth in {\chapterfourteen}.
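For reviewers skimming the inference discussion above, here is a schematic of the left-to-right loop it describes. This is a hedged sketch rather than the book's code: `model.decode_step` is a hypothetical one-step decoder interface, and `cache` stands in for the caching trick cited above (keeping already-computed per-position values so each step only processes the newest word):

```python
def greedy_decode(model, src_states, start_id, eos_id, max_len=100):
    """Left-to-right greedy inference: step t depends on all previously
    generated words, so the loop itself cannot be parallelized."""
    ys = [start_id]          # the book's example starts from the symbol "<eos>"
    cache = None             # hypothetical store of already-computed keys/values
    for _ in range(max_len):
        # One decoder step: self-attention over the generated prefix, then
        # encoder-decoder attention over src_states yields the context C_t.
        logits, cache = model.decode_step(ys[-1], src_states, cache)
        next_id = int(logits.argmax())
        ys.append(next_id)
        if next_id == eos_id:
            break
    return ys
```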
......
\begin{tikzpicture}
\begin{scope}
\node [anchor=center] (node1) at (0,0) {\textbf{Machine translation}, sometimes referred to by the abbreviation \textbf{MT} (not to be };
\node [anchor=north] (node2) at (node1.south) {confused with computer-aided translation, machine-aided human translation or inter};
\node [anchor=north] (node3) at (node2.south) {-active translation), is a subfield of computational linguistics that investigates the};
\node [anchor=north] (node4) at ([xshift=-1.8em]node3.south) {use of software to translate text or speech from one language to another.};
\node [anchor=south] (node5) at ([xshift=-12.8em,yshift=0.5em]node1.north) {\Large{WIKIPEDIA}};
\draw [-,line width=1pt]([xshift=-16.1em]node1.north) -- ([xshift=16.1em]node1.north);
\draw [-,line width=1pt]([xshift=-16.1em,yshift=-9.4em]node1.north) -- ([xshift=16.1em,yshift=-9.4em]node1.north);
\node [anchor=north] (node6) at ([xshift=-11.8em,yshift=-0.8em]node4.south) {\Large{维基百科}};
\node [anchor=north] (node7) at ([yshift=-4.6em]node3.south) {{\small\sffamily\bfnew{机器翻译}}(英语:Machine Translation,经常简写为MT,简称机译或机翻)};
\node [anchor=north] (node8) at ([xshift=-0.1em]node7.south) {属于计算语言学的范畴,其研究借由计算机程序将文字或演说从一种自然};
\node [anchor=north] (node9) at ([xshift=-9.85em]node8.south) {语言翻译成另一种自然语言。};
\begin{pgfonlayer}{background}
{
\node[rectangle,draw=black,inner sep=0.2em,fill=white,drop shadow] [fit =(node1)(node2)(node3)(node4)(node5)(node6)(node7)(node8)(node9)] (remark2) {};
}
\end{pgfonlayer}
\end{scope}
\end{tikzpicture}
\ No newline at end of file
@@ -8,7 +8,7 @@
 \node [rectangle,inner sep=0.4em,draw,fill=blue!20!white] [fit = (e) (c)] (box) {};
 \end{pgfonlayer}
-\draw [->,thick] ([yshift=-1em]box.south)--([yshift=-0.1em]box.south) node [pos=0,below] (bottom1) {\small{one-hot representation of word $w$}};
+\draw [->,thick] ([yshift=-1em]box.south)--([yshift=-0.1em]box.south) node [pos=0,below] (bottom1) {\small{One-hot representation of word $w$}};
 \draw [->,thick] ([yshift=0.1em]box.north)--([yshift=1em]box.north) node [pos=1,above] (top1) {\scriptsize{$\mathbi{e}$=(8,.2,-1,.9,...,1)}};
 \node [anchor=north] (bottom2) at ([yshift=0.3em]bottom1.south) {\scriptsize{$\mathbi{o}$=(0,0,1,0,...,0)}};
 \node [anchor=south] (top2) at ([yshift=-0.3em]top1.north) {\small{Distributed representation of word $w$}};
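Incidentally, what this figure depicts is easy to verify in code: multiplying the one-hot vector $\mathbi{o}$ by an embedding matrix selects a single row, which is exactly the distributed representation $\mathbi{e}$. A toy sketch (the matrix `E` below is arbitrary, not the values shown in the figure):

```python
import numpy as np

vocab_size, dim = 6, 4
E = np.arange(vocab_size * dim, dtype=float).reshape(vocab_size, dim)  # toy embedding matrix

w = 2                                    # vocabulary index of word w
o = np.zeros(vocab_size); o[w] = 1.0     # one-hot representation of w
e = o @ E                                # distributed representation of w
assert np.allclose(e, E[w])              # the product is just a row lookup
```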
......
@@ -53,7 +53,7 @@
 \node [secnode,anchor=south west,fill=green!30,minimum width=9em,minimum height=4.5em,align=center] (sec15) at ([yshift=0.8em]sec13.north west) {Chapter 15\\ Neural Machine Translation \\ Architecture Optimization};
 \node [secnode,anchor=south west,fill=green!30,minimum width=9em,minimum height=4.5em,align=center] (sec16) at ([xshift=0.8em]sec15.south east) {Chapter 16\\ Low-Resource \\ Machine Translation};
 \node [secnode,anchor=south west,fill=green!30,minimum width=9em,minimum height=4.5em,align=center] (sec17) at ([xshift=0.8em]sec16.south east) {Chapter 17\\ Multimodal and Multi-level \\ Machine Translation};
-\node [secnode,anchor=south west,fill=amber!25,minimum width=28.7em,align=center] (sec18) at ([yshift=0.8em]sec15.north west) {Chapter 18\hspace{1em} Industrial Practice of Machine Translation};
+\node [secnode,anchor=south west,fill=amber!25,minimum width=28.7em,align=center] (sec18) at ([yshift=0.8em]sec15.north west) {Chapter 18\hspace{1em} Machine Translation Application Technologies};
 \node [rectangle,draw,dotted,thick,inner sep=0.1em,fill opacity=1] [fit = (sec13) (sec14)] (nmtbasebox) {};
 \draw [->,very thick] ([yshift=-0.7em]sec15.south) -- ([yshift=-0.1em]sec15.south);
 \draw [->,very thick] ([yshift=-0.7em]sec16.south) -- ([yshift=-0.1em]sec16.south);
......
@@ -93,7 +93,7 @@
 \item Chapter 15\ Neural Machine Translation Architecture Optimization
 \item Chapter 16\ Low-Resource Machine Translation
 \item Chapter 17\ Multimodal and Multi-level Machine Translation
-\item Chapter 18\ Industrial Practice of Machine Translation
+\item Chapter 18\ Machine Translation Application Technologies
 \end{itemize}
 \end{itemize}
@@ -105,7 +105,7 @@
 The third part of this book focuses on neural machine translation models, which have been the hot topic of machine translation in recent years. Chapter 9 introduces the basics of neural networks and deep learning to keep the book's knowledge system complete; it also introduces neural-network-based language models, whose modeling ideas are used extensively in neural machine translation. Chapters 10, 11, and 12 introduce three classic neural machine translation models in the order in which they were proposed, from the earliest recurrent-network-based models to the latest Transformer model, and along the way cover classic methods and techniques such as the encoder-decoder framework and the attention mechanism.
-The fourth part of the book further discusses frontier techniques in machine translation, with a focus on neural machine translation. Chapters 13, 14, and 15 cover the three main aspects of neural machine translation development, which are also among the most discussed directions in the field in recent years. Chapter 16 is another popular direction, where topics such as unsupervised translation are discussed. Chapter 17 introduces multimodal methods such as speech and image translation, as well as document-level translation; these can be viewed as extensions of machine translation to more tasks. Chapter 18 draws on the authors' experience in machine translation competitions and product development to discuss the concrete workflow of building a machine translation system and some common techniques, including tuning methods and pre- and post-processing, all of which are common issues in industrial applications of machine translation
+The fourth part of the book further discusses frontier techniques in machine translation, with a focus on neural machine translation. Chapters 13, 14, and 15 cover the three main aspects of neural machine translation development, which are also among the most discussed directions in the field in recent years. Chapter 16 is another popular direction, where topics such as unsupervised translation are discussed. Chapter 17 introduces multimodal methods such as speech and image translation, as well as document-level translation; these can be viewed as extensions of machine translation to more tasks. Chapter 18 draws on the authors' experience in machine translation competitions and product development to discuss application technologies for machine translation.
 %-------------------------------------------
 \begin{figure}[htp]
......
@@ -5508,7 +5508,147 @@ pages ={157-166},
 pages = {7057--7067},
 year = {2019}
 }
+@inproceedings{DBLP:conf/aclnmt/HoangKHC18,
+author = {Cong Duy Vu Hoang and
+Philipp Koehn and
+Gholamreza Haffari and
+Trevor Cohn},
+title = {Iterative Back-Translation for Neural Machine Translation},
+pages = {18--24},
+publisher = {Association for Computational Linguistics},
+year = {2018}
+}
+@inproceedings{DBLP:conf/icml/OttAGR18,
+author = {Myle Ott and
+Michael Auli and
+David Grangier and
+Marc'Aurelio Ranzato},
+title = {Analyzing Uncertainty in Neural Machine Translation},
+volume = {80},
+pages = {3953--3962},
+publisher = {{PMLR}},
+year = {2018}
+}
+@inproceedings{DBLP:conf/acl/FadaeeBM17a,
+author = {Marzieh Fadaee and
+Arianna Bisazza and
+Christof Monz},
+title = {Data Augmentation for Low-Resource Neural Machine Translation},
+pages = {567--573},
+publisher = {Association for Computational Linguistics},
+year = {2017}
+}
+@inproceedings{finding2006adafre,
+author = {S. F. Adafre and Maarten de Rijke},
+title = {Finding Similar Sentences across Multiple Languages in Wikipedia},
+publisher = {European Chapter of the Association for Computational Linguistics},
+year = {2006}
+}
+@inproceedings{method2008keiji,
+author = {Keiji Yasuda and Eiichiro Sumita},
+title = {Method for Building Sentence-Aligned Corpus from Wikipedia},
+publisher = {AAAI Conference on Artificial Intelligence},
+year = {2008}
+}
+@article{DBLP:journals/coling/MunteanuM05,
+author = {Dragos Stefan Munteanu and
+Daniel Marcu},
+title = {Improving Machine Translation Performance by Exploiting Non-Parallel
+Corpora},
+journal = {Computational Linguistics},
+volume = {31},
+number = {4},
+pages = {477--504},
+year = {2005}
+}
+@inproceedings{DBLP:conf/naacl/SmithQT10,
+author = {Jason R. Smith and
+Chris Quirk and
+Kristina Toutanova},
+title = {Extracting Parallel Sentences from Comparable Corpora using Document
+Level Alignment},
+pages = {403--411},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2010}
+}
+@inproceedings{DBLP:conf/emnlp/ZhangZ16,
+author = {Jiajun Zhang and
+Chengqing Zong},
+title = {Exploiting Source-side Monolingual Data in Neural Machine Translation},
+pages = {1535--1545},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2016}
+}
+@inproceedings{DBLP:conf/acl/XiaKAN19,
+author = {Mengzhou Xia and
+Xiang Kong and
+Antonios Anastasopoulos and
+Graham Neubig},
+title = {Generalized Data Augmentation for Low-Resource Translation},
+pages = {5786--5796},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2019}
+}
+@inproceedings{DBLP:conf/emnlp/WangPDN18,
+author = {Xinyi Wang and
+Hieu Pham and
+Zihang Dai and
+Graham Neubig},
+title = {SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine
+Translation},
+pages = {856--861},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2018}
+}
+@inproceedings{DBLP:conf/acl/GaoZWXQCZL19,
+author = {Fei Gao and
+Jinhua Zhu and
+Lijun Wu and
+Yingce Xia and
+Tao Qin and
+Xueqi Cheng and
+Wengang Zhou and
+Tie-Yan Liu},
+title = {Soft Contextual Data Augmentation for Neural Machine Translation},
+pages = {5539--5544},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2019}
+}
+@inproceedings{DBLP:conf/emnlp/WangLWLS19,
+author = {Shuo Wang and
+Yang Liu and
+Chao Wang and
+Huanbo Luan and
+Maosong Sun},
+title = {Improving Back-Translation with Uncertainty-based Confidence Estimation},
+pages = {791--802},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2019}
+}
+@inproceedings{DBLP:conf/emnlp/WuWXQLL19,
+author = {Lijun Wu and
+Yiren Wang and
+Yingce Xia and
+Tao Qin and
+Jianhuang Lai and
+Tie-Yan Liu},
+title = {Exploiting Monolingual Data at Scale for Neural Machine Translation},
+pages = {4205--4215},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2019}
+}
+@inproceedings{DBLP:conf/emnlp/LiLHZZ19,
+author = {Guanlin Li and
+Lemao Liu and
+Guoping Huang and
+Conghui Zhu and
+Tiejun Zhao},
+title = {Understanding Data Augmentation in Neural Machine Translation: Two
+Perspectives towards Generalization},
+pages = {5688--5694},
+publisher = {Annual Meeting of the Association for Computational Linguistics},
+year = {2019}
+}
 %%%%% chapter 16------------------------------------------------------
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......
@@ -139,10 +139,10 @@
 %\include{Chapter6/chapter6}
 %\include{Chapter7/chapter7}
 %\include{Chapter8/chapter8}
-%\include{Chapter9/chapter9}
+\include{Chapter9/chapter9}
 \include{Chapter10/chapter10}
-%\include{Chapter11/chapter11}
-%\include{Chapter12/chapter12}
+\include{Chapter11/chapter11}
+\include{Chapter12/chapter12}
 %\include{Chapter13/chapter13}
 %\include{Chapter14/chapter14}
 %\include{Chapter15/chapter15}
......