Commit 0163aa74 by xiaotong

minor updates

parent 1c155b41
...@@ -57,7 +57,7 @@ ...@@ -57,7 +57,7 @@
\begin{itemize} \begin{itemize}
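\item The translation of natural language is an extremely complex problem. Language is one of the highest achievements of human evolution; natural language is highly general, flexible, and rich, and these properties are hard to capture with a few simple models and algorithms, so modeling translation mathematically and implementing it as a computer program is very difficult. Although AI systems such as AlphaGo have achieved remarkable results in Go and similar domains in recent years, board games such as Go are still ``simple'' compared with translation. For example, the potential translations of a sentence are practically inexhaustible, different people understand even the same sentence differently, and translating a single sentence or word may require the context of the entire document; traditional board games have none of these problems. \item The translation of natural language is an extremely complex problem. Language is one of the highest achievements of human evolution; natural language is highly general, flexible, and rich, and these properties are hard to capture with a few simple models and algorithms, so modeling translation mathematically and implementing it as a computer program is very difficult. Although AI systems such as AlphaGo have achieved remarkable results in Go and similar domains in recent years, board games such as Go are still ``simple'' compared with translation. For example, the potential translations of a sentence are practically inexhaustible, different people understand even the same sentence differently, and translating a single sentence or word may require the context of the entire document; traditional board games have none of these problems.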
\vspace{0.5em} \vspace{0.5em}
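\item The computer's ``understanding'' and human ``understanding'' are hard to reconcile. People have long hoped to describe the knowledge they use when translating and to implement it as computer programs; early rule-based machine translation methods grew out of this idea. Practice has shown, however, that there is a gap between how humans and computers ``understand'' natural language. First, human language ability is formed over a long time under the combined stimulation of many external environmental factors, and this ability is hard to express directly and accurately. In other words, human linguistic knowledge is itself hard to describe, let alone hand over to a computer to understand. Second, humans and machine translation systems understand language for different purposes. Humans understand and use language in order to live and work, which is a very complex goal, whereas a machine translation system mostly optimizes some mathematically defined objective function. That is, a machine translation system is concerned with the single goal of translation rather than with carrying out complex activities as a person does. Finally, humans and computers operate in fundamentally different ways. The biological mechanism behind human language ability differs in essence from the computational models used by machine translation systems, which rely on ``knowledge'' that they themselves can process, such as statistical representations of words. This knowledge does not need to be understood by humans, and the computer, of course, does not have to understand how humans think. \item There is a gap between the computer's ``understanding'' and human ``understanding''. People have long hoped to describe the knowledge they use when translating and to implement it as computer programs; early rule-based machine translation methods grew out of this idea. Practice has shown, however, that there is a gap between how humans and computers ``understand'' natural language. First, human language ability is formed over a long time under the combined stimulation of many external environmental factors, and this ability is hard to express directly and accurately. In other words, human linguistic knowledge is itself hard to describe, let alone hand over to a computer to understand. Second, humans and machine translation systems understand language for different purposes. Humans understand and use language in order to live and work, which is a very complex goal, whereas a machine translation system mostly optimizes some mathematically defined objective function. That is, a machine translation system is concerned with the single goal of translation rather than with carrying out complex activities as a person does. Finally, humans and computers operate in fundamentally different ways. The biological mechanism behind human language ability differs in essence from the computational models used by machine translation systems, which rely on ``knowledge'' that they themselves can process, such as statistical representations of words. This knowledge does not need to be understood by humans, and the computer, of course, does not have to understand how humans think.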
\vspace{0.5em} \vspace{0.5em}
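\item No single method can solve the full variety of translation problems. First, the diversity of languages means that translation between any two languages is in effect a separate translation task. For example, there are at least several thousand languages in the world, and translating between every pair of them would give rise to more than a million translation directions. Although researchers have tried to use a single framework, or even a single translation system, for translation among all languages, such systems are still far from being practically usable. Second, different domains and application scenarios place different demands on translation. For example, translating literature differs from translating news, and interpreting differs from written translation; similar cases are too numerous to list. Machine translation has to serve these diverse needs, which further increases the difficulty of modeling translation computationally. Third, machine translation requires sufficient high-quality data, but the amount of data available differs markedly across languages, domains, and application scenarios, and many languages have almost no usable data at all, which makes developing a machine translation system for them very hard. Note that current machine translation cannot yet generalize from a small number of examples as humans do, so machine translation under data scarcity also poses a major challenge. \item No single method can solve the full variety of translation problems. First, the diversity of languages means that translation between any two languages is in effect a separate translation task. For example, there are at least several thousand languages in the world, and translating between every pair of them would give rise to more than a million translation directions. Although researchers have tried to use a single framework, or even a single translation system, for translation among all languages, such systems are still far from being practically usable. Second, different domains and application scenarios place different demands on translation. For example, translating literature differs from translating news, and interpreting differs from written translation; similar cases are too numerous to list. Machine translation has to serve these diverse needs, which further increases the difficulty of modeling translation computationally. Third, machine translation requires sufficient high-quality data, but the amount of data available differs markedly across languages, domains, and application scenarios, and many languages have almost no usable data at all, which makes developing a machine translation system for them very hard. Note that current machine translation cannot yet generalize from a small number of examples as humans do, so machine translation under data scarcity also poses a major challenge.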
\end{itemize} \end{itemize}
...@@ -590,6 +590,8 @@ His house is on the south bank of the river. ...@@ -590,6 +590,8 @@ His house is on the south bank of the river.
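\parinterval 《机器学习》 (Machine Learning)\cite{周志华2016机器学习}, written by Professor Zhou Zhihua of Nanjing University, is an introductory textbook on machine learning. It covers as many aspects of the fundamentals of machine learning as possible and tries to introduce machine learning methods and ideas with as little mathematics as possible. Many of the machine learning concepts and methods used in machine translation can be learned from this book. \parinterval 《机器学习》 (Machine Learning)\cite{周志华2016机器学习}, written by Professor Zhou Zhihua of Nanjing University, is an introductory textbook on machine learning. It covers as many aspects of the fundamentals of machine learning as possible and tries to introduce machine learning methods and ideas with as little mathematics as possible. Many of the machine learning concepts and methods used in machine translation can be learned from this book.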
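\parinterval 《神经网络与深度学习》 (Neural Networks and Deep Learning){\color{red} reference needed!}, written by Professor Qiu Xipeng of Fudan University, gives a comprehensive introduction to the basic concepts and common techniques of neural networks and deep learning, and also covers many cutting-edge deep learning methods. The book is suitable for beginners and also serves as a reference for professionals.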
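\parinterval The TensorFlow website provides a tutorial on neural machine translation that shows how to build a neural machine translation system from scratch with TensorFlow, starting from data processing, and how to decode; it is available at \url{https://www.tensorflow.org/tutorials/text/nmt\_with\_attention}. Google and Facebook also each provide advanced tutorials on sequence-to-sequence machine translation models. Google's version is implemented in TensorFlow and is available at \url{https://github.com/tensorflow/nmt}; Facebook's tutorial is mainly implemented in PyTorch and is available at \url{https://pytorch.org/tutorials/intermediate/seq2seq\_translation\_tutorial.html}. The websites also include some survey papers, which describe in detail the development of neural machine translation, its problem definition, and the problems it currently faces. \parinterval The TensorFlow website provides a tutorial on neural machine translation that shows how to build a neural machine translation system from scratch with TensorFlow, starting from data processing, and how to decode; it is available at \url{https://www.tensorflow.org/tutorials/text/nmt\_with\_attention}. Google and Facebook also each provide advanced tutorials on sequence-to-sequence machine translation models. Google's version is implemented in TensorFlow and is available at \url{https://github.com/tensorflow/nmt}; Facebook's tutorial is mainly implemented in PyTorch and is available at \url{https://pytorch.org/tutorials/intermediate/seq2seq\_translation\_tutorial.html}. The websites also include some survey papers, which describe in detail the development of neural machine translation, its problem definition, and the problems it currently faces.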
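\parinterval \url{http://www.statmt.org} is a website devoted to machine translation research. It contains introductory material on statistical machine translation, a list of natural language processing conferences and workshops, commonly used tools, and corpora. \url{http://www.mt-archive.info} and \url{https://www.aclweb.org/anthology}\\ host many papers on machine translation and natural language processing. Through these sites one can learn about important conferences in natural language processing; for example, international conferences related to machine translation include: \parinterval \url{http://www.statmt.org} is a website devoted to machine translation research. It contains introductory material on statistical machine translation, a list of natural language processing conferences and workshops, commonly used tools, and corpora. \url{http://www.mt-archive.info} and \url{https://www.aclweb.org/anthology}\\ host many papers on machine translation and natural language processing. Through these sites one can learn about important conferences in natural language processing; for example, international conferences related to machine translation include: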
...@@ -621,6 +623,7 @@ His house is on the south bank of the river. ...@@ -621,6 +623,7 @@ His house is on the south bank of the river.
\end{itemize} \end{itemize}
\vspace{0.5em} \vspace{0.5em}
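Besides conferences, journals such as \emph{Computational Linguistics}, \emph{Machine Translation}, and \emph{Transactions of the Association for Computational Linguistics} have also published many important papers related to machine translation.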
......
...@@ -189,7 +189,7 @@ ...@@ -189,7 +189,7 @@
\centering \centering
\input{./Chapter2/Figures/figure-schematic-chain-rule} \input{./Chapter2/Figures/figure-schematic-chain-rule}
\setlength{\belowcaptionskip}{-1cm} \setlength{\belowcaptionskip}{-1cm}
\caption{Relationship diagram of A, B, C, D, E} \caption{Relationship diagram among events A, B, C, D, E}
\label{fig:2.2-3} \label{fig:2.2-3}
\end{figure} \end{figure}
%------------------------------------------- %-------------------------------------------
...@@ -901,7 +901,7 @@ c_{\textrm{KN}}(\cdot) & = & \begin{cases} \textrm{count}(\cdot)\quad \textrm{fo ...@@ -901,7 +901,7 @@ c_{\textrm{KN}}(\cdot) & = & \begin{cases} \textrm{count}(\cdot)\quad \textrm{fo
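\parinterval As mentioned earlier, Kneser-Ney smoothing is a standard, widely adopted, state-of-the-art smoothing algorithm. Many other algorithms have been derived from it; interested readers can consult further material\cite{parsing2009speech}\cite{ney1994structuring}\cite{chen1999empirical}. \parinterval As mentioned earlier, Kneser-Ney smoothing is a standard, widely adopted, state-of-the-art smoothing algorithm. Many other algorithms have been derived from it; interested readers can consult further material\cite{parsing2009speech}\cite{ney1994structuring}\cite{chen1999empirical}.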
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Syntactic Parsing (Phrase Structure)}\index{Chapter2.5} \section{Syntactic Parsing (Phrase Structure Parsing)}\index{Chapter2.5}
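\parinterval In the previous two sections we have seen what a ``word'' is and how to model word segmentation statistically, and we have also seen how to describe the probability of a word sequence with a statistical language model. Both word segmentation and language models represent only shallow, word-string information about a sentence. The deeper structure of a natural language sentence can be described by syntactic information, which is also one of the knowledge sources commonly used in machine translation and other natural language processing tasks. This section introduces the relevant concepts. \parinterval In the previous two sections we have seen what a ``word'' is and how to model word segmentation statistically, and we have also seen how to describe the probability of a word sequence with a statistical language model. Both word segmentation and language models represent only shallow, word-string information about a sentence. The deeper structure of a natural language sentence can be described by syntactic information, which is also one of the knowledge sources commonly used in machine translation and other natural language processing tasks. This section introduces the relevant concepts.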
......
...@@ -207,9 +207,9 @@ ...@@ -207,9 +207,9 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{(3) How to Learn from Large Amounts of Bilingual Parallel Data?}\index{Chapter3.2.3.3} \subsubsection{(3) How to Learn from Large Amounts of Bilingual Parallel Data?}\index{Chapter3.2.3.3}
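\parinterval The method described above needs to work on more sentences. Suppose there are $N$ sentence pairs that are mutual translations, $(\mathbf{s}^1,\mathbf{t}^1)$,...,\\$(\mathbf{s}^N,\mathbf{t}^N)$. We can still estimate the probabilities by relative frequency, as follows: \parinterval The method described above needs to work on more sentences. Suppose there are $N$ sentence pairs that are mutual translations, $(\mathbf{s}^{[1]},\mathbf{t}^{[1]})$,...,\\$(\mathbf{s}^{[N]},\mathbf{t}^{[N]})$. We can still estimate the probabilities by relative frequency, as follows: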
\begin{eqnarray} \begin{eqnarray}
\textrm{P}(x,y) = \frac{{\sum_{i=1}^{N} c(x,y;\mathbf{s}^i,\mathbf{t}^i)}}{\sum_{i=1}^{N}{{\sum_{x',y'} c(x',y';\mathbf{s}^i,\mathbf{t}^i)}}} \textrm{P}(x,y) = \frac{{\sum_{i=1}^{N} c(x,y;\mathbf{s}^{[i]},\mathbf{t}^{[i]})}}{\sum_{i=1}^{N}{{\sum_{x',y'} c(x',y';\mathbf{s}^{[i]},\mathbf{t}^{[i]})}}}
\label{eqC3.5-new} \label{eqC3.5-new}
\end{eqnarray} \end{eqnarray}
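The relative-frequency estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the book: the function names and the toy corpus are hypothetical, and $c(x,y;\mathbf{s},\mathbf{t})$ is taken to count the (source position, target position) pairs whose words are $x$ and $y$, so that summing it over all $(x',y')$ gives $|\mathbf{s}| \times |\mathbf{t}|$.

```python
from fractions import Fraction


def cooccurrence_count(x, y, src, tgt):
    """c(x, y; s, t): number of (source position, target position) pairs
    whose words are x and y (assumed definition, see the lead-in)."""
    return src.count(x) * tgt.count(y)


def joint_probability(x, y, corpus):
    """Relative-frequency estimate of P(x, y) over a list of (src, tgt) pairs."""
    numerator = sum(cooccurrence_count(x, y, s, t) for s, t in corpus)
    # Summing c(x', y'; s, t) over all word pairs equals |s| * |t|.
    denominator = sum(len(s) * len(t) for s, t in corpus)
    return Fraction(numerator, denominator)


# Hypothetical toy corpus of two tokenized sentence pairs.
toy_corpus = [
    (["a", "b", "a"], ["x", "y"]),
    (["a", "c"], ["x", "x", "z"]),
]

print(joint_probability("a", "x", toy_corpus))  # 1/3
```

Here the numerator is 2 + 2 = 4 and the denominator is 3 × 2 + 2 × 3 = 12, so the estimate is 1/3. Exact fractions are used so that the result can be compared directly with hand calculations.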
...@@ -228,11 +228,11 @@ ...@@ -228,11 +228,11 @@
\label{example3-2} \label{example3-2}
\end{example} \end{example}
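\parinterval Here is an example of computing word translation probabilities over multiple sentences. Example \ref{example3-2} shows a parallel corpus consisting of two sentence pairs. We use $\mathbf{s}^1$ and $\mathbf{s}^2$ to denote the source sentences of the first and second sentence pairs, and $\mathbf{t}^1$ and $\mathbf{t}^2$ to denote the corresponding target sentences. The translation probability of ``翻译'' and ``translation'' is then \parinterval Here is an example of computing word translation probabilities over multiple sentences. Example \ref{example3-2} shows a parallel corpus consisting of two sentence pairs. We use $\mathbf{s}^{[1]}$ and $\mathbf{s}^{[2]}$ to denote the source-language sentences of the first and second sentence pairs, and $\mathbf{t}^{[1]}$ and $\mathbf{t}^{[2]}$ to denote the corresponding target-language sentences. The translation probability of ``翻译'' and ``translation'' is then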
{\small {\small
\begin{eqnarray} \begin{eqnarray}
{\textrm{P}(\textrm{``翻译''},\textrm{``translation''})} & = & {\frac{c(\textrm{``翻译''},\textrm{``translation''};\mathbf{s}^{1},\mathbf{t}^{1})+c(\textrm{``翻译''},\textrm{``translation''};\mathbf{s}^{2},\mathbf{t}^{2})}{\sum_{x',y'} c(x',y';\mathbf{s}^{1},\mathbf{t}^{1}) + \sum_{x',y'} c(x',y';\mathbf{s}^{2},\mathbf{t}^{2})}} \nonumber \\ {\textrm{P}(\textrm{``翻译''},\textrm{``translation''})} & = & {\frac{c(\textrm{``翻译''},\textrm{``translation''};\mathbf{s}^{[1]},\mathbf{t}^{[1]})+c(\textrm{``翻译''},\textrm{``translation''};\mathbf{s}^{[2]},\mathbf{t}^{[2]})}{\sum_{x',y'} c(x',y';\mathbf{s}^{[1]},\mathbf{t}^{[1]}) + \sum_{x',y'} c(x',y';\mathbf{s}^{[2]},\mathbf{t}^{[2]})}} \nonumber \\
& = & \frac{4 + 1}{|\mathbf{s}^{1}| \times |\mathbf{t}^{1}| + |\mathbf{s}^{2}| \times |\mathbf{t}^{2}|} \nonumber \\ & = & \frac{4 + 1}{|\mathbf{s}^{[1]}| \times |\mathbf{t}^{[1]}| + |\mathbf{s}^{[2]}| \times |\mathbf{t}^{[2]}|} \nonumber \\
& = & \frac{4 + 1}{9 \times 7 + 5 \times 7} \nonumber \\ & = & \frac{4 + 1}{9 \times 7 + 5 \times 7} \nonumber \\
& = & \frac{5}{98} & = & \frac{5}{98}
\label{eqC3.6-new} \label{eqC3.6-new}
......
...@@ -99,110 +99,4 @@ ...@@ -99,110 +99,4 @@
\indexentry{Chapter5.2|hyperpage}{134} \indexentry{Chapter5.2|hyperpage}{134}
\indexentry{Chapter5.2.1|hyperpage}{134} \indexentry{Chapter5.2.1|hyperpage}{134}
\indexentry{Chapter5.2.1.1|hyperpage}{135} \indexentry{Chapter5.2.1.1|hyperpage}{135}
\indexentry{Chapter5.2.1.2|hyperpage}{136} \in
\indexentry{Chapter5.2.1.3|hyperpage}{136} \ No newline at end of file
\indexentry{Chapter5.2.1.4|hyperpage}{137}
\indexentry{Chapter5.2.1.5|hyperpage}{138}
\indexentry{Chapter5.2.1.6|hyperpage}{139}
\indexentry{Chapter5.2.2|hyperpage}{140}
\indexentry{Chapter5.2.2.1|hyperpage}{141}
\indexentry{Chapter5.2.2.2|hyperpage}{141}
\indexentry{Chapter5.2.2.3|hyperpage}{142}
\indexentry{Chapter5.2.2.4|hyperpage}{143}
\indexentry{Chapter5.2.3|hyperpage}{144}
\indexentry{Chapter5.2.3.1|hyperpage}{144}
\indexentry{Chapter5.2.3.2|hyperpage}{146}
\indexentry{Chapter5.2.4|hyperpage}{148}
\indexentry{Chapter5.3|hyperpage}{151}
\indexentry{Chapter5.3.1|hyperpage}{151}
\indexentry{Chapter5.3.1.1|hyperpage}{151}
\indexentry{Chapter5.3.1.2|hyperpage}{153}
\indexentry{Chapter5.3.1.3|hyperpage}{154}
\indexentry{Chapter5.3.2|hyperpage}{155}
\indexentry{Chapter5.3.3|hyperpage}{156}
\indexentry{Chapter5.3.4|hyperpage}{160}
\indexentry{Chapter5.3.5|hyperpage}{161}
\indexentry{Chapter5.4|hyperpage}{162}
\indexentry{Chapter5.4.1|hyperpage}{163}
\indexentry{Chapter5.4.2|hyperpage}{164}
\indexentry{Chapter5.4.2.1|hyperpage}{165}
\indexentry{Chapter5.4.2.2|hyperpage}{167}
\indexentry{Chapter5.4.2.3|hyperpage}{169}
\indexentry{Chapter5.4.3|hyperpage}{172}
\indexentry{Chapter5.4.4|hyperpage}{174}
\indexentry{Chapter5.4.4.1|hyperpage}{174}
\indexentry{Chapter5.4.4.2|hyperpage}{175}
\indexentry{Chapter5.4.4.3|hyperpage}{175}
\indexentry{Chapter5.4.5|hyperpage}{177}
\indexentry{Chapter5.4.6|hyperpage}{178}
\indexentry{Chapter5.4.6.1|hyperpage}{179}
\indexentry{Chapter5.4.6.2|hyperpage}{181}
\indexentry{Chapter5.4.6.3|hyperpage}{182}
\indexentry{Chapter5.5|hyperpage}{184}
\indexentry{Chapter5.5.1|hyperpage}{184}
\indexentry{Chapter5.5.1.1|hyperpage}{185}
\indexentry{Chapter5.5.1.2|hyperpage}{187}
\indexentry{Chapter5.5.1.3|hyperpage}{188}
\indexentry{Chapter5.5.1.4|hyperpage}{189}
\indexentry{Chapter5.5.2|hyperpage}{190}
\indexentry{Chapter5.5.2.1|hyperpage}{190}
\indexentry{Chapter5.5.2.2|hyperpage}{190}
\indexentry{Chapter5.5.3|hyperpage}{192}
\indexentry{Chapter5.5.3.1|hyperpage}{192}
\indexentry{Chapter5.5.3.2|hyperpage}{194}
\indexentry{Chapter5.5.3.3|hyperpage}{194}
\indexentry{Chapter5.5.3.4|hyperpage}{195}
\indexentry{Chapter5.5.3.5|hyperpage}{196}
\indexentry{Chapter5.6|hyperpage}{196}
\indexentry{Chapter6.1|hyperpage}{199}
\indexentry{Chapter6.1.1|hyperpage}{201}
\indexentry{Chapter6.1.2|hyperpage}{203}
\indexentry{Chapter6.1.3|hyperpage}{206}
\indexentry{Chapter6.2|hyperpage}{208}
\indexentry{Chapter6.2.1|hyperpage}{208}
\indexentry{Chapter6.2.2|hyperpage}{209}
\indexentry{Chapter6.2.3|hyperpage}{210}
\indexentry{Chapter6.2.4|hyperpage}{211}
\indexentry{Chapter6.3|hyperpage}{212}
\indexentry{Chapter6.3.1|hyperpage}{214}
\indexentry{Chapter6.3.2|hyperpage}{216}
\indexentry{Chapter6.3.3|hyperpage}{220}
\indexentry{Chapter6.3.3.1|hyperpage}{220}
\indexentry{Chapter6.3.3.2|hyperpage}{220}
\indexentry{Chapter6.3.3.3|hyperpage}{222}
\indexentry{Chapter6.3.3.4|hyperpage}{223}
\indexentry{Chapter6.3.3.5|hyperpage}{225}
\indexentry{Chapter6.3.4|hyperpage}{225}
\indexentry{Chapter6.3.4.1|hyperpage}{226}
\indexentry{Chapter6.3.4.2|hyperpage}{227}
\indexentry{Chapter6.3.4.3|hyperpage}{230}
\indexentry{Chapter6.3.5|hyperpage}{232}
\indexentry{Chapter6.3.5.1|hyperpage}{233}
\indexentry{Chapter6.3.5.2|hyperpage}{233}
\indexentry{Chapter6.3.5.3|hyperpage}{234}
\indexentry{Chapter6.3.5.4|hyperpage}{234}
\indexentry{Chapter6.3.5.5|hyperpage}{235}
\indexentry{Chapter6.3.5.5|hyperpage}{236}
\indexentry{Chapter6.3.6|hyperpage}{237}
\indexentry{Chapter6.3.6.1|hyperpage}{239}
\indexentry{Chapter6.3.6.2|hyperpage}{240}
\indexentry{Chapter6.3.6.3|hyperpage}{241}
\indexentry{Chapter6.3.7|hyperpage}{242}
\indexentry{Chapter6.4|hyperpage}{244}
\indexentry{Chapter6.4.1|hyperpage}{245}
\indexentry{Chapter6.4.2|hyperpage}{246}
\indexentry{Chapter6.4.3|hyperpage}{249}
\indexentry{Chapter6.4.4|hyperpage}{251}
\indexentry{Chapter6.4.5|hyperpage}{252}
\indexentry{Chapter6.4.6|hyperpage}{253}
\indexentry{Chapter6.4.7|hyperpage}{255}
\indexentry{Chapter6.4.8|hyperpage}{256}
\indexentry{Chapter6.4.9|hyperpage}{257}
\indexentry{Chapter6.4.10|hyperpage}{260}
\indexentry{Chapter6.5|hyperpage}{260}
\indexentry{Chapter6.5.1|hyperpage}{261}
\indexentry{Chapter6.5.2|hyperpage}{261}
\indexentry{Chapter6.5.3|hyperpage}{262}
\indexentry{Chapter6.5.4|hyperpage}{262}
\indexentry{Chapter6.5.5|hyperpage}{263}
\indexentry{Chapter6.6|hyperpage}{264}
\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax \boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax
\babel@toc {english}{}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {I}{机器翻译基础}}{9}{part.1}% \select@language {english}
\defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {I}{机器翻译基础}}{9}{part.1}
\ttl@starttoc {default@1} \ttl@starttoc {default@1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {1}机器翻译简介}{11}{chapter.1}% \contentsline {chapter}{\numberline {1}机器翻译简介}{11}{chapter.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.1}机器翻译的概念}{11}{section.1.1}% \contentsline {section}{\numberline {1.1}机器翻译的概念}{11}{section.1.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.2}机器翻译简史}{14}{section.1.2}% \contentsline {section}{\numberline {1.2}机器翻译简史}{14}{section.1.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.3}机器翻译现状}{19}{section.1.3}% \contentsline {section}{\numberline {1.3}机器翻译现状}{19}{section.1.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.4}机器翻译方法}{20}{section.1.4}% \contentsline {section}{\numberline {1.4}机器翻译方法}{20}{section.1.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.1}基于规则的机器翻译}{20}{subsection.1.4.1}% \contentsline {subsection}{\numberline {1.4.1}基于规则的机器翻译}{20}{subsection.1.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.2}基于实例的机器翻译}{22}{subsection.1.4.2}% \contentsline {subsection}{\numberline {1.4.2}基于实例的机器翻译}{22}{subsection.1.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.3}统计机器翻译}{23}{subsection.1.4.3}% \contentsline {subsection}{\numberline {1.4.3}统计机器翻译}{23}{subsection.1.4.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.4}神经机器翻译}{24}{subsection.1.4.4}% \contentsline {subsection}{\numberline {1.4.4}神经机器翻译}{24}{subsection.1.4.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.5}对比分析}{25}{subsection.1.4.5}% \contentsline {subsection}{\numberline {1.4.5}对比分析}{25}{subsection.1.4.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.5}翻译质量评价}{25}{section.1.5}% \contentsline {section}{\numberline {1.5}翻译质量评价}{25}{section.1.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.5.1}人工评价}{26}{subsection.1.5.1}% \contentsline {subsection}{\numberline {1.5.1}人工评价}{26}{subsection.1.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.5.2}自动评价}{27}{subsection.1.5.2}% \contentsline {subsection}{\numberline {1.5.2}自动评价}{27}{subsection.1.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{BLEU}{27}{section*.15}% \contentsline {subsubsection}{BLEU}{27}{section*.15}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{TER}{28}{section*.16}% \contentsline {subsubsection}{TER}{28}{section*.16}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于检测点的评价}{29}{section*.17}% \contentsline {subsubsection}{基于检测点的评价}{29}{section*.17}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.6}机器翻译应用}{30}{section.1.6}% \contentsline {section}{\numberline {1.6}机器翻译应用}{30}{section.1.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.7}开源项目与评测}{32}{section.1.7}% \contentsline {section}{\numberline {1.7}开源项目与评测}{32}{section.1.7}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.7.1}开源机器翻译系统}{33}{subsection.1.7.1}% \contentsline {subsection}{\numberline {1.7.1}开源机器翻译系统}{33}{subsection.1.7.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{统计机器翻译开源系统}{33}{section*.19}% \contentsline {subsubsection}{统计机器翻译开源系统}{33}{section*.19}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{神经机器翻译开源系统}{34}{section*.20}% \contentsline {subsubsection}{神经机器翻译开源系统}{34}{section*.20}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.7.2}常用数据集及公开评测任务}{36}{subsection.1.7.2}% \contentsline {subsection}{\numberline {1.7.2}常用数据集及公开评测任务}{36}{subsection.1.7.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.8}推荐学习资源}{39}{section.1.8}% \contentsline {section}{\numberline {1.8}推荐学习资源}{39}{section.1.8}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {2}词法、语法及统计建模基础}{43}{chapter.2}% \contentsline {chapter}{\numberline {2}词法、语法及统计建模基础}{43}{chapter.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.1}问题概述 }{44}{section.2.1}% \contentsline {section}{\numberline {2.1}问题概述 }{44}{section.2.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.2}概率论基础}{45}{section.2.2}% \contentsline {section}{\numberline {2.2}概率论基础}{45}{section.2.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.1}随机变量和概率}{46}{subsection.2.2.1}% \contentsline {subsection}{\numberline {2.2.1}随机变量和概率}{46}{subsection.2.2.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.2}联合概率、条件概率和边缘概率}{47}{subsection.2.2.2}% \contentsline {subsection}{\numberline {2.2.2}联合概率、条件概率和边缘概率}{47}{subsection.2.2.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.3}链式法则}{48}{subsection.2.2.3}% \contentsline {subsection}{\numberline {2.2.3}链式法则}{48}{subsection.2.2.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.4}贝叶斯法则}{49}{subsection.2.2.4}% \contentsline {subsection}{\numberline {2.2.4}贝叶斯法则}{49}{subsection.2.2.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.5}KL距离和熵}{51}{subsection.2.2.5}% \contentsline {subsection}{\numberline {2.2.5}KL距离和熵}{51}{subsection.2.2.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)信息熵}{51}{section*.27}% \contentsline {subsubsection}{(一)信息熵}{51}{section*.27}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)KL距离}{52}{section*.29}% \contentsline {subsubsection}{(二)KL距离}{52}{section*.29}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)交叉熵}{53}{section*.30}% \contentsline {subsubsection}{(三)交叉熵}{53}{section*.30}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.3}中文分词}{53}{section.2.3}% \contentsline {section}{\numberline {2.3}中文分词}{53}{section.2.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.3.1}基于词典的分词方法}{54}{subsection.2.3.1}% \contentsline {subsection}{\numberline {2.3.1}基于词典的分词方法}{54}{subsection.2.3.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.3.2}基于统计的分词方法}{55}{subsection.2.3.2}% \contentsline {subsection}{\numberline {2.3.2}基于统计的分词方法}{55}{subsection.2.3.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{统计模型的学习与推断}{56}{section*.34}% \contentsline {subsubsection}{统计模型的学习与推断}{56}{section*.34}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{掷骰子游戏}{56}{section*.36}% \contentsline {subsubsection}{掷骰子游戏}{56}{section*.36}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{全概率分词方法}{58}{section*.40}% \contentsline {subsubsection}{全概率分词方法}{58}{section*.40}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.4}$n$-gram语言模型 }{61}{section.2.4}% \contentsline {section}{\numberline {2.4}$n$-gram语言模型 }{61}{section.2.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.4.1}建模}{61}{subsection.2.4.1}% \contentsline {subsection}{\numberline {2.4.1}建模}{61}{subsection.2.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.4.2}未登录词和平滑算法}{63}{subsection.2.4.2}% \contentsline {subsection}{\numberline {2.4.2}未登录词和平滑算法}{63}{subsection.2.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{加法平滑方法}{64}{section*.47}% \contentsline {subsubsection}{加法平滑方法}{64}{section*.47}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{古德-图灵估计法}{65}{section*.49}% \contentsline {subsubsection}{古德-图灵估计法}{65}{section*.49}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{Kneser-Ney平滑方法}{66}{section*.51}% \contentsline {subsubsection}{Kneser-Ney平滑方法}{66}{section*.51}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.5}句法分析(短语结构)}{68}{section.2.5}% \contentsline {section}{\numberline {2.5}句法分析(短语结构分析)}{68}{section.2.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.1}句子的句法树表示}{68}{subsection.2.5.1}% \contentsline {subsection}{\numberline {2.5.1}句子的句法树表示}{68}{subsection.2.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.2}上下文无关文法}{70}{subsection.2.5.2}% \contentsline {subsection}{\numberline {2.5.2}上下文无关文法}{70}{subsection.2.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.3}规则和推导的概率}{75}{subsection.2.5.3}% \contentsline {subsection}{\numberline {2.5.3}规则和推导的概率}{75}{subsection.2.5.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.6}小结及深入阅读}{77}{section.2.6}% \contentsline {section}{\numberline {2.6}小结及深入阅读}{77}{section.2.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {II}{统计机器翻译}}{79}{part.2}% \contentsline {part}{\@mypartnumtocformat {II}{统计机器翻译}}{79}{part.2}
\ttl@stoptoc {default@1} \ttl@stoptoc {default@1}
\ttl@starttoc {default@2} \ttl@starttoc {default@2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {3}基于词的机器翻译模型}{81}{chapter.3}% \contentsline {chapter}{\numberline {3}基于词的机器翻译模型}{81}{chapter.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.1}什么是基于词的翻译模型}{81}{section.3.1}% \contentsline {section}{\numberline {3.1}什么是基于词的翻译模型}{81}{section.3.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.2}构建一个简单的机器翻译系统}{83}{section.3.2}% \contentsline {section}{\numberline {3.2}构建一个简单的机器翻译系统}{83}{section.3.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.1}如何进行翻译?}{83}{subsection.3.2.1}% \contentsline {subsection}{\numberline {3.2.1}如何进行翻译?}{83}{subsection.3.2.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)机器翻译流程}{84}{section*.66}% \contentsline {subsubsection}{(二)机器翻译流程}{84}{section*.66}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)人工 vs. 机器}{85}{section*.68}% \contentsline {subsubsection}{(三)人工 vs. 机器}{85}{section*.68}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.2}基本框架}{85}{subsection.3.2.2}% \contentsline {subsection}{\numberline {3.2.2}基本框架}{85}{subsection.3.2.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.3}单词翻译概率}{86}{subsection.3.2.3}% \contentsline {subsection}{\numberline {3.2.3}单词翻译概率}{86}{subsection.3.2.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)什么是单词翻译概率?}{86}{section*.70}% \contentsline {subsubsection}{(一)什么是单词翻译概率?}{86}{section*.70}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)如何从一个双语平行数据中学习?}{87}{section*.72}% \contentsline {subsubsection}{(二)如何从一个双语平行数据中学习?}{87}{section*.72}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)如何从大量的双语平行数据中学习?}{88}{section*.73}% \contentsline {subsubsection}{(三)如何从大量的双语平行数据中学习?}{88}{section*.73}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.4}句子级翻译模型}{89}{subsection.3.2.4}% \contentsline {subsection}{\numberline {3.2.4}句子级翻译模型}{89}{subsection.3.2.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)句子级翻译的基础模型}{89}{section*.75}% \contentsline {subsubsection}{(一)句子级翻译的基础模型}{89}{section*.75}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)生成流畅的译文}{91}{section*.77}% \contentsline {subsubsection}{(二)生成流畅的译文}{91}{section*.77}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.5}解码}{92}{subsection.3.2.5}% \contentsline {subsection}{\numberline {3.2.5}解码}{92}{subsection.3.2.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.3}基于词的翻译建模}{95}{section.3.3}% \contentsline {section}{\numberline {3.3}基于词的翻译建模}{95}{section.3.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.3.1}噪声信道模型}{95}{subsection.3.3.1}% \contentsline {subsection}{\numberline {3.3.1}噪声信道模型}{95}{subsection.3.3.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.3.2}统计机器翻译的三个基本问题}{98}{subsection.3.3.2}% \contentsline {subsection}{\numberline {3.3.2}统计机器翻译的三个基本问题}{98}{subsection.3.3.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{词对齐}{99}{section*.86}% \contentsline {subsubsection}{词对齐}{99}{section*.86}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于词对齐的翻译模型}{100}{section*.89}% \contentsline {subsubsection}{基于词对齐的翻译模型}{100}{section*.89}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于词对齐的翻译实例}{101}{section*.91}% \contentsline {subsubsection}{基于词对齐的翻译实例}{101}{section*.91}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.4}IBM模型1-2}{102}{section.3.4}% \contentsline {section}{\numberline {3.4}IBM模型1-2}{102}{section.3.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.1}IBM模型1}{102}{subsection.3.4.1}% \contentsline {subsection}{\numberline {3.4.1}IBM模型1}{102}{subsection.3.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.2}IBM模型2}{104}{subsection.3.4.2}% \contentsline {subsection}{\numberline {3.4.2}IBM模型2}{104}{subsection.3.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.3}解码及计算优化}{105}{subsection.3.4.3}% \contentsline {subsection}{\numberline {3.4.3}解码及计算优化}{105}{subsection.3.4.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.4}训练}{106}{subsection.3.4.4}% \contentsline {subsection}{\numberline {3.4.4}训练}{106}{subsection.3.4.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)目标函数}{106}{section*.96}% \contentsline {subsubsection}{(一)目标函数}{106}{section*.96}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)优化}{107}{section*.98}% \contentsline {subsubsection}{(二)优化}{107}{section*.98}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.5}IBM模型3-5及隐马尔可夫模型}{112}{section.3.5}% \contentsline {section}{\numberline {3.5}IBM模型3-5及隐马尔可夫模型}{112}{section.3.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.1}基于产出率的翻译模型}{113}{subsection.3.5.1}% \contentsline {subsection}{\numberline {3.5.1}基于产出率的翻译模型}{113}{subsection.3.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.2}IBM 模型3}{115}{subsection.3.5.2}% \contentsline {subsection}{\numberline {3.5.2}IBM 模型3}{115}{subsection.3.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.3}IBM 模型4}{117}{subsection.3.5.3}% \contentsline {subsection}{\numberline {3.5.3}IBM 模型4}{117}{subsection.3.5.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.4} IBM 模型5}{118}{subsection.3.5.4}% \contentsline {subsection}{\numberline {3.5.4} IBM 模型5}{118}{subsection.3.5.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.5}隐马尔可夫模型}{120}{subsection.3.5.5}% \contentsline {subsection}{\numberline {3.5.5}隐马尔可夫模型}{120}{subsection.3.5.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{隐马尔可夫模型}{120}{section*.110}% \contentsline {subsubsection}{隐马尔可夫模型}{120}{section*.110}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{词对齐模型}{121}{section*.112}% \contentsline {subsubsection}{词对齐模型}{121}{section*.112}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.6}解码和训练}{122}{subsection.3.5.6}% \contentsline {subsection}{\numberline {3.5.6}解码和训练}{122}{subsection.3.5.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.6}问题分析}{123}{section.3.6}% \contentsline {section}{\numberline {3.6}问题分析}{123}{section.3.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.1}词对齐及对称化}{123}{subsection.3.6.1}% \contentsline {subsection}{\numberline {3.6.1}词对齐及对称化}{123}{subsection.3.6.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.2}Deficiency}{124}{subsection.3.6.2}% \contentsline {subsection}{\numberline {3.6.2}Deficiency}{124}{subsection.3.6.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.3}句子长度}{125}{subsection.3.6.3}% \contentsline {subsection}{\numberline {3.6.3}句子长度}{125}{subsection.3.6.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.4}其它问题}{125}{subsection.3.6.4}% \contentsline {subsection}{\numberline {3.6.4}其它问题}{125}{subsection.3.6.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.7}小结及深入阅读}{125}{section.3.7}% \contentsline {section}{\numberline {3.7}小结及深入阅读}{125}{section.3.7}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {III}{神经机器翻译}}{127}{part.3}% \contentsline {part}{\@mypartnumtocformat {III}{神经机器翻译}}{127}{part.3}
\ttl@stoptoc {default@2} \ttl@stoptoc {default@2}
\ttl@starttoc {default@3} \ttl@starttoc {default@3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {4}人工神经网络和神经语言建模}{129}{chapter.4}% \contentsline {chapter}{\numberline {4}人工神经网络和神经语言建模}{129}{chapter.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.1}深度学习与人工神经网络}{130}{section.4.1}% \contentsline {section}{\numberline {4.1}深度学习与人工神经网络}{130}{section.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.1.1}发展简史}{130}{subsection.4.1.1}% \contentsline {subsection}{\numberline {4.1.1}发展简史}{130}{subsection.4.1.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)早期的人工神经网络和第一次寒冬}{130}{section*.114}% \contentsline {subsubsection}{(一)早期的人工神经网络和第一次寒冬}{130}{section*.114}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)神经网络的第二次高潮和第二次寒冬}{131}{section*.115}% \contentsline {subsubsection}{(二)神经网络的第二次高潮和第二次寒冬}{131}{section*.115}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)深度学习和神经网络的崛起}{132}{section*.116}% \contentsline {subsubsection}{(三)深度学习和神经网络的崛起}{132}{section*.116}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.1.2}为什么需要深度学习}{133}{subsection.4.1.2}% \contentsline {subsection}{\numberline {4.1.2}为什么需要深度学习}{133}{subsection.4.1.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)端到端学习和表示学习}{133}{section*.118}% \contentsline {subsubsection}{(一)端到端学习和表示学习}{133}{section*.118}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)深度学习的效果}{134}{section*.120}% \contentsline {subsubsection}{(二)深度学习的效果}{134}{section*.120}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.2}神经网络基础}{134}{section.4.2}% \contentsline {section}{\numberline {4.2}神经网络基础}{134}{section.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.1}线性代数基础}{134}{subsection.4.2.1}% \contentsline {subsection}{\numberline {4.2.1}线性代数基础}{134}{subsection.4.2.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)标量、向量和矩阵}{135}{section*.122}% \contentsline {subsubsection}{(一)标量、向量和矩阵}{135}{section*.122}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)矩阵的转置}{136}{section*.123}% \contentsline {subsubsection}{(二)矩阵的转置}{136}{section*.123}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)矩阵加法和数乘}{136}{section*.124}% \contentsline {subsubsection}{(三)矩阵加法和数乘}{136}{section*.124}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)矩阵乘法和矩阵点乘}{137}{section*.125}% \contentsline {subsubsection}{(四)矩阵乘法和矩阵点乘}{137}{section*.125}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(五)线性映射}{138}{section*.126}% \contentsline {subsubsection}{(五)线性映射}{138}{section*.126}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(六)范数}{139}{section*.127}% \contentsline {subsubsection}{(六)范数}{139}{section*.127}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.2}人工神经元和感知机}{140}{subsection.4.2.2}% \contentsline {subsection}{\numberline {4.2.2}人工神经元和感知机}{140}{subsection.4.2.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)感知机\ \raisebox {0.5mm}{------}\ 最简单的人工神经元模型}{141}{section*.130}% \contentsline {subsubsection}{(一)感知机\ \raisebox {0.5mm}{------}\ 最简单的人工神经元模型}{141}{section*.130}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)神经元内部权重}{141}{section*.133}% \contentsline {subsubsection}{(二)神经元内部权重}{141}{section*.133}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)神经元的输入\ \raisebox {0.5mm}{------}\ 离散 vs 连续}{142}{section*.135}% \contentsline {subsubsection}{(三)神经元的输入\ \raisebox {0.5mm}{------}\ 离散 vs 连续}{142}{section*.135}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)神经元内部的参数学习}{143}{section*.137}% \contentsline {subsubsection}{(四)神经元内部的参数学习}{143}{section*.137}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.3}多层神经网络}{144}{subsection.4.2.3}% \contentsline {subsection}{\numberline {4.2.3}多层神经网络}{144}{subsection.4.2.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)线性变换和激活函数}{144}{section*.139}% \contentsline {subsubsection}{(一)线性变换和激活函数}{144}{section*.139}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)单层神经网络$\rightarrow $多层神经网络}{146}{section*.146}% \contentsline {subsubsection}{(二)单层神经网络$\rightarrow $多层神经网络}{146}{section*.146}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.4}函数拟合能力}{148}{subsection.4.2.4}% \contentsline {subsection}{\numberline {4.2.4}函数拟合能力}{148}{subsection.4.2.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.3}神经网络的张量实现}{151}{section.4.3}% \contentsline {section}{\numberline {4.3}神经网络的张量实现}{151}{section.4.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.1}张量及其计算}{151}{subsection.4.3.1}% \contentsline {subsection}{\numberline {4.3.1}张量及其计算}{151}{subsection.4.3.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)张量}{151}{section*.156}% \contentsline {subsubsection}{(一)张量}{151}{section*.156}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)张量的矩阵乘法}{153}{section*.159}% \contentsline {subsubsection}{(二)张量的矩阵乘法}{153}{section*.159}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)张量的单元操作}{154}{section*.161}% \contentsline {subsubsection}{(三)张量的单元操作}{154}{section*.161}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.2}张量的物理存储形式}{155}{subsection.4.3.2}% \contentsline {subsection}{\numberline {4.3.2}张量的物理存储形式}{155}{subsection.4.3.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.3}使用开源框架实现张量计算}{156}{subsection.4.3.3}% \contentsline {subsection}{\numberline {4.3.3}使用开源框架实现张量计算}{156}{subsection.4.3.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.4}神经网络中的前向传播}{160}{subsection.4.3.4}% \contentsline {subsection}{\numberline {4.3.4}神经网络中的前向传播}{160}{subsection.4.3.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.5}神经网络实例}{161}{subsection.4.3.5}% \contentsline {subsection}{\numberline {4.3.5}神经网络实例}{161}{subsection.4.3.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.4}神经网络的参数训练}{162}{section.4.4}% \contentsline {section}{\numberline {4.4}神经网络的参数训练}{162}{section.4.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.1}损失函数}{163}{subsection.4.4.1}% \contentsline {subsection}{\numberline {4.4.1}损失函数}{163}{subsection.4.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.2}基于梯度的参数优化}{164}{subsection.4.4.2}% \contentsline {subsection}{\numberline {4.4.2}基于梯度的参数优化}{164}{subsection.4.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)梯度下降}{165}{section*.179}% \contentsline {subsubsection}{(一)梯度下降}{165}{section*.179}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)梯度获取}{167}{section*.181}% \contentsline {subsubsection}{(二)梯度获取}{167}{section*.181}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)基于梯度的方法的变种和改进}{169}{section*.185}% \contentsline {subsubsection}{(三)基于梯度的方法的变种和改进}{169}{section*.185}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.3}参数更新的并行化策略}{172}{subsection.4.4.3}% \contentsline {subsection}{\numberline {4.4.3}参数更新的并行化策略}{172}{subsection.4.4.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.4}梯度消失、梯度爆炸和稳定性训练}{174}{subsection.4.4.4}% \contentsline {subsection}{\numberline {4.4.4}梯度消失、梯度爆炸和稳定性训练}{174}{subsection.4.4.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)梯度消失现象及解决方法}{174}{section*.188}% \contentsline {subsubsection}{(一)梯度消失现象及解决方法}{174}{section*.188}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)梯度爆炸现象及解决方法}{175}{section*.192}% \contentsline {subsubsection}{(二)梯度爆炸现象及解决方法}{175}{section*.192}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)稳定性训练}{175}{section*.193}% \contentsline {subsubsection}{(三)稳定性训练}{175}{section*.193}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.5}过拟合}{177}{subsection.4.4.5}% \contentsline {subsection}{\numberline {4.4.5}过拟合}{177}{subsection.4.4.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.6}反向传播}{178}{subsection.4.4.6}% \contentsline {subsection}{\numberline {4.4.6}反向传播}{178}{subsection.4.4.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)输出层的反向传播}{179}{section*.196}% \contentsline {subsubsection}{(一)输出层的反向传播}{179}{section*.196}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)隐藏层的反向传播}{181}{section*.200}% \contentsline {subsubsection}{(二)隐藏层的反向传播}{181}{section*.200}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)程序实现}{182}{section*.203}% \contentsline {subsubsection}{(三)程序实现}{182}{section*.203}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.5}神经语言模型}{184}{section.4.5}% \contentsline {section}{\numberline {4.5}神经语言模型}{184}{section.4.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.5.1}基于神经网络的语言建模}{184}{subsection.4.5.1}% \contentsline {subsection}{\numberline {4.5.1}基于神经网络的语言建模}{184}{subsection.4.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)基于前馈神经网络的语言模型}{185}{section*.206}% \contentsline {subsubsection}{(一)基于前馈神经网络的语言模型}{185}{section*.206}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)基于循环神经网络的语言模型}{187}{section*.209}% \contentsline {subsubsection}{(二)基于循环神经网络的语言模型}{187}{section*.209}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)基于自注意力机制的语言模型}{188}{section*.211}% \contentsline {subsubsection}{(三)基于自注意力机制的语言模型}{188}{section*.211}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)语言模型的评价}{189}{section*.213}% \contentsline {subsubsection}{(四)语言模型的评价}{189}{section*.213}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.5.2}单词表示模型}{190}{subsection.4.5.2}% \contentsline {subsection}{\numberline {4.5.2}单词表示模型}{190}{subsection.4.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)One-hot编码}{190}{section*.214}% \contentsline {subsubsection}{(一)One-hot编码}{190}{section*.214}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)分布式表示}{190}{section*.216}% \contentsline {subsubsection}{(二)分布式表示}{190}{section*.216}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.5.3}句子表示模型及预训练}{192}{subsection.4.5.3}% \contentsline {subsection}{\numberline {4.5.3}句子表示模型及预训练}{192}{subsection.4.5.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)简单的上下文表示模型}{192}{section*.220}% \contentsline {subsubsection}{(一)简单的上下文表示模型}{192}{section*.220}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)ELMO模型}{194}{section*.223}% \contentsline {subsubsection}{(二)ELMO模型}{194}{section*.223}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)GPT模型}{194}{section*.225}% \contentsline {subsubsection}{(三)GPT模型}{194}{section*.225}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)BERT模型}{195}{section*.227}% \contentsline {subsubsection}{(四)BERT模型}{195}{section*.227}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(五)为什么要预训练?}{196}{section*.229}% \contentsline {subsubsection}{(五)为什么要预训练?}{196}{section*.229}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.6}小结及深入阅读}{196}{section.4.6}% \contentsline {section}{\numberline {4.6}小结及深入阅读}{196}{section.4.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {5}神经机器翻译模型}{199}{chapter.5}% \contentsline {chapter}{\numberline {5}神经机器翻译模型}{199}{chapter.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.1}神经机器翻译的发展简史}{199}{section.5.1}% \contentsline {section}{\numberline {5.1}神经机器翻译的发展简史}{199}{section.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.1.1}神经机器翻译的起源}{201}{subsection.5.1.1}% \contentsline {subsection}{\numberline {5.1.1}神经机器翻译的起源}{201}{subsection.5.1.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.1.2}神经机器翻译的品质 }{203}{subsection.5.1.2}% \contentsline {subsection}{\numberline {5.1.2}神经机器翻译的品质 }{203}{subsection.5.1.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.1.3}神经机器翻译的优势 }{206}{subsection.5.1.3}% \contentsline {subsection}{\numberline {5.1.3}神经机器翻译的优势 }{206}{subsection.5.1.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.2}编码器-解码器框架}{208}{section.5.2}% \contentsline {section}{\numberline {5.2}编码器-解码器框架}{208}{section.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.1}框架结构}{208}{subsection.5.2.1}% \contentsline {subsection}{\numberline {5.2.1}框架结构}{208}{subsection.5.2.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.2}表示学习}{209}{subsection.5.2.2}% \contentsline {subsection}{\numberline {5.2.2}表示学习}{209}{subsection.5.2.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.3}简单的运行实例}{210}{subsection.5.2.3}% \contentsline {subsection}{\numberline {5.2.3}简单的运行实例}{210}{subsection.5.2.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.4}机器翻译范式的对比}{211}{subsection.5.2.4}% \contentsline {subsection}{\numberline {5.2.4}机器翻译范式的对比}{211}{subsection.5.2.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.3}基于循环神经网络的翻译模型及注意力机制}{212}{section.5.3}% \contentsline {section}{\numberline {5.3}基于循环神经网络的翻译模型及注意力机制}{212}{section.5.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.1}建模}{214}{subsection.5.3.1}% \contentsline {subsection}{\numberline {5.3.1}建模}{214}{subsection.5.3.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.2}输入(词嵌入)及输出(Softmax)}{216}{subsection.5.3.2}% \contentsline {subsection}{\numberline {5.3.2}输入(词嵌入)及输出(Softmax)}{216}{subsection.5.3.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.3}循环神经网络结构}{220}{subsection.5.3.3}% \contentsline {subsection}{\numberline {5.3.3}循环神经网络结构}{220}{subsection.5.3.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{循环神经单元(RNN)}{220}{section*.251}% \contentsline {subsubsection}{循环神经单元(RNN)}{220}{section*.251}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{长短时记忆网络(LSTM)}{220}{section*.252}% \contentsline {subsubsection}{长短时记忆网络(LSTM)}{220}{section*.252}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{门控循环单元(GRU)}{222}{section*.255}% \contentsline {subsubsection}{门控循环单元(GRU)}{222}{section*.255}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{双向模型}{223}{section*.257}% \contentsline {subsubsection}{双向模型}{223}{section*.257}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{多层循环神经网络}{225}{section*.259}% \contentsline {subsubsection}{多层循环神经网络}{225}{section*.259}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.4}注意力机制}{225}{subsection.5.3.4}% \contentsline {subsection}{\numberline {5.3.4}注意力机制}{225}{subsection.5.3.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{翻译中的注意力机制}{226}{section*.262}% \contentsline {subsubsection}{翻译中的注意力机制}{226}{section*.262}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{上下文向量的计算}{227}{section*.265}% \contentsline {subsubsection}{上下文向量的计算}{227}{section*.265}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{注意力机制的解读}{230}{section*.270}% \contentsline {subsubsection}{注意力机制的解读}{230}{section*.270}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.5}训练}{232}{subsection.5.3.5}% \contentsline {subsection}{\numberline {5.3.5}训练}{232}{subsection.5.3.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{损失函数}{233}{section*.273}% \contentsline {subsubsection}{损失函数}{233}{section*.273}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{参数初始化}{233}{section*.274}% \contentsline {subsubsection}{参数初始化}{233}{section*.274}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{优化策略}{234}{section*.275}% \contentsline {subsubsection}{优化策略}{234}{section*.275}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{梯度裁剪}{234}{section*.277}% \contentsline {subsubsection}{梯度裁剪}{234}{section*.277}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{学习率策略}{235}{section*.278}% \contentsline {subsubsection}{学习率策略}{235}{section*.278}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{并行训练}{236}{section*.281}% \contentsline {subsubsection}{并行训练}{236}{section*.281}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.6}推断}{237}{subsection.5.3.6}% \contentsline {subsection}{\numberline {5.3.6}推断}{237}{subsection.5.3.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{贪婪搜索}{239}{section*.285}% \contentsline {subsubsection}{贪婪搜索}{239}{section*.285}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{束搜索}{240}{section*.288}% \contentsline {subsubsection}{束搜索}{240}{section*.288}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsubsection}{长度惩罚}{241}{section*.290}% \contentsline {subsubsection}{长度惩罚}{241}{section*.290}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.7}实例-GNMT}{242}{subsection.5.3.7}% \contentsline {subsection}{\numberline {5.3.7}实例-GNMT}{242}{subsection.5.3.7}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.4}Transformer}{244}{section.5.4}% \contentsline {section}{\numberline {5.4}Transformer}{244}{section.5.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.1}自注意力模型}{245}{subsection.5.4.1}% \contentsline {subsection}{\numberline {5.4.1}自注意力模型}{245}{subsection.5.4.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.2}Transformer架构}{246}{subsection.5.4.2}% \contentsline {subsection}{\numberline {5.4.2}Transformer架构}{246}{subsection.5.4.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.3}位置编码}{249}{subsection.5.4.3}% \contentsline {subsection}{\numberline {5.4.3}位置编码}{249}{subsection.5.4.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.4}基于点乘的注意力机制}{251}{subsection.5.4.4}% \contentsline {subsection}{\numberline {5.4.4}基于点乘的注意力机制}{251}{subsection.5.4.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.5}掩码操作}{252}{subsection.5.4.5}% \contentsline {subsection}{\numberline {5.4.5}掩码操作}{252}{subsection.5.4.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.6}多头注意力}{253}{subsection.5.4.6}% \contentsline {subsection}{\numberline {5.4.6}多头注意力}{253}{subsection.5.4.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.7}残差网络和层正则化}{255}{subsection.5.4.7}% \contentsline {subsection}{\numberline {5.4.7}残差网络和层正则化}{255}{subsection.5.4.7}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.8}前馈全连接网络子层}{256}{subsection.5.4.8}% \contentsline {subsection}{\numberline {5.4.8}前馈全连接网络子层}{256}{subsection.5.4.8}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.9}训练}{257}{subsection.5.4.9}% \contentsline {subsection}{\numberline {5.4.9}训练}{257}{subsection.5.4.9}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.10}推断}{260}{subsection.5.4.10}% \contentsline {subsection}{\numberline {5.4.10}推断}{260}{subsection.5.4.10}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.5}序列到序列问题及应用}{260}{section.5.5}% \contentsline {section}{\numberline {5.5}序列到序列问题及应用}{260}{section.5.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.1}自动问答}{261}{subsection.5.5.1}% \contentsline {subsection}{\numberline {5.5.1}自动问答}{261}{subsection.5.5.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.2}自动文摘}{261}{subsection.5.5.2}% \contentsline {subsection}{\numberline {5.5.2}自动文摘}{261}{subsection.5.5.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.3}文言文翻译}{262}{subsection.5.5.3}% \contentsline {subsection}{\numberline {5.5.3}文言文翻译}{262}{subsection.5.5.3}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.4}对联生成}{262}{subsection.5.5.4}% \contentsline {subsection}{\numberline {5.5.4}对联生成}{262}{subsection.5.5.4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.5}古诗生成}{263}{subsection.5.5.5}% \contentsline {subsection}{\numberline {5.5.5}古诗生成}{263}{subsection.5.5.5}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.6}小结及深入阅读}{264}{section.5.6}% \contentsline {section}{\numberline {5.6}小结及深入阅读}{264}{section.5.6}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {IV}{附录}}{267}{part.4}% \contentsline {part}{\@mypartnumtocformat {IV}{附录}}{267}{part.4}
\ttl@stoptoc {default@3} \ttl@stoptoc {default@3}
\ttl@starttoc {default@4} \ttl@starttoc {default@4}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {A}附录A}{269}{appendix.1.A}% \contentsline {chapter}{\numberline {A}附录A}{269}{Appendix.1.A}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {B}附录B}{271}{appendix.2.B}% \contentsline {chapter}{\numberline {B}附录B}{271}{Appendix.2.B}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {B.1}IBM模型3训练方法}{271}{section.2.B.1}% \contentsline {section}{\numberline {B.1}IBM模型3训练方法}{271}{section.2.B.1}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {B.2}IBM模型4训练方法}{273}{section.2.B.2}% \contentsline {section}{\numberline {B.2}IBM模型4训练方法}{273}{section.2.B.2}
\defcounter {refsection}{0}\relax \defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {B.3}IBM模型5训练方法}{274}{section.2.B.3}% \contentsline {section}{\numberline {B.3}IBM模型5训练方法}{274}{section.2.B.3}
\contentsfinish \contentsfinish