Merge branch 'caorunzhe' into 'master'

Caorunzhe See merge request !410

Commit 75e4b93d authored Nov 16, 2020 by 曹润柘
--- a/Chapter16/Figures/figure-application-process-of-back-translation.tex
+++ b/Chapter16/Figures/figure-application-process-of-back-translation.tex
 \begin{tikzpicture}
-\begin{scope}
-\node [anchor=center] (node1) at (4.9,1) {\small{训练：}};
-\node [anchor=center] (node11) at (5.5,1) {};
+\tikzstyle{bignode} = [line width=0.6pt,draw=black,minimum width=6.3em,minimum height=2.2em,fill=blue!20,rounded corners=2pt]
-\node [anchor=center] (node12) at (6.7,1) {};
+\tikzstyle{middlenode} = [line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=blue!20,rounded corners=2pt]
-\node [anchor=center] (node2) at (4.9,0.5) {\small{推断：}};
-\node [anchor=center] (node21) at (5.5,0.5) {};
-\node [anchor=center] (node22) at (6.7,0.5) {};
+\node [anchor=center] (node1-1) at (0,0) {\scriptsize{汉语}};
-\node [anchor=west,line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=blue!20,rounded corners=2pt] (node1-1) at (0,0) {\footnotesize{双语数据}};
+\node [anchor=west] (node1-2) at ([xshift=1.0em]node1-1.east) {\scriptsize{英语}};
-\node [anchor=south,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=blue!20,rounded corners=2pt] (node1-2) at ([yshift=-5em]node1-1.south) {\footnotesize{目标语伪数据}};
+\node [anchor=north] (node1-3) at ([xshift=1.65em]node1-1.south) {\scriptsize{反向翻译模型}};
-\node [anchor=west,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=red!20,rounded corners=2pt] (node2-1) at ([xshift=-8.8em,yshift=-2.5em]node1-1.west) {\footnotesize{反向翻译系统}};
+\draw [->,line width=0.6pt](node1-1.east)--(node1-2.west);
-\node [anchor=west,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=red!20,rounded corners=2pt] (node3-1) at ([xshift=3em,yshift=-2.5em]node1-1.east) {\footnotesize{前向翻译系统}};
+\begin{pgfonlayer}{background}
-\draw [->,line width=1pt](node1-1.west)--([xshift=3em]node2-1.north);
+{
-\draw [->,line width=1pt](node1-1.east)--([xshift=-3em]node3-1.north);
+\node[fill=red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em] [fit =(node1-1)(node1-2)(node1-3)]  (remark1) {};
-\draw [->,line width=1pt](node1-2.east)--([xshift=-3em]node3-1.south);
+}
-\draw [->,line width=1pt](node11.east)--(node12.west);
+\end{pgfonlayer}
-\draw [->,line width=1pt,dashed](node21.east)--(node22.west);
-\draw [->,line width=1pt,dashed]([xshift=3em]node2-1.south)--(node1-2.west);
+\node [anchor=north](node2-1) at ([xshift=-1.93em,yshift=-1.95em]remark1.south){\scriptsize{汉语}};
-\end{scope}
+\node [anchor=north](node2-1-2) at (node2-1.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node2-1)(node2-1-2)]  (remark2-1) {};
+}
+\end{pgfonlayer}
+\node [anchor=west](node2-2) at ([xshift=0.82em,yshift=0.68em]remark2-1.east){\scriptsize{英语}};
+\node [anchor=north](node2-2-2) at (node2-2.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node2-2)(node2-2-2)]  (remark2-2) {};
+}
+\end{pgfonlayer}
+\draw [->,line width=0.6pt]([yshift=-2.0em]remark1.south)--(remark1.south) node [pos=0.5,right] (pos1) {\scriptsize{训练}};
+\node [anchor=west](node3-1) at ([xshift=5.0em,yshift=0.1em]node1-2.east){\scriptsize{汉语}};
+\node [anchor=north](node3-1-2) at (node3-1.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node3-1)(node3-1-2)]  (remark3-1) {};
+}
+\end{pgfonlayer}
+\node [anchor=north](node3-2) at ([yshift=-2.15em]remark3-1.south){\scriptsize{英语}};
+\node [anchor=north](node3-2-2) at (node3-2.south){\scriptsize{伪数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node3-2)(node3-2-2)]  (remark3-2) {};
+}
+\end{pgfonlayer}
+\draw [->,line width=0.6pt](remark3-1.south)--(remark3-2.north) node [pos=0.5,right] (pos2) {\scriptsize{翻译}};
+\begin{pgfonlayer}{background}
+{
+\node[rounded corners=2pt,inner sep=0.3em,draw=black,line width=0.6pt,dotted] [fit =(remark3-1)(remark3-2)]  (remark2) {};
+}
+\end{pgfonlayer}
+\draw [->,line width=0.6pt](remark1.east)--([yshift=2.40em]remark2.west) node [pos=0.5,above] (pos2-1) {\scriptsize{模型翻译}};
+\node [anchor=south](pos2-2) at ([yshift=-0.5em]pos2-1.north){\scriptsize{使用反向}};
+\draw[decorate,thick,decoration={brace,amplitude=5pt}] ([yshift=1.3em,xshift=1.5em]node3-1.east) -- ([yshift=-7.7em,xshift=1.5em]node3-1.east) node [pos=0.1,right,xshift=0.0em,yshift=0.0em] (label1) {\scriptsize{{混合}}};
+\node [anchor=west](node4-1) at ([xshift=3.5em,yshift=3.94em]node3-2.east){\scriptsize{英语}};
+\node [anchor=north](node4-1-2) at (node4-1.south){\scriptsize{伪数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-1)(node4-1-2)]  (remark4-1) {};
+}
+\end{pgfonlayer}
+\node [anchor=north](node4-2) at ([yshift=-1.59em]node4-1.south){\scriptsize{英语}};
+\node [anchor=north](node4-2-2) at (node4-2.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-2)(node4-2-2)]  (remark4-2) {};
+}
+\end{pgfonlayer}
+\node [anchor=west](node4-3) at ([xshift=1.7em]node4-2.east){\scriptsize{汉语}};
+\node [anchor=north](node4-3-2) at (node4-3.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-3)(node4-3-2)]  (remark4-3) {};
+}
+\end{pgfonlayer}
+\node [anchor=west](node4-4) at ([xshift=1.7em]node4-1.east){\scriptsize{汉语}};
+\node [anchor=north](node4-4-2) at (node4-4.south){\scriptsize{真实数据}};
+\begin{pgfonlayer}{background}
+{
+\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-4)(node4-4-2)]  (remark4-4) {};
+}
+\end{pgfonlayer}
+\node [anchor=center] (node5-1) at ([xshift=4.3em,yshift=-1.48em]node4-4.east) {\scriptsize{英语}};
+\node [anchor=west] (node5-2) at ([xshift=1.0em]node5-1.east) {\scriptsize{汉语}};
+\node [anchor=north] (node5-3) at ([xshift=1.65em]node5-1.south) {\scriptsize{正向翻译模型}};
+\draw [->,line width=0.6pt](node5-1.east)--(node5-2.west);
+\begin{pgfonlayer}{background}
+{
+\node[fill=red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em] [fit =(node5-1)(node5-2)(node5-3)]  (remark3) {};
+}
+\end{pgfonlayer}
+\draw [->,line width=0.6pt]([xshift=-2em]remark3.west)--(remark3.west) node [pos=0.5,above] (pos3) {\scriptsize{训练}};
 \end{tikzpicture}
\ No newline at end of file
--- a/Chapter9/chapter9.tex
+++ b/Chapter9/chapter9.tex
@@ -66,7 +66,7 @@
 \subsubsection{2. 神经网络的第二次高潮和第二次寒冬}
-\parinterval 虽然第一代神经网络受到了打击，但是20世纪80年代，第二代人工神经网络开始萌发新的生机。在这个发展阶段，生物属性已经不再是神经网络的唯一灵感来源，在{\small\bfnew{连接主义}}\index{连接主义}（Connectionism）\index{Connectionism}和分布式表示两种思潮的影响下，神经网络方法再次走入了人们的视线。
+\parinterval 虽然第一代神经网络受到了打击，但是在20世纪80年代，第二代人工神经网络开始萌发新的生机。在这个发展阶段，生物属性已经不再是神经网络的唯一灵感来源，在{\small\bfnew{连接主义}}\index{连接主义}（Connectionism）\index{Connectionism}和分布式表示两种思潮的影响下，神经网络方法再次走入了人们的视线。
 \vspace{0.3em}
 \parinterval （1）符号主义与连接主义
@@ -102,7 +102,7 @@
 \vspace{0.5em}
 \end{itemize}
-\parinterval 另外，从应用的角度，数据量的快速提升和模型容量的增加也为深度学习的成功提供了条件，数据量的增加使得深度学习有了用武之地，例如，2000年以来，无论在学术研究还是在工业实践中，双语数据的使用数量都在逐年上升（如图\ref{fig:9-1}所示）。现在的深度学习模型参数量都十分巨大，因此需要大规模数据才能保证模型学习的充分性，而大数据时代的到来为训练这样的模型提供了数据基础。
+\parinterval 另外，从应用的角度来看，数据量的快速提升和模型容量的增加也为深度学习的成功提供了条件，数据量的增加使得深度学习有了用武之地，例如，2000年以来，无论在学术研究还是在工业实践中，双语数据的使用数量都在逐年上升（如图\ref{fig:9-1}所示）。现在的深度学习模型参数量都十分巨大，因此需要大规模数据才能保证模型学习的充分性，而大数据时代的到来为训练这样的模型提供了数据基础。
 %----------------------------------------------------------------------
 \begin{figure}[htp]
@@ -142,7 +142,7 @@
 \begin{itemize}
 \vspace{0.5em}
-\item 特征的构造需要耗费大量的时间和精力。在传统机器学习的特征工程方法中，特征提取过程往往依赖于大量的先验假设，都基于人力完成的，这样导致相关系统的研发周期也大大增加；
+\item 特征的构造需要耗费大量的时间和精力。在传统机器学习的特征工程方法中，特征提取都是基于人力完成的，该过程往往依赖于大量的先验假设，会导致相关系统的研发周期也大大增加；
 \vspace{0.5em}
 \item 最终的系统性能强弱非常依赖特征的选择。有一句话在业界广泛流传：“数据和特征决定了机器学习的上限”，但是人的智力和认知是有限的，因此人工设计的特征的准确性和覆盖度会存在瓶颈；
 \vspace{0.5em}
@@ -150,7 +150,7 @@
 \vspace{0.5em}
 \end{itemize}
-\parinterval 端到端学习将人们从大量的特征提取工作之中解放出来，可以不需要太多人的先验知识。从某种意义上讲，对问题的特征提取完全是自动完成的，这也意味着哪怕系统开发者不是该任务的“专家”也可以完成相关系统的开发。此外，端到端学习实际上也隐含了一种新的对问题的表示形式\ $\dash$分布式表示。 在这种框架下，模型的输入可以被描述为分布式的实数向量，这样模型可以有更多的维度描述一个事物，同时避免传统符号系统对客观事物离散化的刻画。比如，在自然语言处理中，表示学习重新定义了什么是词，什么是句子。在本章后面的内容中也会看到，表示学习可以让计算机对语言文字的描述更加准确和充分。
+\parinterval 端到端学习将人们从大量的特征提取工作之中解放出来，可以不需要太多人的先验知识。从某种意义上讲，对问题的特征提取完全是自动完成的，这也意味着即使系统开发者不是该任务的“专家”也可以完成相关系统的开发。此外，端到端学习实际上也隐含了一种新的对问题的表示形式\ $\dash$分布式表示。 在这种框架下，模型的输入可以被描述为分布式的实数向量，这样模型可以有更多的维度描述一个事物，同时避免传统符号系统对客观事物离散化的刻画。比如，在自然语言处理中，表示学习重新定义了什么是词，什么是句子。在本章后面的内容中也会看到，表示学习可以让计算机对语言文字的描述更加准确和充分。
 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -196,7 +196,7 @@
 \subsection{线性代数基础} \label{sec:9.2.1}
-\parinterval 线性代数作为一个数学分支，广泛应用于科学和工程中，神经网络的数学描述中也大量使用了线性代数工具。因此，这里对线性代数的一些概念进行简要介绍，以方便后续对神经网络的数学描述。
+\parinterval 线性代数作为一个数学分支，广泛应用于科学和工程中，神经网络的数学描述中也大量使用了线性代数工具。因此，这里对线性代数的一些概念进行简要介绍，以方便后续对神经网络进行数学描述。
 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -740,7 +740,7 @@ x_1\cdot w_1+x_2\cdot w_2+x_3\cdot w_3 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe
 %-------------------------------------------
 \vspace{-0.5em}
-\parinterval 那激活函数又是什么？神经元在接收到经过线性变换的结果后，通过激活函数的处理，得到最终的输出$ \mathbf y $。激活函数的目的是解决实际问题中的非线性变换，线性变换只能拟合直线，而激活函数的加入，使神经网络具有了拟合曲线的能力。 特别是在实际问题中，很多现象都无法用简单的线性关系描述，这时可以使用非线性激活函数来描述更加复杂的问题。常见的非线性函数有Sigmoid、ReLU、Tanh等。如图\ref{fig:9-15}列举了几种激活函数的形式。
+\parinterval 那激活函数又是什么？神经元在接收到经过线性变换的结果后，通过激活函数的处理，得到最终的输出$ \mathbf y $。激活函数的目的是解决实际问题中的非线性变换，线性变换只能拟合直线，而激活函数的加入，使神经网络具有了拟合曲线的能力。 特别是在实际问题中，很多现象都无法用简单的线性关系描述，这时可以使用非线性激活函数来描述更加复杂的问题。常见的非线性激活函数有Sigmoid、ReLU、Tanh等。如图\ref{fig:9-15}列举了几种激活函数的形式。
 %----------------------------------------------
 \begin{figure}[htp]
@@ -1069,7 +1069,7 @@ f(x)=\begin{cases} 0 & x\le 0 \\x & x>0\end{cases}
 \parinterval 有了张量这个工具，可以很容易地实现任意的神经网络。反过来，神经网络都可以被看作是张量的函数。一种经典的神经网络计算模型是：给定输入张量，通过各个神经网络层所对应的张量计算之后，最后得到输出张量。这个过程也被称作{\small\sffamily\bfseries{前向传播}}\index{前向传播}（Forward Propagation\index{Forward Propagation}），它常常被应用在使用神经网络对新的样本进行推断中。
-\parinterval 来看一个具体的例子，如图\ref{fig:9-37}(a)是一个根据天气情况判断穿衣指数（穿衣指数是人们穿衣薄厚的依据）的过程，将当天的天空状况、低空气温、水平气压作为输入，通过一层神经元在输入数据中提取温度、风速两方面的特征，并根据这两方面的特征判断穿衣指数。需要注意的是，在实际的神经网络中，并不能准确地知道神经元究竟可以提取到哪方面的特征，以上表述是为了让读者更好地理解神经网络的建模过程和前向传播过程。这里将上述过程建模为如图\ref{fig:9-37}(b)所示的两层神经网络。
+\parinterval 来看一个具体的例子，如图\ref{fig:9-37}是一个根据天气情况判断穿衣指数（穿衣指数是人们穿衣薄厚的依据）的过程，将当天的天空状况、低空气温、水平气压作为输入，通过一层神经元在输入数据中提取温度、风速两方面的特征，并根据这两方面的特征判断穿衣指数。需要注意的是，在实际的神经网络中，并不能准确地知道神经元究竟可以提取到哪方面的特征，以上表述是为了让读者更好地理解神经网络的建模过程和前向传播过程。这里将上述过程建模为如图\ref{fig:9-37}所示的两层神经网络。
 %----------------------------------------------
 \begin{figure}[htp]