chapter7 fig

Commit a3a7c1da authored May 09, 2020 by zengxin
--- a/Book/Chapter7/Chapter7.tex
+++ b/Book/Chapter7/Chapter7.tex
@@ -270,7 +270,7 @@

 \subsubsection{Large Vocabularies and the OOV Problem}

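-\parinterval Let us first take a closer look at the large-vocabulary problem in neural machine translation. Both training and decoding of a neural machine translation model depend on the source-language and target-language vocabularies. During modeling, every word in the vocabulary is converted into a distributed (vector) representation, i.e., a word embedding. These vectors serve as the input to the model (see Chapter 6). If every word form has its own vector, then each inflected variant of a word (tense, voice, and so on) increases the size of the vocabulary and the number of corresponding vectors.
+\parinterval Let us first take a closer look at the large-vocabulary problem in neural machine translation. Both training and decoding of a neural machine translation model depend on the source-language and target-language vocabularies. During modeling, every word in the vocabulary is converted into a distributed (vector) representation, i.e., a word embedding. These vectors serve as the input to the model (see Chapter 6). If every word form has its own vector, then each inflected variant of a word (tense, voice, and so on) increases the size of the vocabulary and the number of corresponding vectors. Figure \ref{fig:7-7} shows tense and voice variations of some English words.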

 %----------------------------------------------
 \begin{figure}[htp]
@@ -1180,9 +1180,7 @@ b &=& \omega_{\textrm{high}}\cdot |\mathbf{x}|
 \label{eq:7-15}
 \end{eqnarray}

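-\noindent where $\gamma_{k}$ denotes the weight of the $k$-th system and satisfies $\sum_{k=1}^{K} \gamma_{k} = 1$. Equation \ref{eq:7-15} is a linear model. The weights $\{ \gamma_{k}\}$ can be tuned automatically on a development set, for example by using minimum error rate training to obtain the optimal weights (see Chapter 4). In practice, however, if the $K$ models are all derived from a single base model, the weights $\{ \gamma_{k}\}$ have little effect on the final result. The weights are therefore sometimes simply set to $\gamma_{k} = \frac{1}{K}$.
-
-\parinterval Equation \ref{eq:7-15} is a typical linear interpolation model; models of this kind have been applied successfully in tasks such as language modeling. From the perspective of statistical learning, interpolating multiple models can effectively reduce the empirical error rate. However, model ensembling rests on one assumption: the models must be complementary to some degree. This complementarity is sometimes also reflected in the upper bound of the predictions of the multiple models, called the Oracle. For example, one can take the output with the highest BLEU among the $K$ model outputs as the Oracle, or select from each prediction the target words that maximize BLEU and take the sentence built from them as the Oracle. Of course, a higher Oracle does not guarantee a better ensemble result, because the Oracle is the result under the most ideal conditions, and actual predictions often differ greatly from it. How to use the Oracle for model optimization is also a question that many researchers are exploring.
+\noindent where $\gamma_{k}$ denotes the weight of the $k$-th system and satisfies $\sum_{k=1}^{K} \gamma_{k} = 1$. Equation \ref{eq:7-15} is a linear model. The weights $\{ \gamma_{k}\}$ can be tuned automatically on a development set, for example by using minimum error rate training to obtain the optimal weights (see Chapter 4). In practice, however, if the $K$ models are all derived from a single base model, the weights $\{ \gamma_{k}\}$ have little effect on the final result. The weights are therefore sometimes simply set to $\gamma_{k} = \frac{1}{K}$. Figure \ref{fig:7-25} shows the ensembling of the predictions of three models.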

 %----------------------------------------------
 \begin{figure}[htp]
@@ -1193,6 +1191,8 @@ b &=& \omega_{\textrm{high}}\cdot |\mathbf{x}|
 \end{figure}
 %----------------------------------------------

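+\parinterval Equation \ref{eq:7-15} is a typical linear interpolation model; models of this kind have been applied successfully in tasks such as language modeling. From the perspective of statistical learning, interpolating multiple models can effectively reduce the empirical error rate. However, model ensembling rests on one assumption: the models must be complementary to some degree. This complementarity is sometimes also reflected in the upper bound of the predictions of the multiple models, called the Oracle. For example, one can take the output with the highest BLEU among the $K$ model outputs as the Oracle, or select from each prediction the target words that maximize BLEU and take the sentence built from them as the Oracle. Of course, a higher Oracle does not guarantee a better ensemble result, because the Oracle is the result under the most ideal conditions, and actual predictions often differ greatly from it. How to use the Oracle for model optimization is also a question that many researchers are exploring.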
+
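As a worked illustration of the interpolation discussed in this hunk, the sketch below assumes that Equation 7-15 combines the per-model word probabilities linearly. The body of the equation is not visible in this diff, so the first line is an assumed form, and $\textrm{P}_{k}$ is hypothetical notation for the prediction of the $k$-th model. It uses $K = 3$ and the equal weights $\gamma_{k} = \frac{1}{K}$; the probabilities 0.6, 0.3 and 0.9 are made-up values chosen only for the arithmetic.

```latex
% Assumed form of the interpolation; P_k is hypothetical notation for model k.
% The numbers 0.6, 0.3 and 0.9 are illustrative, not taken from the book.
\begin{eqnarray}
\textrm{P}(y_j | \mathbf{y}_{<j}, \mathbf{x}) &=& \sum_{k=1}^{K} \gamma_{k} \cdot \textrm{P}_{k}(y_j | \mathbf{y}_{<j}, \mathbf{x}) \nonumber \\
&=& \frac{1}{3} \cdot 0.6 + \frac{1}{3} \cdot 0.3 + \frac{1}{3} \cdot 0.9 \nonumber \\
&=& 0.6 \nonumber
\end{eqnarray}
```

With equal weights the ensemble score is simply the mean of the three model scores, so a single confident model (0.9 here) can offset a weak one (0.3) only if the models disagree, which is the complementarity assumption in concrete form.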
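 \parinterval In addition, how the models used for ensembling are built is also very important, and this part of the work can even become the hardest part of model ensembling. Most of the time there is no fixed recipe for generating the models, and system developers largely rely on their own ingenuity. Some commonly used methods are: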

 \begin{itemize}

--- a/Book/Chapter7/Figures/figure-batch-generation-method.tex
+++ b/Book/Chapter7/Figures/figure-batch-generation-method.tex

 \begin{tikzpicture}
-	\tikzstyle{node} = [minimum height=1.0em,draw=teal,fill=teal!10]
-	\tikzstyle{legend} = [minimum height=1.0em,minimum width=1.0em,draw]
-	\tikzstyle{node2} = [minimum width=1.0em,minimum height=4.1em,draw=blue,fill=blue!10]
-	\node[node,minimum width=2.8em] (node1) at (0,0) {};
-	\node[node,minimum width=4.0em,anchor=north west] (node2) at (node1.south west) {};
-	\node[node,minimum width=3.2em,anchor=north west] (node3) at (node2.south west) {};
-	\node[node,minimum width=3.0em,anchor=north west] (node4) at (node3.south west) {};
+	\tikzstyle{node} = [minimum height=1.0*1.2em,draw=teal,fill=teal!10]
+	\tikzstyle{legend} = [minimum height=1.0*1.2em,minimum width=1.0*1.2em,draw]
+	\tikzstyle{node2} = [minimum width=1.0*1.2em,minimum height=4.1*1.2em,draw=blue,fill=blue!10]
+	\node[node,minimum width=2.8*1.2em] (node1) at (0,0) {};
+	\node[node,minimum width=4.0*1.2em,anchor=north west] (node2) at (node1.south west) {};
+	\node[node,minimum width=3.2*1.2em,anchor=north west] (node3) at (node2.south west) {};
+	\node[node,minimum width=3.0*1.2em,anchor=north west] (node4) at (node3.south west) {};
 	\node[node2,anchor = north west] (grad1) at ([xshift=1.2em]node1.north east) {};
-	\node[node,minimum width=3.7em,anchor=north west] (node5) at (grad1.north east) {};
-	\node[node,minimum width=2.8em,anchor=north west] (node6) at (node5.south west) {};
-	\node[node,minimum width=3.2em,anchor=north west] (node7) at (node6.south west) {};
-	\node[node,minimum width=4.0em,anchor=north west] (node8) at (node7.south west) {};
-	\node[font=\scriptsize,anchor=east] (line1) at (node1.west) {gpu1};
-	\node[font=\scriptsize,anchor=east] (line2) at (node2.west) {gpu2};
-	\node[font=\scriptsize,anchor=east] (line3) at (node3.west) {gpu3};
-	\node[font=\scriptsize,anchor=east] (line4) at (node4.west) {gpu4};
+	\node[node,minimum width=3.7*1.2em,anchor=north west] (node5) at (grad1.north east) {};
+	\node[node,minimum width=2.8*1.2em,anchor=north west] (node6) at (node5.south west) {};
+	\node[node,minimum width=3.2*1.2em,anchor=north west] (node7) at (node6.south west) {};
+	\node[node,minimum width=4.0*1.2em,anchor=north west] (node8) at (node7.south west) {};
+	\node[font=\footnotesize,anchor=east] (line1) at (node1.west) {gpu1};
+	\node[font=\footnotesize,anchor=east] (line2) at (node2.west) {gpu2};
+	\node[font=\footnotesize,anchor=east] (line3) at (node3.west) {gpu3};
+	\node[font=\footnotesize,anchor=east] (line4) at (node4.west) {gpu4};
 	\node[node2,anchor = north west] (grad2) at ([xshift=0.3em]node5.north east) {};
-	\draw[->] (-1.4em,-3.62em) -- (9.5em,-3.62em);
+	\draw[->] (-1.4em*1.2,-3.62*1.2em) -- (9em*1.2,-3.62*1.2em);

-	\node[node,minimum width=2.8em] (node9) at (15em,0) {};
-	\node[node,minimum width=4.0em,anchor=north west] (node10) at (node9.south west) {};
-	\node[node,minimum width=3.2em,anchor=north west] (node11) at (node10.south west) {};
-	\node[node,minimum width=3.0em,anchor=north west] (node12) at (node11.south west) {};
+	\node[node,minimum width=2.8*1.2em] (node9) at (16em,0) {};
+	\node[node,minimum width=4.0*1.2em,anchor=north west] (node10) at (node9.south west) {};
+	\node[node,minimum width=3.2*1.2em,anchor=north west] (node11) at (node10.south west) {};
+	\node[node,minimum width=3.0*1.2em,anchor=north west] (node12) at (node11.south west) {};

-	\node[node,minimum width=3.7em,anchor=north west] (node13) at (node9.north east) {};
-	\node[node,minimum width=2.8em,anchor=north west] (node14) at (node10.north east) {};
-	\node[node,minimum width=3.2em,anchor=north west] (node15) at (node11.north east) {};
-	\node[node,minimum width=4.0em,anchor=north west] (node16) at (node12.north east) {};
+	\node[node,minimum width=3.7*1.2em,anchor=north west] (node13) at (node9.north east) {};
+	\node[node,minimum width=2.8*1.2em,anchor=north west] (node14) at (node10.north east) {};
+	\node[node,minimum width=3.2*1.2em,anchor=north west] (node15) at (node11.north east) {};
+	\node[node,minimum width=4.0*1.2em,anchor=north west] (node16) at (node12.north east) {};
 	\node[node2,anchor = north west] (grad3) at ([xshift=0.5em]node13.north east) {};
-	\node[font=\scriptsize,anchor=east] (line1) at (node9.west) {gpu1};
-	\node[font=\scriptsize,anchor=east] (line2) at (node10.west) {gpu2};
-	\node[font=\scriptsize,anchor=east] (line3) at (node11.west) {gpu3};
-	\node[font=\scriptsize,anchor=east] (line4) at (node12.west) {gpu4};
-	\draw[->] (13.6em,-3.62em) -- (22.2em,-3.62em);
+	\node[font=\footnotesize,anchor=east] (line1) at (node9.west) {gpu1};
+	\node[font=\footnotesize,anchor=east] (line2) at (node10.west) {gpu2};
+	\node[font=\footnotesize,anchor=east] (line3) at (node11.west) {gpu3};
+	\node[font=\footnotesize,anchor=east] (line4) at (node12.west) {gpu4};
+	\draw[->] (13.6*1.2em,-3.62*1.2em) -- (20.5*1.2em,-3.62*1.2em);
 	\begin{pgfonlayer}{background}
 	\node [rectangle,inner sep=-0.0em,draw] [fit = (node1) (node2) (node3) (node4)] (box1) {};
 	\node [rectangle,inner sep=-0.0em,draw] [fit = (node5) (node6) (node7) (node8)] (box2) {};
 	\node [rectangle,inner sep=-0.0em,draw] [fit = (node9) (node13) (node12) (node16)] (box2) {};
 	\end{pgfonlayer}
-	\node[font=\scriptsize,anchor=north] (legend1) at ([xshift=3em]node4.south) {update every step};
-	\node[font=\scriptsize,anchor=north] (legend2) at ([xshift=2.5em]node12.south) {update after accumulating two steps};
-	\node[font=\scriptsize,anchor=north] (time1) at (grad2.south) {time};
-	\node[font=\scriptsize,anchor=north] (time1) at (grad3.south) {time};
+	\node[font=\footnotesize,anchor=north] (legend1) at ([xshift=3em]node4.south) {update every step};
+	\node[font=\footnotesize,anchor=north] (legend2) at ([xshift=2.5em]node12.south) {update after accumulating two steps};
+	\node[font=\footnotesize,anchor=north] (time1) at (grad2.south) {time};
+	\node[font=\footnotesize,anchor=north] (time1) at (grad3.south) {time};

 	\node[legend] (legend3) at (2em,2em) {};
-	\node[font=\scriptsize,anchor=west] (idle) at (legend3.east) {:idle};
+	\node[font=\footnotesize,anchor=west] (idle) at (legend3.east) {:idle};
 	\node[legend,anchor=west,draw=teal,fill=teal!10] (legend4) at ([xshift = 2em]idle.east) {};
-	\node[font=\scriptsize,anchor=west] (FB) at (legend4.east) {:forward/backward};
+	\node[font=\footnotesize,anchor=west] (FB) at (legend4.east) {:forward/backward};
 	\node[legend,anchor=west,draw=blue,fill=blue!10] (legend5) at ([xshift = 2em]FB.east) {};
-	\node[font=\scriptsize,anchor=west] (grad_sync) at (legend5.east) {:gradient update};
+	\node[font=\footnotesize,anchor=west] (grad_sync) at (legend5.east) {:gradient update};

 \end{tikzpicture}
\ No newline at end of file
--- a/Book/Chapter7/Figures/figure-randomly-generation-vs-generate-by-sentence-length.tex
+++ b/Book/Chapter7/Figures/figure-randomly-generation-vs-generate-by-sentence-length.tex

 \begin{tikzpicture}
-	\tikzstyle{node} = [minimum height=1.0em,draw=teal,fill=teal!10]
-	\node[node,minimum width=2.0em] (sent1) at (0,0) {};
-	\node[node,minimum width=5.0em,anchor=north west] (sent2) at (sent1.south west) {};
-	\node[node,minimum width=1.0em,anchor=north west] (sent3) at (sent2.south west) {};
-	\node[node,minimum width=3.0em,anchor=north west] (sent4) at (sent3.south west) {};
+	\tikzstyle{node} = [minimum height=1.0*1.2em,draw=teal,fill=teal!10]
+	\node[node,minimum width=2.0*1.2em] (sent1) at (0,0) {};
+	\node[node,minimum width=5.0*1.2em,anchor=north west] (sent2) at (sent1.south west) {};
+	\node[node,minimum width=1.0*1.2em,anchor=north west] (sent3) at (sent2.south west) {};
+	\node[node,minimum width=3.0*1.2em,anchor=north west] (sent4) at (sent3.south west) {};

-	\node[node,minimum width=4.0em] (sent5) at (12em,0) {};
-	\node[node,minimum width=4.5em,anchor=north west] (sent6) at (sent5.south west) {};
-	\node[node,minimum width=4.5em,anchor=north west] (sent7) at (sent6.south west) {};
-	\node[node,minimum width=5em,anchor=north west] (sent8) at (sent7.south west) {};
+	\node[node,minimum width=4.0*1.2em] (sent5) at (14em,0) {};
+	\node[node,minimum width=4.5*1.2em,anchor=north west] (sent6) at (sent5.south west) {};
+	\node[node,minimum width=4.5*1.2em,anchor=north west] (sent7) at (sent6.south west) {};
+	\node[node,minimum width=5*1.2em,anchor=north west] (sent8) at (sent7.south west) {};

-	\node[font=\scriptsize,anchor=east] (line1) at (sent1.west) {sent1};
-	\node[font=\scriptsize,anchor=east] (line2) at (sent2.west) {sent2};
-	\node[font=\scriptsize,anchor=east] (line3) at (sent3.west) {sent3};
-	\node[font=\scriptsize,anchor=east] (line4) at (sent4.west) {sent4};
+	\node[font=\footnotesize,anchor=east] (line1) at (sent1.west) {sent1};
+	\node[font=\footnotesize,anchor=east] (line2) at (sent2.west) {sent2};
+	\node[font=\footnotesize,anchor=east] (line3) at (sent3.west) {sent3};
+	\node[font=\footnotesize,anchor=east] (line4) at (sent4.west) {sent4};

-	\node[font=\scriptsize,anchor=east] (line5) at (sent5.west) {sent1};
-	\node[font=\scriptsize,anchor=east] (line6) at (sent6.west) {sent2};
-	\node[font=\scriptsize,anchor=east] (line7) at (sent7.west) {sent3};
-	\node[font=\scriptsize,anchor=east] (line8) at (sent8.west) {sent4};
+	\node[font=\footnotesize,anchor=east] (line5) at (sent5.west) {sent1};
+	\node[font=\footnotesize,anchor=east] (line6) at (sent6.west) {sent2};
+	\node[font=\footnotesize,anchor=east] (line7) at (sent7.west) {sent3};
+	\node[font=\footnotesize,anchor=east] (line8) at (sent8.west) {sent4};
 	\begin{pgfonlayer}{background}
 	\node [rectangle,inner sep=-0.0em,draw] [fit = (sent1) (sent2) (sent3) (sent4)] (box1) {};
 	\node [rectangle,inner sep=-0.0em,draw] [fit = (sent5) (sent6) (sent7) (sent8)] (box2) {};
 	\end{pgfonlayer}
-	\node[font=\scriptsize] (node1) at ([yshift=-3em]sent2.south) {random generation};
-	\node[font=\scriptsize] (node2) at ([yshift=-1em]sent8.south) {sorted generation};
+	\node[font=\footnotesize] (node1) at ([yshift=-3.4em]sent2.south) {random generation};
+	\node[font=\footnotesize] (node2) at ([yshift=-1em]sent8.south) {sorted generation};

 \end{tikzpicture}
\ No newline at end of file
--- a/Book/Chapter7/Figures/figure-word-change.tex
+++ b/Book/Chapter7/Figures/figure-word-change.tex

 \begin{center}
-	\centerline{Taking English as an example:}
-	\vspace{0.5em}
 	\begin{tikzpicture}
 		\node[rounded corners=3pt,minimum width=10.0em,minimum height=2.0em,draw,thick,fill=green!5,font=\scriptsize,drop shadow,inner sep=0.5em] (left) at (0,0) {
 		\begin{tabular}{c}