Merge branch 'caorunzhe' into 'master'

Caorunzhe: see merge request !776

6797dee8 · 曹润柘 · 154466b9 · 484276b0 · 6797dee8 · 6797dee8
Commit 6797dee8 authored Jan 04, 2021 by 曹润柘
--- a/Chapter14/chapter14.tex
+++ b/Chapter14/chapter14.tex
@@ -105,7 +105,7 @@

 \parinterval 机器翻译有两种常用的推断方式\ \dash \ 自左向右推断和自右向左推断。自左向右推断符合现实世界中人类的语言使用规律，因为人在翻译一个句子时，总是习惯从句子开始的部分往后生成\footnote{有些语言中，文字是自右向左书写，这时自右向左推断更符合人类使用这种语言的习惯。}。不过，有时候人也会使用当前单词后面的译文信息。也就是说，翻译也需要“未来” 的文字信息。于是很容易想到使用自右向左的方法对译文进行生成。

-\parinterval 以上两种推断方式在神经机器翻译中都有应用，对于源语言句子$\seq{x}=\{x_1,x_2,\dots,x_m\}$和目标语言句子$\seq{y}=\{y_1,y_2,\dots,y_n\}$，自左向右的翻译可以被描述为公式\eqref{eq:14-1}：
+\parinterval 以上两种推断方式在神经机器翻译中都有应用，对于源语言句子$\seq{x}=\{x_1,\dots,x_m\}$和目标语言句子$\seq{y}=\{y_1,\dots,y_n\}$，自左向右的翻译可以被描述为公式\eqref{eq:14-1}：

 \begin{eqnarray}
 \funp{P}(\seq{y}\vert\seq{x}) &=& \prod_{j=1}^n \funp{P}(y_j\vert\seq{y}_{<j},\seq{x})
@@ -118,7 +118,7 @@
 \label{eq:14-2}
 \end{eqnarray}

-\noindent 其中，$\seq{y}_{<j}=\{y_1,y_2,\dots,y_{j-1}\}$，$\seq{y}_{>j}=\{y_{j+1},y_{j+2},\dots,y_n\}$。可以看到，自左向右推断和自右向左推断本质上是一样的。{\chapterten}到{\chaptertwelve}均使用了自左向右的推断方法。自右向左推断比较简单的实现方式是：在训练过程中直接将双语数据中的目标语言句子进行反转，之后仍然使用原始的模型进行训练即可。在推断的时候，生成的目标语言词串也需要进行反转得到最终的译文。有时候，使用自右向左的推断方式会取得更好的效果\upcite{DBLP:conf/wmt/SennrichHB16}。不过更多情况下需要同时使用词串左端（历史）和右端（未来）的信息。有多种思路可以融合左右两端信息：
+\noindent 其中，$\seq{y}_{<j}=\{y_1,\dots,y_{j-1}\}$，$\seq{y}_{>j}=\{y_{j+1},\dots,y_n\}$。可以看到，自左向右推断和自右向左推断本质上是一样的。{\chapterten}到{\chaptertwelve}均使用了自左向右的推断方法。自右向左推断比较简单的实现方式是：在训练过程中直接将双语数据中的目标语言句子进行反转，之后仍然使用原始的模型进行训练即可。在推断的时候，生成的目标语言词串也需要进行反转得到最终的译文。有时候，使用自右向左的推断方式会取得更好的效果\upcite{DBLP:conf/wmt/SennrichHB16}。不过更多情况下需要同时使用词串左端（历史）和右端（未来）的信息。有多种思路可以融合左右两端信息：

 \begin{itemize}
 \vspace{0.5em}

--- a/Chapter16/Figures/figure-comparison-of-structure-between-gpt-and-bert-model.tex
+++ b/Chapter16/Figures/figure-comparison-of-structure-between-gpt-and-bert-model.tex
@@ -103,7 +103,7 @@
 \node [anchor=north] (pos1) at ([xshift=1.5em,yshift=-1.0em]node0-2.south) {\small{(a) GPT模型结构}};
 \node [anchor=north] (pos2) at ([xshift=1.5em,yshift=-1.0em]node0-6.south) {\small{(b) BERT模型结构}};

-\node [anchor=south] (ex) at ([xshift=2.1em,yshift=0.5em]node3-1.north) {\small{TRM：transformer}};
+\node [anchor=south] (ex) at ([xshift=2.1em,yshift=0.5em]node3-1.north) {\small{TRM：Transformer}};




--- a/Chapter16/Figures/figure-example-of-iterative-back-translation.tex
+++ b/Chapter16/Figures/figure-example-of-iterative-back-translation.tex
@@ -60,7 +60,7 @@
 \node [anchor=west,fill=red!20,minimum width=1.5em](d2-1) at ([xshift=-0.0em]d2.east){};
 \node [anchor=west,fill=yellow!20,minimum width=1.5em](d3-1) at ([xshift=-0.0em]d3.east){};
 \node [anchor=north] (d4) at ([xshift=1em]d1.south) {\small{训练：}};
-\node [anchor=north] (d5) at ([xshift=0.5em]d2.south) {\small{推理：}};
+\node [anchor=north] (d5) at ([xshift=0.5em]d2.south) {\small{推断：}};
 \draw [->,thick] ([xshift=0em]d4.east)--([xshift=1.5em]d4.east);
 \draw [->,thick,dashed] ([xshift=0em]d5.east)--([xshift=1.5em]d5.east);


--- a/Chapter16/Figures/figure-examples-of-comparable-corpora.tex
+++ b/Chapter16/Figures/figure-examples-of-comparable-corpora.tex
 \begin{tikzpicture}
 \begin{scope}
-\node [anchor=center] (node1) at (0,0) {\textbf{Machine translation}, sometiomes referred to by the abbreviation \textbf{MT} (not to be };
-\node [anchor=north] (node2) at (node1.south) {confused with computer-aided translation,,machine-aided human translation inter};
+\node [anchor=center] (node1) at (0,0) {\textbf{Machine Translation}, sometimes referred to by the abbreviation \textbf{MT} (not to be };
+\node [anchor=north] (node2) at (node1.south) {confused with computer-aided translation,machine-aided human translation inter};
 \node [anchor=north] (node3) at (node2.south) {-active translation), is a subfield of computational linguistics that investigates the};
 \node [anchor=north] (node4) at ([xshift=-1.8em]node3.south) {use of software to translate text or speech from one language to another.};
 \node [anchor=south] (node5) at ([xshift=-12.8em,yshift=0.5em]node1.north) {\Large{WIKIPEDIA}};

--- a/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
+++ b/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
@@ -12,8 +12,8 @@
 \node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
 \node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!30,line width=0.6pt] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};

-\node[anchor=north,font=\scriptsize,fill=yellow!20] (w1) at ([yshift=-1.6em]decoder1.south){知识 \ 就是 \ 力量 \ 。 \ <EOS>};
-\node[anchor=north,font=\scriptsize,fill=green!20] (w3) at ([yshift=-1.6em]decoder3.south){Wissen  \ ist \ Machit \ . \ <EOS>};
+\node[anchor=north,font=\scriptsize,fill=yellow!20] (w1) at ([yshift=-1.6em]decoder1.south){知识 \ 就是 \ 力量 \ 。 \ <eos>};
+\node[anchor=north,font=\scriptsize,fill=green!20] (w3) at ([yshift=-1.6em]decoder3.south){Wissen  \ ist \ Machit \ . \ <eos>};
 \node[anchor=south,font=\scriptsize,fill=orange!20] (w2) at ([yshift=1.6em]encoder1.north){Knowledge \ is \ power \ . };
 \node[anchor=south,font=\scriptsize,fill=orange!20] (w4) at ([yshift=1.6em]encoder3.north){Knowledge \ is \ power \ . };


--- a/Chapter16/chapter16.tex
+++ b/Chapter16/chapter16.tex
--- a/Chapter17/Figures/figure-cache.tex
+++ b/Chapter17/Figures/figure-cache.tex
@@ -18,7 +18,7 @@
 \node[anchor=south,font=\footnotesize,inner sep=0pt] (cache)at ([yshift=2em,xshift=1.5em]key.north){\small\bfnew{Cache}};

 \node[draw,anchor=east,minimum size=1.8em,fill=orange!15] (dt) at ([yshift=2.1em,xshift=-4em]key.west){${\mathbi{d}}_{t}$};
-\node[anchor=north,font=\footnotesize] (readlab) at ([xshift=2.8em,yshift=0.3em]dt.north){\red{reading}};
+\node[anchor=north,font=\footnotesize] (readlab) at ([xshift=2.8em,yshift=0.3em]dt.north){\red{读取}};
 \node[draw,anchor=east,minimum size=1.8em,fill=ugreen!15] (st) at ([xshift=-3.7em]dt.west){${\mathbi{s}}_{t}$};
 \node[draw,anchor=east,minimum size=1.8em,fill=red!15] (st2) at ([xshift=-0.85em,yshift=3.5em]dt.west){$ \widetilde{\mathbi{s}}_{t}$};

@@ -27,10 +27,10 @@
 \draw[-,thick] (add.0) -- (add.180);
 \draw[-,thick] (add.90) -- (add.-90);

-\node[anchor=north,inner sep=0pt,font=\footnotesize,text=red] at ([xshift=-0.08em,yshift=-1em]add.south){combining};
+\node[anchor=north,inner sep=0pt,font=\footnotesize,text=red] at ([xshift=-0em,yshift=-0.5em]add.south){融合};

 \node[draw,anchor=east,minimum size=1.8em,fill=yellow!15] (ct) at ([xshift=-2em,yshift=-3.5em]st.west){$ {\mathbi{C}}_{t}$};
-\node[anchor=north,font=\footnotesize] (matchlab) at ([xshift=6.7em,yshift=-0.1em]ct.north){\red{mathching}};
+\node[anchor=north,font=\footnotesize] (matchlab) at ([xshift=6.7em,yshift=-0.1em]ct.north){\red{匹配}};

 \node[anchor=east] (y) at ([xshift=-6em,yshift=1em]st.west){$\mathbi{y}_{t-1}$};


--- a/Chapter17/chapter17.tex
+++ b/Chapter17/chapter17.tex
--- a/bibliography.bib
+++ b/bibliography.bib