Merge branch 'caorunzhe' into 'zengxin'

Caorunzhe See merge request !553

Commit f74c2603 authored Dec 07, 2020 by zengxin
--- a/Chapter10/chapter10.tex
+++ b/Chapter10/chapter10.tex
@@ -376,7 +376,7 @@ NMT                     & 21.7          & 18.7           & -13.7      \\
 %    NEW SECTION   10.3
 %----------------------------------------------------------------------------------------
 \sectionnewpage
-\section{基于循环神经网络的模型}
+\section{基于循环神经网络的翻译建模}

 \parinterval 早期神经机器翻译的进展主要来自两个方面：1）使用循环神经网络对单词序列进行建模；2）注意力机制的使用。表\ref{tab:10-6}列出了2013-2015年间有代表性的部分研究工作。从这些工作的内容上看，当时的研究重点还是如何有效地使用循环神经网络进行翻译建模以及使用注意力机制捕捉双语单词序列间的对应关系。


--- a/Chapter11/chapter11.tex
+++ b/Chapter11/chapter11.tex
@@ -231,7 +231,7 @@
 %    NEW SECTION
 %----------------------------------------------------------------------------------------

-\section{基于卷积神经网络的模型}
+\section{基于卷积神经网络的翻译建模}

 \parinterval 正如之前所讲，卷积神经网络可以用于序列建模，同时具有并行性高和易于学习的特点，一个很自然的想法就是将其用作神经机器翻译模型中的特征提取器。因此，在神经机器翻译被提出之初，研究人员就已经开始利用卷积神经网络对句子进行特征提取。比较经典的模型是使用卷积神经网络作为源语言句子的编码器，使用循环神经网络作为目标语译文生成的解码器\upcite{kalchbrenner-blunsom-2013-recurrent,Gehring2017ACE}。之后也有研究人员提出完全基于卷积神经网络的翻译模型（ConvS2S）\upcite{DBLP:journals/corr/GehringAGYD17}，或者针对卷积层进行改进，提出效率更高、性能更好的模型\upcite{Kaiser2018DepthwiseSC,Wu2019PayLA}。本节将基于ConvS2S模型，阐述如何使用卷积神经网络搭建端到端神经机器翻译模型。


--- a/Chapter16/Figures/figure-application-process-of-back-translation.tex
+++ b/Chapter16/Figures/figure-application-process-of-back-translation.tex
@@ -41,7 +41,7 @@

 \node [anchor=west,fill=red!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-1) at ([xshift=2.0em,yshift=1.6em]node3-2.east){\scriptsize{英语}};
 \node [anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-2) at (node4-1.south){\scriptsize{英语}};
-\node [anchor=west,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-3) at (node4-1.east){\scriptsize{汉语}};
+\node [anchor=west,fill=yellow!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-3) at (node4-1.east){\scriptsize{汉语}};
 \node [anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-4) at (node4-3.south){\scriptsize{汉语}};



--- a/Chapter16/Figures/figure-bilingual-dictionary-Induction.png
+++ b/Chapter16/Figures/figure-bilingual-dictionary-Induction.png
--- a/Chapter16/Figures/figure-bilingual-dictionary-Induction.tex
+++ b/Chapter16/Figures/figure-bilingual-dictionary-Induction.tex
+\begin{tikzpicture}
+
+%%%%%%%%词典推断------------------------------------------------------------
+\begin{scope}
+\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
+
+\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
+
+\draw [-,thick] (-0.7,1.0)--(-0.7,-1.0);
+
+\node [anchor=center](c1) at (-0.1,0){\tiny{$\mathbi{Y}$}};
+\node [anchor=center](c2) at (-0.3,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
+\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}}; 
+\draw [-,red](0.65,-0.65)--(0.60,-0.62)--(0.66,-0.58)--(0.6,-0.55)--(0.63,-0.52)--(0.6,-0.5);
+\draw [-,red](1.65,-0.65)--(1.60,-0.68)--(1.64,-0.72)--(1.56,-0.72)--(1.60,-0.76)--(1.55,-0.8);
+\draw [-,red](1.5,0.1)--(1.53,0.08)--(1.49,0.04)--(1.58,0.03)--(1.54,-0.01)--(1.6,-0.05);
+\end{scope}
+
+%%%%%%%%X映射到Y空间------------------------------------------------------------
+\begin{scope}[xshift=-8.0em]
+\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
+
+\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
+
+\draw [-,thick] (-0.7,1.0)--(-0.7,-1.0);
+
+\node [anchor=center](c1) at (-0.1,0){\tiny{$\mathbi{Y}$}};
+\node [anchor=center](c2) at (-0.3,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
+\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}}; 
+\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}}; 
+%%%%%%一堆红色的球
+\node [anchor=center,red!70](cr4) at (0.15,-0.6){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr5) at (0.3,-0.6){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr6) at (0.5,-0.55){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr7) at (0.35,-0.4){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr8) at (0.4,-0.7){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr9) at (0.55,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr10) at (0.9,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr11) at (0.9,-0.5){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr12) at (1.4,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr13) at (1.45,-0.3){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr14) at (1.35,0.3){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr15) at (1.2,0.4){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr16) at (1.6,0.45){\Large{$\cdot$}};
+%%%%%%一堆蓝色的球
+\node [anchor=center,ublue](cb4) at (0.1,-0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb5) at (0.3,-0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb6) at (0.5,-0.25){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb7) at (0.4,-0.1){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb8) at (0.35,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb9) at (0.45,-0.6){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb10) at (0.85,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb11) at (1.45,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb12) at (1.3,-0.85){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb15) at (1.6,0.2){\Large{$\cdot$}};
+\end{scope}
+
+%%%%%%%%X、Y词嵌入空间------------------------------------------------------------
+\begin{scope}[xshift=-16em]
+\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
+
+\node [anchor=center](x1) at (-1.45,0.2){\tiny{$\mathbi{X}$}};
+\node [anchor=center](y1) at (1.1,0.1){\tiny{$\mathbi{Y}$}};
+
+\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
+\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
+\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}}; 
+%%%%%%一堆蓝色的球
+\node [anchor=center,ublue](cb4) at (0.1,-0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb5) at (0.3,-0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb6) at (0.5,-0.25){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb7) at (0.4,-0.1){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb8) at (0.35,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb9) at (0.45,-0.6){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb10) at (0.85,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb11) at (1.45,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb12) at (1.3,-0.85){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb15) at (1.6,0.2){\Large{$\cdot$}};
+
+\node [anchor=center](rw1) at (-0.5,0.45){\tiny{cat}}; 
+\node [anchor=center](rw2) at (0.05,0.4){\tiny{feline}}; 
+\node [anchor=center](rw3) at (-1.17,-0.07){\tiny{car}}; 
+\node [anchor=center](rw4) at (-0.7,-0.65){\tiny{deep}};
+\node [anchor=center](bw1) at (0.2,-0.1){\tiny{felin}};
+\node [anchor=center](bw2) at (0.75,-0.65){\tiny{katze}};
+\node [anchor=center](bw3) at (1.55,-0.65){\tiny{auto}};
+\node [anchor=center](bw4) at (1.6,-0.2){\tiny{tief}};
+\node [anchor=center](de1) at (0.3,-1.5) {\small{(a) $\mathbi{X}$、$\mathbi{Y}$词嵌入空间}};
+\node [anchor=center](de2) at (3.9,-1.5) {\small{(b) $\mathbi{X}$映射到$\mathbi{Y}$空间}};
+\node [anchor=center](de3) at (7,-1.5) {\small{(c) 词典推断}};
+\node [anchor=center](de4) at (10.1,-1.5) {\small{(d) 微调结果}};
+
+\end{scope}
+
+\begin{scope}[xshift=-14.5em,yshift=0.8em,rotate=-150]
+\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
+
+\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}}; 
+\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}}; 
+%%%%%%一堆红色的球
+\node [anchor=center,red!70](cr4) at (0.15,-0.6){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr5) at (0.3,-0.6){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr6) at (0.5,-0.55){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr7) at (0.35,-0.4){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr8) at (0.4,-0.7){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr9) at (0.55,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr10) at (0.9,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr11) at (0.9,-0.5){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr12) at (1.4,-0.8){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr13) at (1.45,-0.3){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr14) at (1.35,0.3){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr15) at (1.2,0.4){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr16) at (1.6,0.45){\Large{$\cdot$}};
+\end{scope}
+
+%%%%%%%%%%%微调结果------------------------------------------------------------
+\begin{scope}[xshift=8.2em]
+\draw [-,red!70,line width=0.5pt] (0,0.4688)..controls (0.3,0.45) and (0.5,0.2)..(0.7,-0.25)..controls (0.8,-0.45) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.42) and (1.3,-0.12)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.13,0.4) and (1.18,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.03,0.0) and (2.08,-0.1)..(2.07,-0.5)..controls (2.04,-1.1) and (1.5,-1.16)..(0.6,-0.91)..controls (0.05,-0.71) and (-0.2,-0.53)..(-0.25,-0.45)..controls (-0.55,0.0) and (-0.5,0.501)..(0,0.4688);
+
+\draw [-,ublue,line width=0.5pt] (0,0.5)..controls (0.3,0.5) and (0.5,0.2)..(0.7,-0.25)..controls (0.8,-0.45) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.40) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.91)..controls (0.0,-0.75) and (-0.2,-0.53)..(-0.25,-0.45)..controls (-0.5,0.0) and (-0.5,0.501)..(0,0.5);
+
+\draw [-,thick] (-0.8,1.0)--(-0.8,-1.0);
+
+\node [anchor=center](c1) at (0.1,0.6){\tiny{$\mathbi{Y}$}};
+\node [anchor=center](c2) at (-0.45,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
+
+\node [anchor=center,red!70](cr1) at (0.2,-0.35){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr2) at (1.58,-0.78){\scriptsize{$\bullet$}};
+\node [anchor=center,red!70](cr3) at (1.6,0){\scriptsize{$\bullet$}}; 
+
+\node [anchor=center,ublue](cb1) at (0.2,-0.3){\scriptsize{$\bullet$}};
+\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
+\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}}; 
+%%%%%%一堆红色的球
+\node [anchor=center,red!70](cr4) at (-0.35,0.16){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr5) at (-0.03,0.37){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr6) at (-0.03,0.12){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr7) at (0.37,0.02){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr8) at (-0.18,-0.18){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr9) at (0.65,-0.43){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr10) at (0.32,-0.68){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr11) at (0.82,-0.73){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr12) at (1.23,-0.85){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr13) at (1.8,-0.47){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr14) at (1.75,0.23){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr15) at (1.38,-0.44){\Large{$\cdot$}};
+\node [anchor=center,red!70](cr16) at (1.42,0.26){\Large{$\cdot$}};
+%%%%%%一堆蓝色的球
+\node [anchor=center,ublue](cb4) at (-0.35,0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb5) at (0,0.4){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb6) at (0,0.15){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb7) at (0.4,0.05){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb8) at (-0.15,-0.15){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb9) at (0.65,-0.4){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb10) at (0.3,-0.65){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb11) at (0.8,-0.7){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb12) at (1.2,-0.85){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb15) at (1.4,-0.45){\Large{$\cdot$}};
+\node [anchor=center,ublue](cb16) at (1.45,0.3){\Large{$\cdot$}};
+\node [anchor=center](rw1) at (0.22,-0.45){\tiny{cat}};
+\node [anchor=center](rw2) at (0.20,-0.15){\tiny{katze}};
+\end{scope}
+
+\end{tikzpicture}
\ No newline at end of file
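Review note on the new figure: panels (b) and (c) show the source embeddings being mapped into the target space as W·X and a dictionary then being read off from nearby points. The sketch below is one minimal way to illustrate that step. It assumes a nearest-neighbour rule under cosine similarity, which the figure itself does not specify. The 2-D vectors, the matrix `W`, and the function names are hypothetical toy values chosen for illustration; only the word labels are taken from the figure. The fine-tuning step of panel (d) is not covered.

```python
import math

def matvec(W, x):
    """Apply the linear map W (list of rows) to the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def induce_dictionary(W, src_emb, tgt_emb):
    """Map each source word with W and pick its nearest target word."""
    dictionary = {}
    for src_word, x in src_emb.items():
        mapped = matvec(W, x)
        dictionary[src_word] = max(
            tgt_emb, key=lambda tgt_word: cosine(mapped, tgt_emb[tgt_word])
        )
    return dictionary

# Hypothetical toy embeddings: the source space is roughly a rotated
# copy of the target space, as suggested by panel (a).
X = {"cat": [0.1, 1.0], "car": [-1.0, 0.1], "deep": [0.0, -1.0]}
Y = {"katze": [1.0, 0.0], "auto": [0.0, 1.0], "tief": [-1.0, 0.0]}

# Hypothetical mapping: a rotation by -90 degrees.
W = [[0.0, 1.0], [-1.0, 0.0]]

induced = induce_dictionary(W, X, Y)
```

In practice `W` would be learned rather than fixed by hand, and the induced pairs would feed the refinement shown in panel (d).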
--- a/Chapter16/Figures/figure-contrast-of-traditional-machine-learning&transfer-learning.tex
+++ b/Chapter16/Figures/figure-contrast-of-traditional-machine-learning&transfer-learning.tex
@@ -4,26 +4,26 @@
 \begin{tikzpicture}
 	\tikzstyle{node}=[rounded corners=2pt,draw,minimum width=5em,minimum height=2em,drop shadow,font=\footnotesize]

-\node[node,fill=blue!20] (nmt1) at (0,0){NMT系统1};
-\node[node,anchor=west,fill=yellow!20] (nmt2) at ([xshift=1em]nmt1.east){NMT系统2};
-\node[node,anchor=west,fill=red!20] (nmt3) at ([xshift=1em]nmt2.east){NMT系统3};
+\node[node,fill=blue!20,line width=0.6pt] (nmt1) at (0,0){NMT系统1};
+\node[node,anchor=west,fill=yellow!20,line width=0.6pt] (nmt2) at ([xshift=1em]nmt1.east){NMT系统2};
+\node[node,anchor=west,fill=red!20,line width=0.6pt] (nmt3) at ([xshift=1em]nmt2.east){NMT系统3};

-\node[node,anchor=south,fill=blue!20] (n1) at ([yshift=2.4em]nmt1.north){我不悦};
-\node[node,anchor=west,fill=yellow!20] (n2) at ([xshift=1em]n1.east){我不开心};
-\node[node,anchor=west,fill=red!20] (n3) at ([xshift=1em]n2.east){吾怀忳忳};
+\node[node,anchor=south,fill=blue!20,line width=0.6pt] (n1) at ([yshift=2.4em]nmt1.north){我不悦};
+\node[node,anchor=west,fill=yellow!20,line width=0.6pt] (n2) at ([xshift=1em]n1.east){我不开心};
+\node[node,anchor=west,fill=red!20,line width=0.6pt] (n3) at ([xshift=1em]n2.east){吾怀忳忳};

-\node[node,anchor=south,fill=green!20,minimum height=1.6em] (task1) at ([yshift=2.6em]n2.north){不同任务};
+\node[node,anchor=south,fill=green!20,minimum height=1.6em,line width=0.6pt] (task1) at ([yshift=2.6em]n2.north){不同任务};

-\node[node,anchor=west,fill=green!20,minimum height=1.6em] (task2) at ([xshift=8em]task1.east){源任务};
-\node[node,anchor=north,minimum height=3.2em,fill=orange!20] (n4) at ([yshift=-2em]task2.south){};
-\node[draw,anchor=north,cylinder,shape border rotate=90,minimum width=3em,aspect=0.4,fill=orange!20] (kd) at ([yshift=-1.7em]n4.south){\footnotesize 知识};
+\node[node,anchor=west,fill=green!20,minimum height=1.6em,line width=0.6pt] (task2) at ([xshift=8em]task1.east){源任务};
+\node[node,anchor=north,minimum height=3.2em,fill=orange!20,line width=0.6pt] (n4) at ([yshift=-2em]task2.south){};
+\node[draw,anchor=north,cylinder,shape border rotate=90,minimum width=3em,aspect=0.4,fill=orange!20,line width=0.6pt] (kd) at ([yshift=-1.7em]n4.south){\footnotesize 知识};

-\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=blue!20] at ([yshift=-2.35em]task2.south){我不悦};
-\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=yellow!20] at ([yshift=-3.75em]task2.south){我不开心};
+\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=blue!20,line width=0.6pt] at ([yshift=-2.35em]task2.south){我不悦};
+\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=yellow!20,line width=0.6pt] at ([yshift=-3.75em]task2.south){我不开心};

-\node[node,anchor=west,fill=green!20,minimum height=1.6em] (task3) at ([xshift=3em]task2.east){目标任务};
-\node[node,anchor=north,fill=red!20] (n5) at ([yshift=-2.5em]task3.south){吾怀忳忳};
-\node[node,anchor=north,fill=red!20] (sys) at ([yshift=-2.5em]n5.south){学习系统};
+\node[node,anchor=west,fill=green!20,minimum height=1.6em,line width=0.6pt] (task3) at ([xshift=3em]task2.east){目标任务};
+\node[node,anchor=north,fill=red!20,line width=0.6pt] (n5) at ([yshift=-2.5em]task3.south){吾怀忳忳};
+\node[node,anchor=north,fill=red!20,line width=0.6pt] (sys) at ([yshift=-2.5em]n5.south){学习系统};

 \draw[->,thick] ([yshift=-0.2em,xshift=-0.7em]task1.-145) -- node[left,font=\scriptsize,yshift=0.2em]{书面语}([yshift=0.2em]n1.90);
 \draw[->,thick] ([yshift=-0.2em]task1.-90) -- node[right,font=\scriptsize,yshift=0.2em,xshift=-0.2em]{口语}([yshift=0.2em]n2.90);

--- a/Chapter16/Figures/figure-knowledge-distillation-based-translation-process.tex
+++ b/Chapter16/Figures/figure-knowledge-distillation-based-translation-process.tex
@@ -3,11 +3,11 @@
 %-------------------------------------------------------------------------
 \begin{tikzpicture}

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (x) at (0,0) {$\seq{x}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (x) at (0,0) {$\seq{x}$};

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!15] (p) at (0,-2.4) {$\seq{p}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!20,line width=0.6pt] (p) at (0,-2.4) {$\seq{p}$};

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (y) at (2.4,-1.2) {$\seq{y}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (y) at (2.4,-1.2) {$\seq{y}$};

 \draw[-,dashed,thick,black!50] (x.-90) -- (p.90);
 \draw[-,dashed,thick,black!50] (p.0) -- (y.-135);

--- a/Chapter16/Figures/figure-multi-language-single-model-system-diagram.tex
+++ b/Chapter16/Figures/figure-multi-language-single-model-system-diagram.tex
@@ -3,12 +3,12 @@
 %-------------------------------------------------------------------------
 \begin{tikzpicture}
 \tikzstyle{lan}=[font=\footnotesize,inner ysep=2pt,minimum height=1em]
-\node[minimum height=3em,minimum width=8em,fill=orange!20,draw,rounded corners=2pt,align=center] (sys) at (0,0){多语言 \\ 单模型系统};
-\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt] (en) at (-3em,4em){英语};
-\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt] (fr) at (3em,4em){法语};
+\node[minimum height=3em,minimum width=8em,fill=orange!20,draw,rounded corners=2pt,align=center,line width=0.6pt] (sys) at (0,0){多语言 \\ 单模型系统};
+\node[draw,font=\footnotesize,minimum width=4em,fill=red!20,rounded corners=1pt,line width=0.6pt] (en) at (-3em,4em){英语};
+\node[draw,font=\footnotesize,minimum width=4em,fill=red!20,rounded corners=1pt,line width=0.6pt] (fr) at (3em,4em){法语};
 \node[minimum width=4em]  at (6.6em,4em){$\dots$};
-\node[draw,font=\footnotesize,minimum width=4em,fill=yellow!20,rounded corners=1pt] (de) at (-3em,-4em){德语};
-\node[draw,font=\footnotesize,minimum width=4em,fill=yellow!20,rounded corners=1pt] (sp) at (3em,-4em){西班牙语};
+\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt,line width=0.6pt] (de) at (-3em,-4em){德语};
+\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt,line width=0.6pt] (sp) at (3em,-4em){西班牙语};
 \node[minimum width=4em]  at (6.6em,-4em){$\dots$};

 \draw[->,thick] (en.-90) -- ([xshift=-1em]sys.90);
@@ -18,27 +18,21 @@

 \node[font=\footnotesize] (train) at (11em,7em) {\small\bfnew{训练阶段：}};
 \node[anchor=north,font=\footnotesize] (pair1) at ([yshift=-1em,xshift=1em]train.south) {双语句对1：};
-\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box1) at ([yshift=.7em,xshift=0.4em]pair1.east) {};
-\node[anchor=west,lan] at ([yshift=.7em,xshift=0.4em]pair1.east) {英语：{\color{red}<spanish>} \ hello};
-\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box2) at ([yshift=-.7em,xshift=0.4em]pair1.east) {};
-\node[anchor=west,lan] at ([yshift=-.7em,xshift=0.4em]pair1.east) {西班牙语：hola};
+\node[anchor=west,lan](train1) at ([yshift=.7em,xshift=0.4em]pair1.east) {英语：{\color{red}<spanish>} \ hello};
+\node[anchor=west,lan](train2) at ([yshift=-.7em,xshift=0.4em]pair1.east) {西班牙语：hola};
 \node[anchor=north,font=\footnotesize] (pair2) at ([yshift=-4.5em,xshift=1em]train.south) {双语句对2：};
-\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box3) at ([yshift=.7em,xshift=0.4em]pair2.east) {};
-\node[anchor=west,lan] at ([yshift=.7em,xshift=0.4em]pair2.east) {法语：{\color{red}<german>} \ Bonjour};
-\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box4) at ([yshift=-.7em,xshift=0.4em]pair2.east) {};
-\node[anchor=west,lan] at ([yshift=-.7em,xshift=0.4em]pair2.east) {德语：Hallo};
+\node[anchor=west,lan](train3) at ([yshift=.7em,xshift=0.4em]pair2.east) {法语：{\color{red}<german>} \ Bonjour};
+\node[anchor=west,lan](train4) at ([yshift=-.7em,xshift=0.4em]pair2.east) {德语：Hallo};
 \node[anchor=north,font=\footnotesize] (decode) at ([yshift=-8em]train.south) {\small\bfnew{解码阶段：}};
-\node[anchor=north,font=\footnotesize] (input) at ([yshift=-0.6em]decode.south) {输入：};
-\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box5) at ([xshift=0.4em]input.east) {};
-\node[anchor=west,lan] at ([xshift=0.4em]input.east) {英语：{\color{red}<german>} \ hello};
-\node[anchor=north,font=\footnotesize] (output) at ([yshift=-2.6em]decode.south) {输出：};
-\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box6) at ([xshift=0.4em]output.east) {};
-\node[anchor=west,lan] at ([xshift=0.4em]output.east) {德语：Hallo};
-\node[anchor=north,lan,minimum width=9.8em] (box7) at ([yshift=-2em]box4.south) {};
+\node[anchor=north,font=\footnotesize] (input) at ([xshift=2.13em,yshift=-0.6em]decode.south) {输入：};
+\node[anchor=west,lan](decode2) at ([xshift=0.4em]input.east) {英语：{\color{red}<german>} \ hello};
+\node[anchor=north,font=\footnotesize] (output) at ([xshift=2.13em,yshift=-2.6em]decode.south) {输出：};
+\node[anchor=west,lan](decode3) at ([xshift=0.4em]output.east) {德语：Hallo};
+\node[anchor=north,lan,minimum width=9.8em] (box7) at ([yshift=-4em]train3.south) {};

 \begin{pgfonlayer}{background}
-\node[fill=red!15,draw=red!30,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(train)(box4)]{};
-\node[fill=green!20,,draw=green!40,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(decode)(box7)(box6)]{};
+\node[fill=red!20,draw=black,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(train)(train4)(train1)(train2)(train3)]{};
+\node[fill=blue!20,draw=black,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(decode)(output)(decode2)(decode3)(box7)]{};
 \end{pgfonlayer}
 \end{tikzpicture}
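Review note on this figure: the training and decoding examples in the diagram mark the desired output language by placing a tag such as `<spanish>` or `<german>` in front of the source sentence. A minimal sketch of that preprocessing step follows, assuming whitespace-tokenised input. The function name `add_target_tag` and the tuple layout of `train_pairs` are hypothetical; the example sentences and tags are the ones shown in the figure.

```python
def add_target_tag(source_tokens, target_lang):
    """Prepend a target-language tag, e.g. <german>, to the source tokens."""
    return ["<" + target_lang + ">"] + list(source_tokens)

# (source language, target language, source tokens, target tokens)
train_pairs = [
    ("english", "spanish", ["hello"], ["hola"]),
    ("french", "german", ["Bonjour"], ["Hallo"]),
]

# Training data as one shared model would see it: tagged source, plain target.
tagged_train = [
    (add_target_tag(src, tgt_lang), tgt)
    for _src_lang, tgt_lang, src, tgt in train_pairs
]

# At decoding time the same tag selects the output language, including a
# direction (English to German) that has no sentence pair in train_pairs.
decode_input = add_target_tag(["hello"], "german")
```

Keeping the tag on the source side means the model architecture does not need to change; only the data preparation does.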


--- a/Chapter16/Figures/figure-multitask-learning-in-machine-translation-1.tex
+++ b/Chapter16/Figures/figure-multitask-learning-in-machine-translation-1.tex
+
+%%% outline
+%-------------------------------------------------------------------------
+\begin{tikzpicture}
+\tikzstyle{rec} = [line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em]
+
+
+
+\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
+\node[anchor=north,rec,fill=blue!20](node1-2) at ([yshift=-2.0em]node1-1.south) {\small{解码器}};
+\node[anchor=north,rec,fill=red!20](node1-3) at ([yshift=-2em]node1-2.south) {\small{编码器}};
+\node[anchor=east](node1-5) at ([xshift=-2em]node1-2.west) {\small{$y$}};
+\node[anchor=north](node1-4) at ([yshift=-2em]node1-3.south) {\small{$x$}};
+\draw [->,thick](node1-4.north)--(node1-3.south);
+\draw [->,thick](node1-5.east)--(node1-2.west);
+\draw [->,thick](node1-3.north)--(node1-2.south);
+\draw [->,thick](node1-2.north)--(node1-1.south);
+
+\node [anchor=center] (node2-1) at ([xshift=12.0em]node1-1.east) {\small{$y'$}};
+\node[anchor=north,rec,fill=blue!20](node2-2) at ([yshift=-2.0em]node2-1.south) {\small{解码器}};
+\node[anchor=north,rec,fill=red!20](node2-3) at ([yshift=-2em]node2-2.south) {\small{编码器}};
+\node[anchor=east](node2-5) at ([xshift=-2em]node2-2.west) {\small{$y$}};
+\node[anchor=north](node2-4) at ([yshift=-2em]node2-3.south) {\small{$x$}};
+\node[anchor=west,rec,fill=yellow!20](node2-6) at ([xshift=3.0em]node2-3.east) {\small{解码器}};
+\node[anchor=south](node2-7) at ([yshift=2em]node2-6.north) {\small{$x'$}};
+
+\draw [->,thick](node2-4.north)--(node2-3.south);
+\draw [->,thick](node2-5.east)--(node2-2.west);
+\draw [->,thick](node2-3.north)--(node2-2.south)node[pos=0.5,left,font=\scriptsize]{翻译};
+\draw [->,thick](node2-2.north)--(node2-1.south);
+\draw [->,thick](node2-3.east)--(node2-6.west)node[pos=0.5,above,font=\scriptsize]{重排序};
+\draw [->,thick](node2-6.north)--(node2-7.south);
+
+
+
+\node [anchor=north](pos1) at ([yshift=0em]node1-4.south) {\small{(a)单任务学习}};
+\node [anchor=west](pos2) at ([xshift=10.0em]pos1.east) {\small{(b)多任务学习}};
+
+\end{tikzpicture}
\ No newline at end of file
--- a/Chapter16/Figures/figure-target-side-multi-task-learning.tex
+++ b/Chapter16/Figures/figure-target-side-multi-task-learning.tex
 \begin{tikzpicture}
 \begin{scope}
 \node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
-\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{softmax}};
+\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{Softmax}};

-\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{Decoder}};
-\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{LM}};
+\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{解码器}};
+\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{语言模型}};


-\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{softmax}};
+\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{Softmax}};
 \node [anchor=north] (node3-1) at ([yshift=3.0em]node3-2.north) {\small{$z'$}};


 \node[anchor=north](node3-41) at ([xshift=-0.6em,yshift=-2em]node3-3.south) {\small{$y$}};
 \node[anchor=north](node3-42) at ([xshift=0.6em,yshift=-2em]node3-3.south) {\small{$z$}};

-\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{Encoder}};
+\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{编码器}};
 \node[anchor=north](node2-2) at ([yshift=-2em]node2-1.south) {\small{$x$}};


@@ -34,9 +34,9 @@


 \node [anchor=east] (node2-1-1) at ([xshift=-12.0em,yshift=-4.25em]node1-1.west) {\small{$y'$}};
-\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{softmax}};
-\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{Decoder}};
-\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{Encoder}};
+\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{Softmax}};
+\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{解码器}};
+\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{编码器}};
 \node[anchor=north](node2-2-2) at ([yshift=-2em]node2-2-1.south) {\small{$x$}};
 \node[anchor=north](node2-2-3) at ([yshift=-2em]node2-1-3.south) {\small{$y$}};


--- a/Chapter16/Figures/figure-optimization-of-the-model-initialization-method.tex
+++ b/Chapter16/Figures/figure-optimization-of-the-model-initialization-method.tex
@@ -3,18 +3,18 @@
 \begin{tikzpicture}
 \begin{scope}
 % ,minimum height =1em,minimum width=2em
-\tikzstyle{circle} = [draw,black,very thick,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
+\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
 \tikzstyle{word} = [inner sep=3.5pt]

-\node[circle](data) at (0,0) {数据};
-\node[circle](model) at ([xshift=5em]data.east) {模型};
+\node[circle,fill=red!20](data) at (0,0) {数据};
+\node[circle,fill=blue!20](model) at ([xshift=5em]data.east) {模型};
 \node[word] (init) at ([xshift=-5em]data.west){初始化};

-\draw[->,very thick] (init.east) -- ([xshift=-0.2em]data.west);
-\draw [->,very thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
-\draw [->,very thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};
+\draw[->,thick] (init.east) -- ([xshift=-0.2em]data.west);
+\draw [->,thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
+\draw [->,thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};

-\node[word] at ([yshift=-5em]data.south){（a）思路1};
+\node[word] at ([xshift=-0.5em,yshift=-5em]data.south){（a）思路1};

 \end{scope}
 \end{tikzpicture}
@@ -22,18 +22,18 @@
 \begin{tikzpicture}
 \begin{scope}
 % ,minimum height =1em,minimum width=2em
-\tikzstyle{circle} = [draw,black,very thick,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
+\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
 \tikzstyle{word} = [inner sep=3.5pt]

-\node[circle](data) at (0,0) {数据};
-\node[circle](model) at ([xshift=5em]data.east) {模型};
+\node[circle,fill=red!20](data) at (0,0) {数据};
+\node[circle,fill=blue!20](model) at ([xshift=5em]data.east) {模型};
 \node[word] (init) at ([xshift=5em]model.east){初始化};

-\draw[->,very thick] (init.west) -- ([xshift=0.2em]model.east);
-\draw [->,very thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
-\draw [->,very thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};
+\draw[->,thick] (init.west) -- ([xshift=0.2em]model.east);
+\draw [->,thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
+\draw [->,thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};

-\node[word] at ([yshift=-5em]model.south){（b）思路2};
+\node[word] at ([xshift=-0.5em,yshift=-5em]model.south){（b）思路2};

 \end{scope}
 \end{tikzpicture}

--- a/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
+++ b/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
@@ -4,13 +4,13 @@
 \begin{tikzpicture}
 	\tikzstyle{node}=[rounded corners=4pt,draw,minimum height=3em,drop shadow,font=\footnotesize]

-\node[node,minimum width=6em,minimum height=2.4em,fill=blue!20] (encoder1) at (0,0){\small 编码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20] (encoder2) at ([xshift=4em,yshift=0em]encoder1.east){\small 编码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20] (encoder3) at ([xshift=3em]encoder2.east){\small 编码器};
+\node[node,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder1) at (0,0){\small 编码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder2) at ([xshift=4em,yshift=0em]encoder1.east){\small 编码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!40,line width=0.6pt] (encoder3) at ([xshift=3em]encoder2.east){\small 编码器};

-\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20] (decoder1) at ([yshift=-3em]encoder1.south){\small 解码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};
+\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder1) at ([yshift=-3em]encoder1.south){\small 解码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!40,line width=0.6pt] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};

 \node[anchor=north,font=\scriptsize,fill=yellow!20] (w1) at ([yshift=-1.6em]decoder1.south){知识 \ 就是 \ 力量 \ 。 \ <EOS>};
 \node[anchor=north,font=\scriptsize,fill=green!20] (w3) at ([yshift=-1.6em]decoder3.south){Wissen  \ ist \ Macht \ . \ <EOS>};
@@ -24,7 +24,7 @@
 \draw[->,thick] (w4.-90) -- (encoder3.90);

 \node [anchor=north,single arrow,minimum height=2.2em,fill=blue!20,rotate=-90] (arrow1) at ([yshift=-1.4em,xshift=0.4em]encoder1.south) {};
-\node [anchor=north,single arrow,minimum height=2.2em,fill=blue!20,rotate=-90] (arrow2) at ([yshift=-1.4em,xshift=0.4em]encoder2.south) {};
+\node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow2) at ([yshift=-1.4em,xshift=0.4em]encoder2.south) {};
 \node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow3) at ([yshift=-1.4em,xshift=0.4em]encoder3.south) {};

 \node[anchor=south,yshift=3.4em] at (encoder1.north){\small\bfnew{父模型}};

--- a/Chapter16/Figures/figure-pivot-based-translation-process.tex
+++ b/Chapter16/Figures/figure-pivot-based-translation-process.tex
@@ -3,11 +3,11 @@
 %-------------------------------------------------------------------------
 \begin{tikzpicture}

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (x) at (0,0) {$\seq{x}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (x) at (0,0) {$\seq{x}$};

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!15] (p) at (2,0) {$\seq{p}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!20,line width=0.6pt] (p) at (2,0) {$\seq{p}$};

-\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (y) at (4,0) {$\seq{y}$};
+\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (y) at (4,0) {$\seq{y}$};

 \draw[-,dashed,thick,black!50] (x.0) -- (p.180);
 \draw[-,dashed,thick,black!50] (p.0) -- (y.180);

--- a/Chapter16/Figures/figure-schematic-of-the-domain-discriminator.jpg
+++ b/Chapter16/Figures/figure-schematic-of-the-domain-discriminator.jpg
--- a/Chapter16/Figures/figure-schematic-of-the-domain-discriminator.tex
+++ b/Chapter16/Figures/figure-schematic-of-the-domain-discriminator.tex
+\begin{tikzpicture}
+\tikzstyle{rec} = [line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20]
+\node [anchor=center](node1) at (0,0) {源语言};
+
+\node [anchor=west,rec,fill=red!20](node2) at ([xshift=2.0em]node1.east){编码器};
+\node [anchor=west,rec](node3) at ([xshift=3.0em,yshift=2.0em]node2.east){解码器};
+\node [anchor=west,rec,fill=yellow!20](node4) at ([xshift=3.0em,yshift=-2.0em]node2.east){鉴别器};
+
+\draw [->,thick](node1.east)--(node2.west);
+\draw [->,thick](node2.east)--([xshift=1.5em]node2.east)--([xshift=1.5em,yshift=2.0em]node2.east)--(node3.west);
+\draw [->,thick](node2.east)--([xshift=1.5em]node2.east)--([xshift=1.5em,yshift=-2.0em]node2.east)--(node4.west);
+\node [anchor=west,minimum width=5.0em](node5) at ([xshift=2.0em]node3.east) {目标语言};
+\node [anchor=west,minimum width=5.0em](node6) at ([xshift=2.0em]node4.east) {< 领域 >};
+\draw [->,thick](node3.east)--(node5.west);
+\draw [->,thick](node4.east)--(node6.west);
+\end{tikzpicture}
\ No newline at end of file
--- a/Chapter16/Figures/figure-shared-space-inductive-bilingual-dictionary.png
+++ b/Chapter16/Figures/figure-shared-space-inductive-bilingual-dictionary.png
--- a/Chapter16/Figures/figure-shared-space-inductive-bilingual-dictionary.tex
+++ b/Chapter16/Figures/figure-shared-space-inductive-bilingual-dictionary.tex
@@ -47,38 +47,38 @@
 \node [anchor=south](pos2-2) at ([yshift=-0.5em]pos2.north){\scriptsize{词典}};

 %circle1
-\node[rec,anchor=center,rotate=60,fill=green!30](c1x1) at ([xshift=-7em,yshift=-1.4em]circle1.east){\tiny{1}};
-\node[rec,anchor=center,rotate=60,fill=green!30](c1x2) at ([xshift=-4.5em,yshift=1.8em]circle1.east){\tiny{2}};
-\node[rec,anchor=center,rotate=60,fill=green!30](c1x3) at ([xshift=-4em,yshift=-0.5em]circle1.east){\tiny{3}};
-\node[rec,anchor=center,rotate=60,fill=green!30](c1x4) at ([xshift=-3.5em,yshift=-2.5em]circle1.east){\tiny{4}};
-\node[rec,anchor=center,rotate=60,fill=green!30](c1x5) at ([xshift=-2em,yshift=1.0em]circle1.east){\tiny{5}};
+\node[rec,anchor=center,rotate=60,fill=green!40](c1x1) at ([xshift=-7em,yshift=-1.4em]circle1.east){\tiny{1}};
+\node[rec,anchor=center,rotate=60,fill=green!40](c1x2) at ([xshift=-4.5em,yshift=1.8em]circle1.east){\tiny{2}};
+\node[rec,anchor=center,rotate=60,fill=green!40](c1x3) at ([xshift=-4em,yshift=-0.5em]circle1.east){\tiny{3}};
+\node[rec,anchor=center,rotate=60,fill=green!40](c1x4) at ([xshift=-3.5em,yshift=-2.5em]circle1.east){\tiny{4}};
+\node[rec,anchor=center,rotate=60,fill=green!40](c1x5) at ([xshift=-2em,yshift=1.0em]circle1.east){\tiny{5}};

 %circle2
-\node[cir,anchor=center,rotate=-30,fill=red!30] (c2a) at ([xshift=-5.3em,yshift=2.15em]circle2.east){\tiny{a}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c2b) at ([xshift=2.0em,yshift=-1.25em]c2a.east){\tiny{b}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c2c) at ([xshift=0.8em,yshift=-3.9em]c2a.south){\tiny{c}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c2x) at ([xshift=-0.3em,yshift=-1.9em]c2a.south){\tiny{x}};
-\node[cir,anchor=west,rotate=-30,fill=red!30] (c2y) at ([xshift=1.15em,yshift=-2.85em]c2a.east){\tiny{y}};
+\node[cir,anchor=center,rotate=-30,fill=red!40] (c2a) at ([xshift=-5.3em,yshift=2.15em]circle2.east){\tiny{a}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c2b) at ([xshift=2.0em,yshift=-1.25em]c2a.east){\tiny{b}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c2c) at ([xshift=0.8em,yshift=-3.9em]c2a.south){\tiny{c}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c2x) at ([xshift=-0.3em,yshift=-1.9em]c2a.south){\tiny{x}};
+\node[cir,anchor=west,rotate=-30,fill=red!40] (c2y) at ([xshift=1.15em,yshift=-2.85em]c2a.east){\tiny{y}};

 %circle3
-\node[rec,anchor=center,rotate=-30,fill=green!30] (c3x1) at ([xshift=-6.7em,yshift=1.75em]circle3.east){\tiny{1}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x2) at ([xshift=4.7em,yshift=-0.95em]c3x1.east){\tiny{2}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x3) at ([xshift=2.6em,yshift=-2.4em]c3x1.south){\tiny{3}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x4) at ([xshift=0.35em,yshift=-2.7em]c3x1.south){\tiny{4}};
-\node[rec,anchor=west,rotate=-30,fill=green!30] (c3x5) at ([xshift=2.35em,yshift=-3.85em]c3x1.east){\tiny{5}};
+\node[rec,anchor=center,rotate=-30,fill=green!40] (c3x1) at ([xshift=-6.7em,yshift=1.75em]circle3.east){\tiny{1}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x2) at ([xshift=4.7em,yshift=-0.95em]c3x1.east){\tiny{2}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x3) at ([xshift=2.6em,yshift=-2.4em]c3x1.south){\tiny{3}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x4) at ([xshift=0.35em,yshift=-2.7em]c3x1.south){\tiny{4}};
+\node[rec,anchor=west,rotate=-30,fill=green!40] (c3x5) at ([xshift=2.35em,yshift=-3.85em]c3x1.east){\tiny{5}};

 %circle4
-\node[rec,anchor=center,rotate=-30,fill=green!30] (c4x1) at ([xshift=-6.7em,yshift=1.75em]circle4.east){\tiny{1}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x2) at ([xshift=4.7em,yshift=-0.95em]c4x1.east){\tiny{2}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x3) at ([xshift=2.6em,yshift=-2.4em]c4x1.south){\tiny{3}};
-\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x4) at ([xshift=0.35em,yshift=-2.7em]c4x1.south){\tiny{4}};
-\node[rec,anchor=west,rotate=-30,fill=green!30] (c4x5) at ([xshift=2.35em,yshift=-3.85em]c4x1.east){\tiny{5}};
+\node[rec,anchor=center,rotate=-30,fill=green!40] (c4x1) at ([xshift=-6.7em,yshift=1.75em]circle4.east){\tiny{1}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x2) at ([xshift=4.7em,yshift=-0.95em]c4x1.east){\tiny{2}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x3) at ([xshift=2.6em,yshift=-2.4em]c4x1.south){\tiny{3}};
+\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x4) at ([xshift=0.35em,yshift=-2.7em]c4x1.south){\tiny{4}};
+\node[rec,anchor=west,rotate=-30,fill=green!40] (c4x5) at ([xshift=2.35em,yshift=-3.85em]c4x1.east){\tiny{5}};

-\node[cir,anchor=center,rotate=-30,fill=red!30] (c4a) at ([xshift=-5.3em,yshift=2.15em]circle4.east){\tiny{a}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c4b) at ([xshift=2.0em,yshift=-1.25em]c4a.east){\tiny{b}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c4c) at ([xshift=0.8em,yshift=-3.9em]c4a.south){\tiny{c}};
-\node[cir,anchor=east,rotate=-30,fill=red!30] (c4x) at ([xshift=-0.3em,yshift=-1.9em]c4a.south){\tiny{x}};
-\node[cir,anchor=west,rotate=-30,fill=red!30] (c4y) at ([xshift=1.15em,yshift=-2.85em]c4a.east){\tiny{y}};
+\node[cir,anchor=center,rotate=-30,fill=red!40] (c4a) at ([xshift=-5.3em,yshift=2.15em]circle4.east){\tiny{a}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c4b) at ([xshift=2.0em,yshift=-1.25em]c4a.east){\tiny{b}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c4c) at ([xshift=0.8em,yshift=-3.9em]c4a.south){\tiny{c}};
+\node[cir,anchor=east,rotate=-30,fill=red!40] (c4x) at ([xshift=-0.3em,yshift=-1.9em]c4a.south){\tiny{x}};
+\node[cir,anchor=west,rotate=-30,fill=red!40] (c4y) at ([xshift=1.15em,yshift=-2.85em]c4a.east){\tiny{y}};

 \draw [color=red,line width=0.7pt,rotate=18] ([xshift=-5.1em,yshift=3.7em]circle4.east) ellipse (1.6em and 0.9em); 
 \draw [color=red,line width=0.7pt,rotate=-5] ([xshift=-2.8em,yshift=0.6em]circle4.east) ellipse (1.6em and 0.9em);

--- a/Chapter16/Figures/figure-the-meaning-of-pitch-in-different-fields.jpg
+++ b/Chapter16/Figures/figure-the-meaning-of-pitch-in-different-fields.jpg
--- a/Chapter16/Figures/figure-unmt-idea1.jpg
+++ b/Chapter16/Figures/figure-unmt-idea1.jpg
--- a/Chapter16/Figures/figure-unmt-idea2.jpg
+++ b/Chapter16/Figures/figure-unmt-idea2.jpg
--- a/Chapter16/Figures/figure-unmt-idea3.jpg
+++ b/Chapter16/Figures/figure-unmt-idea3.jpg
--- a/Chapter16/Figures/figure-unmt-process.jpg
+++ b/Chapter16/Figures/figure-unmt-process.jpg
--- a/Chapter16/Figures/figure-unmt-process.tex
+++ b/Chapter16/Figures/figure-unmt-process.tex

 \begin{tikzpicture}
 \begin{scope}
-% ,minimum height =1em,minimum width=2em
-\tikzstyle{circle} = [draw,black,very thick,inner sep=3.5pt,rounded corners=4pt,minimum width=2em,align=center]
+\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em,align=center,fill=blue!20]
 \tikzstyle{word} = [inner sep=3.5pt]

 \node[circle](center) at (0,0) {
 \begin{tabular}{c | c}
-$s\rightarrow t$ & $t\rightarrow s$ \\
+$x\rightarrow y$ & $y\rightarrow x$ \\
 模型 & 模型
 \end{tabular}
 };
-\node[circle] (left) at ([xshift=-9em]center.west) {$s\rightarrow t$ \\ 数据};
-\node[circle] (right) at ([xshift=9em]center.east) {$t\rightarrow s$ \\ 数据};
+\node[circle,fill=red!20] (left) at ([xshift=-9em]center.west) {$x\rightarrow y$ \\ 数据};
+\node[circle,fill=red!20] (right) at ([xshift=9em]center.east) {$y\rightarrow x$ \\ 数据};

 \node[word] (init) at ([yshift=6em]center.north){初始化};

-\node[circle] (down) at ([yshift=-8em]center.south) {$s,t$ \\ 数据};
+\node[circle,fill=red!20] (down) at ([yshift=-8em]center.south) {$x,y$ \\ 数据};

-\draw[->,very thick] (init.south) -- ([yshift=0.2em]center.north);
-\draw[->,very thick] ([yshift=0.2em]down.north) -- ([yshift=-0.2em]center.south) node[pos=.44,midway,align=center] {语言模型\\目标函数\\（模型优化）};
+\draw[->,thick] (init.south) -- ([yshift=0.2em]center.north);
+\draw[->,thick] ([yshift=0.2em]down.north) -- ([yshift=-0.2em]center.south) node[midway,align=left,xshift=-2.5em,yshift=0.5em] {语言模型\\目标函数};
+\node [anchor=center] at ([yshift=2.0em,xshift=-2.5em]down.north){（模型优化）};
+\draw[->,thick] ([yshift=1pt]left.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt,xshift=-2.2em]center.north) node[above,midway,align=center] {翻译模型目标函数\\（模型优化）};
+\draw[->,thick] ([yshift=1pt,xshift=-1.8em]center.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]right.north) node[above,pos=0.6,align=center] {回译\\（数据优化）};

-\draw[->,very thick] ([yshift=1pt]left.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt,xshift=-2.2em]center.north) node[above,midway,align=center] {正常MT目标函数\\（模型优化）};
-\draw[->,very thick] ([yshift=1pt,xshift=-1.8em]center.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]right.north) node[above,pos=0.6,align=center] {回译\\（数据优化）};
-
-\draw [->,very thick] ([yshift=1pt]right.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt,xshift=2.2em]center.south) node[below,midway,align=center] {正常MT目标函数\\（模型优化）};
-\draw [->,very thick] ([yshift=1pt,xshift=1.8em]center.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]left.south) node[below,pos=0.6,align=center] {回译\\（数据优化）};
-
-
-%\draw[->,very thick] (init.east) -- ([xshift=-0.2em]data.west);
-%\draw [->,very thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
-%\draw [->,very thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};
-
-%\node[word] at ([yshift=-5em]data.south){（a）思路1};
+\draw [->,thick] ([yshift=1pt]right.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt,xshift=2.2em]center.south) node[below,midway,align=center] {翻译模型目标函数\\（模型优化）};
+\draw [->,thick] ([yshift=1pt,xshift=1.8em]center.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]left.south) node[below,pos=0.6,align=center] {回译\\（数据优化）};

 \end{scope}
 \end{tikzpicture}
--- a/Chapter16/Figures/figure-unsupervised-dual-learning-process.png
+++ b/Chapter16/Figures/figure-unsupervised-dual-learning-process.png
--- a/Chapter16/Figures/figure-unsupervised-dual-learning-process.tex
+++ b/Chapter16/Figures/figure-unsupervised-dual-learning-process.tex
+\begin{tikzpicture}
+
+\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
+\tikzstyle{word} = [inner sep=3.5pt]
+
+\node [anchor=center] (node1-1) at (0,0) {\small{\seq{x}}};
+\node [anchor=west] (node1-2) at ([xshift=0.8em]node1-1.east) {\small{\seq{y}}};
+\node [anchor=north] (node1-3) at ([xshift=1.0em]node1-1.south) {\small{翻译模型f}};
+\draw [->,line width=0.6pt](node1-1.east)--(node1-2.west);
+
+\begin{pgfonlayer}{background}
+{
+\node[fill=blue!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=5em,drop shadow,rounded corners=2pt] [fit =(node1-1)(node1-2)(node1-3)]  (remark1) {};
+}
+\end{pgfonlayer}
+
+\node[anchor=north,circle,fill=red!20,minimum width=6.8em](node2) at ([xshift=-6.0em,yshift=-2.0em]remark1.south) {源语言句子$\seq{x}$};
+\node[anchor=north,circle,fill=red!20,minimum width=6.8em](node2-2) at ([yshift=-0.2em]node2.south) {新生成句子$\seq{x'}$};
+\draw [->,thick]([yshift=0.2em]node2.north).. controls (-1.93,-1.5) and (-2.0,-0.2)..([xshift=-0.2em]remark1.west);
+\node[anchor=north,circle,fill=red!20](node3) at ([xshift=6.5em,yshift=-2.0em]remark1.south) {目标语言句子$\seq{y}$};
+\draw [->,thick]([xshift=0.2em]remark1.east).. controls (2.9,-0.25) and (2.9,-0.7) ..([yshift=0.2em]node3.north);
+
+
+\node [anchor=north] (node4-1) at ([xshift=-1.0em,yshift=-7.0em]remark1.south) {\small{\seq{y}}};
+\node [anchor=west] (node4-2) at ([xshift=0.8em]node4-1.east) {\small{\seq{x}}};
+\node [anchor=north] (node4-3) at ([xshift=1.0em]node4-1.south) {\small{翻译模型g}};
+\draw [->,line width=0.6pt](node4-1.east)--(node4-2.west);
+
+\begin{pgfonlayer}{background}
+{
+\node[fill=yellow!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=5em,drop shadow,rounded corners=2pt] [fit =(node4-1)(node4-2)(node4-3)]  (remark2) {};
+}
+\end{pgfonlayer}
+
+\draw [->,thick]([xshift=-0.2em]remark2.west).. controls (-0.8,-4.12) and (-1.95,-4.12)..([yshift=-0.2em]node2-2.south);
+\draw [->,thick]([yshift=-0.2em]node3.south).. controls (2.9,-3) and (2.9,-4.1)..([xshift=0.2em]remark2.east);
+
+\end{tikzpicture}
\ No newline at end of file
--- a/Chapter16/chapter16.aux
+++ b/Chapter16/chapter16.aux
-\relax 
-\providecommand\zref@newlabel[2]{}
-\providecommand\hyper@newdestlabel[2]{}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {chapter}{\numberline {1}低资源神经机器翻译}{11}{chapter.1}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\addvspace {10\p@ }}
-\@writefile{lot}{\defcounter {refsection}{0}\relax }\@writefile{lot}{\addvspace {10\p@ }}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.1}数据的有效使用}{11}{section.1.1}\protected@file@percent }
-\newlabel{effective-use-of-data}{{1.1}{11}{数据的有效使用}{section.1.1}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.1.1}数据增强}{12}{subsection.1.1.1}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 回译}{12}{section*.3}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.1}{\ignorespaces \color  {red}{回译方法的流程(新)} {\color  {blue} 图比以前清晰了，但是还是有些乱，可能你陷入到固有思维里了，可以找我再讨论下！}\relax }}{12}{figure.caption.4}\protected@file@percent }
-\providecommand*\caption@xref[2]{\@setref\relax\@undefined{#1}}
-\newlabel{fig:16-1-xc}{{1.1}{12}{\red {回译方法的流程(新)} {\color {blue} 图比以前清晰了，但是还是有些乱，可能你陷入到固有思维里了，可以找我再讨论下！}\relax }{figure.caption.4}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.2}{\ignorespaces \color  {red}{迭代式回译方法的流程，未修改} {\color  {blue} 这个图的逻辑我觉得是ok的，主要是这些线和过程需要再清晰一下，再找我讨论下！}\relax }}{13}{figure.caption.5}\protected@file@percent }
-\newlabel{fig:16-2-xc}{{1.2}{13}{\red {迭代式回译方法的流程，未修改} {\color {blue} 这个图的逻辑我觉得是ok的，主要是这些线和过程需要再清晰一下，再找我讨论下！}\relax }{figure.caption.5}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 修改双语数据}{14}{section*.6}\protected@file@percent }
-\newlabel{add-noise}{{1.1.1}{14}{2. 修改双语数据}{section*.6}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.3}{\ignorespaces 三种加噪方法\relax }}{15}{figure.caption.7}\protected@file@percent }
-\newlabel{fig:16-4-xc}{{1.3}{15}{三种加噪方法\relax }{figure.caption.7}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 双语句对挖掘}{16}{section*.8}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.4}{\ignorespaces 维基百科中的可比语料\relax }}{17}{figure.caption.9}\protected@file@percent }
-\newlabel{fig:16-5-xc}{{1.4}{17}{维基百科中的可比语料\relax }{figure.caption.9}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.1.2}基于语言模型的方法}{17}{subsection.1.1.2}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 语言模型在目标端的融合}{18}{section*.10}\protected@file@percent }
-\newlabel{eq:16-1-xc}{{1.1}{18}{1. 语言模型在目标端的融合}{equation.1.1.1}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.5}{\ignorespaces \color  {red}{语言模型的浅融合与深融合，未修改} {\color  {blue} 图可以考虑删除了，要不也增加阅读的负担！}\relax }}{18}{figure.caption.11}\protected@file@percent }
-\newlabel{fig:16-6-xc}{{1.5}{18}{\red {语言模型的浅融合与深融合，未修改} {\color {blue} 图可以考虑删除了，要不也增加阅读的负担！}\relax }{figure.caption.11}{}}
-\newlabel{eq:16-2-xc}{{1.2}{18}{1. 语言模型在目标端的融合}{equation.1.1.2}{}}
-\newlabel{eq:16-3-xc}{{1.3}{19}{1. 语言模型在目标端的融合}{equation.1.1.3}{}}
-\newlabel{eq:16-4-xc}{{1.4}{19}{1. 语言模型在目标端的融合}{equation.1.1.4}{}}
-\newlabel{eq:16-5-xc}{{1.5}{19}{1. 语言模型在目标端的融合}{equation.1.1.5}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 预训练词嵌入}{19}{section*.12}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 预训练模型}{21}{section*.13}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.6}{\ignorespaces \color  {red}{MASS 预训练方法，重画}\relax }}{22}{figure.caption.14}\protected@file@percent }
-\newlabel{fig:16-8-xc}{{1.6}{22}{\red {MASS 预训练方法，重画}\relax }{figure.caption.14}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{4. 多任务学习}{23}{section*.15}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.7}{\ignorespaces \color  {red}{机器翻译中的多任务学习，重画}\relax }}{24}{figure.caption.16}\protected@file@percent }
-\newlabel{fig:16-9-xc}{{1.7}{24}{\red {机器翻译中的多任务学习，重画}\relax }{figure.caption.16}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.2}双向翻译模型}{24}{section.1.2}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.2.1}双向训练}{24}{subsection.1.2.1}\protected@file@percent }
-\newlabel{eq:16-6-xc}{{1.6}{24}{双向训练}{equation.1.2.6}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.2.2}对偶学习}{25}{subsection.1.2.2}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 有监督对偶学习}{25}{section*.18}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.8}{\ignorespaces 双向训练的迭代过程\relax }}{26}{figure.caption.17}\protected@file@percent }
-\newlabel{fig:16-1-fk}{{1.8}{26}{双向训练的迭代过程\relax }{figure.caption.17}{}}
-\newlabel{eq:16-7-xc}{{1.7}{26}{1. 有监督对偶学习}{equation.1.2.7}{}}
-\newlabel{eq:16-8-xc}{{1.8}{26}{1. 有监督对偶学习}{equation.1.2.8}{}}
-\newlabel{eq:16-2-fk}{{1.9}{27}{1. 有监督对偶学习}{equation.1.2.9}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 无监督对偶学习}{27}{section*.19}\protected@file@percent }
-\newlabel{eq:16-9-xc}{{1.10}{27}{2. 无监督对偶学习}{equation.1.2.10}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.9}{\ignorespaces 无监督对偶学习流程\relax }}{28}{figure.caption.20}\protected@file@percent }
-\newlabel{fig:16-10-xc}{{1.9}{28}{无监督对偶学习流程\relax }{figure.caption.20}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.3}多语言翻译模型}{28}{section.1.3}\protected@file@percent }
-\newlabel{multilingual-translation-model}{{1.3}{28}{多语言翻译模型}{section.1.3}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.3.1}基于枢轴语言的方法}{29}{subsection.1.3.1}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.10}{\ignorespaces 基于枢轴语言的翻译过程\relax }}{29}{figure.caption.21}\protected@file@percent }
-\newlabel{fig:16-1-ll}{{1.10}{29}{基于枢轴语言的翻译过程\relax }{figure.caption.21}{}}
-\newlabel{eq:ll-1}{{1.11}{29}{基于枢轴语言的方法}{equation.1.3.11}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.3.2}基于知识蒸馏的方法}{30}{subsection.1.3.2}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.11}{\ignorespaces 基于知识蒸馏的翻译过程\relax }}{30}{figure.caption.22}\protected@file@percent }
-\newlabel{fig:16-2-ll}{{1.11}{30}{基于知识蒸馏的翻译过程\relax }{figure.caption.22}{}}
-\newlabel{eq:ll-2}{{1.12}{30}{基于知识蒸馏的方法}{equation.1.3.12}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.3.3}基于迁移学习的方法}{31}{subsection.1.3.3}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.12}{\ignorespaces 传统机器学习\&迁移学习对比\relax }}{31}{figure.caption.23}\protected@file@percent }
-\newlabel{fig:16-3-ll}{{1.12}{31}{传统机器学习\&迁移学习对比\relax }{figure.caption.23}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 参数初始化方法}{32}{section*.24}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.13}{\ignorespaces 参数初始化方法图\relax }}{32}{figure.caption.25}\protected@file@percent }
-\newlabel{fig:16-4-ll}{{1.13}{32}{参数初始化方法图\relax }{figure.caption.25}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 多语言单模型系统}{32}{section*.26}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.14}{\ignorespaces 参数初始化方法图\relax }}{33}{figure.caption.27}\protected@file@percent }
-\newlabel{fig:16-5-ll}{{1.14}{33}{参数初始化方法图\relax }{figure.caption.27}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 零资源翻译}{33}{section*.28}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.4}无监督机器翻译}{34}{section.1.4}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.4.1}无监督词典归纳}{35}{subsection.1.4.1}\protected@file@percent }
-\newlabel{unsupervised-dictionary-induction}{{1.4.1}{35}{无监督词典归纳}{subsection.1.4.1}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.15}{\ignorespaces 词典归纳原理图\relax }}{35}{figure.caption.29}\protected@file@percent }
-\newlabel{fig:16-1-lyf}{{1.15}{35}{词典归纳原理图\relax }{figure.caption.29}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 方法框架}{35}{section*.30}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.16}{\ignorespaces 无监督词典归纳流程图（{\color  {red} A->a}）\textsuperscript  {\textsuperscript  {\cite {DBLP:conf/iclr/LampleCRDJ18}}}\relax }}{36}{figure.caption.31}\protected@file@percent }
-\newlabel{fig:16-2-lyf}{{1.16}{36}{无监督词典归纳流程图（{\color {red} A->a}）\upcite {DBLP:conf/iclr/LampleCRDJ18}\relax }{figure.caption.31}{}}
-\newlabel{eq:16-1}{{1.14}{37}{1. 方法框架}{equation.1.4.13}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 鲁棒性问题}{37}{section*.32}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.4.2}无监督统计机器翻译}{38}{subsection.1.4.2}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 无监督短语归纳}{38}{section*.33}\protected@file@percent }
-\newlabel{eq:16-2}{{1.15}{38}{1. 无监督短语归纳}{equation.1.4.15}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 无监督权重调优}{39}{section*.34}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.4.3}无监督神经机器翻译}{39}{subsection.1.4.3}\protected@file@percent }
-\newlabel{unsupervised-NMT}{{1.4.3}{39}{无监督神经机器翻译}{subsection.1.4.3}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 基于无监督统计机器翻译的方法}{39}{section*.35}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.17}{\ignorespaces 用无监督统计机器翻译训练神经机器翻译\relax }}{40}{figure.caption.36}\protected@file@percent }
-\newlabel{fig:16-1}{{1.17}{40}{用无监督统计机器翻译训练神经机器翻译\relax }{figure.caption.36}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 基于无监督词典归纳的方法}{40}{section*.37}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 更深层的融合}{40}{section*.39}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.18}{\ignorespaces 基于无监督词典归纳的方法\relax }}{41}{figure.caption.38}\protected@file@percent }
-\newlabel{fig:16-2}{{1.18}{41}{基于无监督词典归纳的方法\relax }{figure.caption.38}{}}
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.19}{\ignorespaces 模型初始化方法的优化\relax }}{41}{figure.caption.40}\protected@file@percent }
-\newlabel{fig:16-3}{{1.19}{41}{模型初始化方法的优化\relax }{figure.caption.40}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{4. 其它问题}{41}{section*.41}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.20}{\ignorespaces 无监督神经机器翻译模型训练流程\relax }}{43}{figure.caption.42}\protected@file@percent }
-\newlabel{fig:16-4}{{1.20}{43}{无监督神经机器翻译模型训练流程\relax }{figure.caption.42}{}}
-\@writefile{lot}{\defcounter {refsection}{0}\relax }\@writefile{lot}{\contentsline {table}{\numberline {1.1}{\ignorespaces 三种噪声函数（原句为``我\ 喜欢\ 吃\ 苹果\ 。''）。\relax }}{44}{table.caption.43}\protected@file@percent }
-\newlabel{tab:16-1}{{1.1}{44}{三种噪声函数（原句为``我\ 喜欢\ 吃\ 苹果\ 。''）。\relax }{table.caption.43}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.5}领域适应}{44}{section.1.5}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.21}{\ignorespaces 单词pitch（图里标红）在不同领域的不同词义实例\relax }}{44}{figure.caption.44}\protected@file@percent }
-\newlabel{fig:16-1-wbh}{{1.21}{44}{单词pitch（图里标红）在不同领域的不同词义实例\relax }{figure.caption.44}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.5.1}统计机器翻译中的领域适应}{45}{subsection.1.5.1}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 基于混合模型的方法}{45}{section*.45}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 基于数据加权的方法}{45}{section*.46}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 基于数据选择的方法}{46}{section*.47}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{4. 基于伪数据的方法}{46}{section*.48}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.5.2}基于数据的神经机器翻译领域适应}{46}{subsection.1.5.2}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 基于多领域数据的方法}{46}{section*.49}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 基于数据选择的方法}{47}{section*.50}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 基于单语数据的方法}{47}{section*.51}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {1.5.3}基于模型的神经机器翻译领域适应}{48}{subsection.1.5.3}\protected@file@percent }
-\newlabel{modeling-methods-in neural-machine-translation}{{1.5.3}{48}{基于模型的神经机器翻译领域适应}{subsection.1.5.3}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{1. 基于模型结构的方法}{48}{section*.52}\protected@file@percent }
-\@writefile{lof}{\defcounter {refsection}{0}\relax }\@writefile{lof}{\contentsline {figure}{\numberline {1.22}{\ignorespaces 领域判别器示意图\relax }}{48}{figure.caption.53}\protected@file@percent }
-\newlabel{fig:16-2-wbh}{{1.22}{48}{领域判别器示意图\relax }{figure.caption.53}{}}
-\newlabel{eq:16-1-wbh}{{1.16}{48}{1. 基于模型结构的方法}{equation.1.5.16}{}}
-\newlabel{eq:16-2-wbh}{{1.17}{48}{1. 基于模型结构的方法}{equation.1.5.17}{}}
-\newlabel{eq:16-3-wbh}{{1.18}{49}{1. 基于模型结构的方法}{equation.1.5.18}{}}
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{2. 基于训练策略的方法}{49}{section*.54}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsubsection}{3. 基于模型推断的方法}{50}{section*.55}\protected@file@percent }
-\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {1.6}小结及扩展阅读}{50}{section.1.6}\protected@file@percent }
-\@setckpt{Chapter16/chapter16}{
-\setcounter{page}{52}
-\setcounter{equation}{18}
-\setcounter{enumi}{0}
-\setcounter{enumii}{0}
-\setcounter{enumiii}{0}
-\setcounter{enumiv}{0}
-\setcounter{footnote}{0}
-\setcounter{mpfootnote}{0}
-\setcounter{part}{0}
-\setcounter{chapter}{1}
-\setcounter{section}{6}
-\setcounter{subsection}{0}
-\setcounter{subsubsection}{0}
-\setcounter{paragraph}{0}
-\setcounter{subparagraph}{0}
-\setcounter{figure}{22}
-\setcounter{table}{1}
-\setcounter{tabx@nest}{0}
-\setcounter{listtotal}{0}
-\setcounter{listcount}{0}
-\setcounter{liststart}{0}
-\setcounter{liststop}{0}
-\setcounter{citecount}{0}
-\setcounter{citetotal}{0}
-\setcounter{multicitecount}{0}
-\setcounter{multicitetotal}{0}
-\setcounter{instcount}{348}
-\setcounter{maxnames}{3}
-\setcounter{minnames}{1}
-\setcounter{maxitems}{3}
-\setcounter{minitems}{1}
-\setcounter{citecounter}{0}
-\setcounter{maxcitecounter}{0}
-\setcounter{savedcitecounter}{0}
-\setcounter{uniquelist}{0}
-\setcounter{uniquename}{0}
-\setcounter{refsection}{0}
-\setcounter{refsegment}{0}
-\setcounter{maxextratitle}{0}
-\setcounter{maxextratitleyear}{0}
-\setcounter{maxextraname}{10}
-\setcounter{maxextradate}{0}
-\setcounter{maxextraalpha}{0}
-\setcounter{abbrvpenalty}{50}
-\setcounter{highnamepenalty}{50}
-\setcounter{lownamepenalty}{25}
-\setcounter{maxparens}{3}
-\setcounter{parenlevel}{0}
-\setcounter{mincomprange}{10}
-\setcounter{maxcomprange}{100000}
-\setcounter{mincompwidth}{1}
-\setcounter{afterword}{0}
-\setcounter{savedafterword}{0}
-\setcounter{annotator}{0}
-\setcounter{savedannotator}{0}
-\setcounter{author}{0}
-\setcounter{savedauthor}{0}
-\setcounter{bookauthor}{0}
-\setcounter{savedbookauthor}{0}
-\setcounter{commentator}{0}
-\setcounter{savedcommentator}{0}
-\setcounter{editor}{0}
-\setcounter{savededitor}{0}
-\setcounter{editora}{0}
-\setcounter{savededitora}{0}
-\setcounter{editorb}{0}
-\setcounter{savededitorb}{0}
-\setcounter{editorc}{0}
-\setcounter{savededitorc}{0}
-\setcounter{foreword}{0}
-\setcounter{savedforeword}{0}
-\setcounter{holder}{0}
-\setcounter{savedholder}{0}
-\setcounter{introduction}{0}
-\setcounter{savedintroduction}{0}
-\setcounter{namea}{0}
-\setcounter{savednamea}{0}
-\setcounter{nameb}{0}
-\setcounter{savednameb}{0}
-\setcounter{namec}{0}
-\setcounter{savednamec}{0}
-\setcounter{translator}{0}
-\setcounter{savedtranslator}{0}
-\setcounter{shortauthor}{0}
-\setcounter{savedshortauthor}{0}
-\setcounter{shorteditor}{0}
-\setcounter{savedshorteditor}{0}
-\setcounter{labelname}{0}
-\setcounter{savedlabelname}{0}
-\setcounter{institution}{0}
-\setcounter{savedinstitution}{0}
-\setcounter{lista}{0}
-\setcounter{savedlista}{0}
-\setcounter{listb}{0}
-\setcounter{savedlistb}{0}
-\setcounter{listc}{0}
-\setcounter{savedlistc}{0}
-\setcounter{listd}{0}
-\setcounter{savedlistd}{0}
-\setcounter{liste}{0}
-\setcounter{savedliste}{0}
-\setcounter{listf}{0}
-\setcounter{savedlistf}{0}
-\setcounter{location}{0}
-\setcounter{savedlocation}{0}
-\setcounter{organization}{0}
-\setcounter{savedorganization}{0}
-\setcounter{origlocation}{0}
-\setcounter{savedoriglocation}{0}
-\setcounter{origpublisher}{0}
-\setcounter{savedorigpublisher}{0}
-\setcounter{publisher}{0}
-\setcounter{savedpublisher}{0}
-\setcounter{language}{0}
-\setcounter{savedlanguage}{0}
-\setcounter{origlanguage}{0}
-\setcounter{savedoriglanguage}{0}
-\setcounter{pageref}{0}
-\setcounter{savedpageref}{0}
-\setcounter{textcitecount}{0}
-\setcounter{textcitetotal}{0}
-\setcounter{textcitemaxnames}{0}
-\setcounter{biburlbigbreakpenalty}{100}
-\setcounter{biburlbreakpenalty}{200}
-\setcounter{biburlnumpenalty}{0}
-\setcounter{biburlucpenalty}{0}
-\setcounter{biburllcpenalty}{0}
-\setcounter{smartand}{1}
-\setcounter{bbx:relatedcount}{0}
-\setcounter{bbx:relatedtotal}{0}
-\setcounter{parentequation}{0}
-\setcounter{notation}{0}
-\setcounter{dummy}{0}
-\setcounter{problem}{0}
-\setcounter{exerciseT}{0}
-\setcounter{exampleT}{0}
-\setcounter{vocabulary}{0}
-\setcounter{definitionT}{0}
-\setcounter{mdf@globalstyle@cnt}{0}
-\setcounter{mdfcountframes}{0}
-\setcounter{mdf@env@i}{0}
-\setcounter{mdf@env@ii}{0}
-\setcounter{mdf@zref@counter}{0}
-\setcounter{Item}{0}
-\setcounter{Hfootnote}{0}
-\setcounter{Hy@AnnotLevel}{0}
-\setcounter{bookmark@seq@number}{0}
-\setcounter{caption@flags}{0}
-\setcounter{continuedfloat}{0}
-\setcounter{cp@cnt}{0}
-\setcounter{cp@tempcnt}{0}
-\setcounter{subfigure}{0}
-\setcounter{lofdepth}{1}
-\setcounter{subtable}{0}
-\setcounter{lotdepth}{1}
-\setcounter{@pps}{0}
-\setcounter{@ppsavesec}{0}
-\setcounter{@ppsaveapp}{0}
-\setcounter{tcbbreakpart}{0}
-\setcounter{tcblayer}{0}
-\setcounter{tcolorbox@number}{0}
-\setcounter{section@level}{1}
-}
--- a/Chapter16/chapter16.tex
+++ b/Chapter16/chapter16.tex
--- a/bibliography.bib
+++ b/bibliography.bib
@@ -8836,7 +8836,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-
 @inproceedings{DBLP:conf/emnlp/EdunovOAG18,
  author    = {Sergey Edunov and
               Myle Ott and
@@ -8959,15 +8958,6 @@ author    = {Zhuang Liu and
  volume    = {abs/1706.05098},
  year      = {2017}
 }
-@inproceedings{DBLP:conf/emnlp/DomhanH17,
-  author    = {Tobias Domhan and
-               Felix Hieber},
-  title     = {Using Target-side Monolingual Data for Neural Machine Translation
-               through Multi-task Learning},
-  pages     = {1500--1505},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/icml/XiaQCBYL17,
  author    = {Yingce Xia and
               Tao Qin and
@@ -9014,13 +9004,6 @@ author    = {Zhuang Liu and
  publisher = {The {MIT} Press},
  year      = {1999}
 }
-@inproceedings{lample2019cross,
-  author    = {Alexis Conneau and
-               Guillaume Lample},
-  title     = {Cross-lingual Language Model Pretraining},
-  pages     = {7057--7067},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/aclnmt/HoangKHC18,
  author    = {Cong Duy Vu Hoang and
               Philipp Koehn and
@@ -9042,15 +9025,6 @@ author    = {Zhuang Liu and
  publisher = {{PMLR}},
  year      = {2018}
 }
-@inproceedings{DBLP:conf/acl/FadaeeBM17a,
-  author    = {Marzieh Fadaee and
-               Arianna Bisazza and
-               Christof Monz},
-  title     = {Data Augmentation for Low-Resource Neural Machine Translation},
-  pages     = {567--573},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2017}
-}
 @inproceedings{finding2006adafre,
  author    = {S. F. Adafre and Maarten de Rijke},
  title     = {Finding Similar Sentences across Multiple Languages in Wikipedia },
@@ -9074,24 +9048,6 @@ author    = {Zhuang Liu and
  pages     = {477--504},
  year      = {2005}
 }
-@inproceedings{DBLP:conf/naacl/SmithQT10,
-  author    = {Jason R. Smith and
-               Chris Quirk and
-               Kristina Toutanova},
-  title     = {Extracting Parallel Sentences from Comparable Corpora using Document
-               Level Alignment},
-  pages     = {403--411},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2010}
-}
-@inproceedings{DBLP:conf/emnlp/ZhangZ16,
-  author    = {Jiajun Zhang and
-               Chengqing Zong},
-  title     = {Exploiting Source-side Monolingual Data in Neural Machine Translation},
-  pages     = {1535--1545},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2016}
-}
 @inproceedings{DBLP:conf/acl/XiaKAN19,
  author    = {Mengzhou Xia and
               Xiang Kong and
@@ -9102,17 +9058,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/emnlp/WangPDN18,
-  author    = {Xinyi Wang and
-               Hieu Pham and
-               Zihang Dai and
-               Graham Neubig},
-  title     = {SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine
-               Translation},
-  pages     = {856--861},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
 @inproceedings{DBLP:conf/acl/GaoZWXQCZL19,
  author    = {Fei Gao and
               Jinhua Zhu and
@@ -9127,17 +9072,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/emnlp/WangLWLS19,
-  author    = {Shuo Wang and
-               Yang Liu and
-               Chao Wang and
-               Huanbo Luan and
-               Maosong Sun},
-  title     = {Improving Back-Translation with Uncertainty-based Confidence Estimation},
-  pages     = {791--802},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/emnlp/WuWXQLL19,
  author    = {Lijun Wu and
               Yiren Wang and
@@ -9176,7 +9110,6 @@ author    = {Zhuang Liu and
  journal = {Computer Science},
  year = {2015},
 }
-
 @phdthesis{黄书剑0统计机器翻译中的词对齐研究,
  title={统计机器翻译中的词对齐研究},
  author={黄书剑},
@@ -9199,16 +9132,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2016}
 }
-@inproceedings{DBLP:conf/iclr/SmithTHH17,
-  author    = {Samuel L. Smith and
-               David H. P. Turban and
-               Steven Hamblin and
-               Nils Y. Hammerla},
-  title     = {Offline bilingual word vectors, orthogonal transformations and the
-               inverted softmax},
-  publisher = {International Conference on Learning Representations},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/acl/ArtetxeLA17,
  author    = {Mikel Artetxe and
               Gorka Labaka and
@@ -9227,7 +9150,6 @@ author    = {Zhuang Liu and
  pages={1-10},
  year={1966},
 }
-
 @inproceedings{DBLP:conf/iclr/LampleCRDJ18,
  author    = {Guillaume Lample and
               Alexis Conneau and
@@ -9248,16 +9170,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:conf/emnlp/XuYOW18,
-  author    = {Ruochen Xu and
-               Yiming Yang and
-               Naoki Otani and
-               Yuexin Wu},
-  title     = {Unsupervised Cross-lingual Transfer of Word Embedding Spaces},
-  pages     = {2465--2474},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
 @inproceedings{DBLP:conf/emnlp/Alvarez-MelisJ18,
  author    = {David Alvarez-Melis and
               Tommi S. Jaakkola},
@@ -9310,15 +9222,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/acl/SogaardVR18,
-  author    = {Anders S{\o}gaard and
-               Sebastian Ruder and
-               Ivan Vulic},
-  title     = {On the Limitations of Unsupervised Bilingual Dictionary Induction},
-  pages     = {778--788},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
 @article{DBLP:journals/talip/MarieF20,
  author    = {Benjamin Marie and
               Atsushi Fujita},
@@ -9351,15 +9254,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/iclr/LampleCDR18,
-  author    = {Guillaume Lample and
-               Alexis Conneau and
-               Ludovic Denoyer and
-               Marc'Aurelio Ranzato},
-  title     = {Unsupervised Machine Translation Using Monolingual Corpora Only},
-  publisher = {International Conference on Learning Representations},
-  year      = {2018}
-}
 @inproceedings{DBLP:conf/nips/ConneauL19,
  author    = {Alexis Conneau and
               Guillaume Lample},
@@ -9388,7 +9282,6 @@ author    = {Zhuang Liu and
  publisher={International Conference on Computational Linguistics},
  year={2020}
 }
-
 @inproceedings{2018When,
  title={When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?},
  author={ Qi, Ye  and  Sachan, Devendra Singh  and  Felix, Matthieu  and  Padmanabhan, Sarguna Janani  and  Neubig, Graham },
@@ -9404,16 +9297,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/emnlp/ImamuraS19,
-  author    = {Kenji Imamura and
-               Eiichiro Sumita},
-  title     = {Recycling a Pre-trained {BERT} Encoder for Neural Machine Translation},
-  booktitle = {Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP
-               2019, Hong Kong, November 4, 2019},
-  pages     = {23--31},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/aaai/YangW0Z00020,
  author    = {Jiacheng Yang and
               Mingxuan Wang and
@@ -9538,7 +9421,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-
 @article{DBLP:journals/corr/abs-1811-01124,
  author    = {Jean Alaux and
               Edouard Grave and
@@ -9549,16 +9431,6 @@ author    = {Zhuang Liu and
  volume    = {abs/1811.01124},
  year      = {2018}
 }
-@inproceedings{DBLP:conf/emnlp/XuYOW18,
-  author    = {Ruochen Xu and
-               Yiming Yang and
-               Naoki Otani and
-               Yuexin Wu},
-  title     = {Unsupervised Cross-lingual Transfer of Word Embedding Spaces},
-  pages     = {2465--2474},
-  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
 @inproceedings{DBLP:conf/emnlp/DouZH18,
  author    = {Zi-Yi Dou and
               Zhi-Hao Zhou and
@@ -9595,18 +9467,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2018}
 }
-@inproceedings{DBLP:conf/emnlp/JoulinBMJG18,
-  author    = {Armand Joulin and
-               Piotr Bojanowski and
-               Tomas Mikolov and
-               Herv{\'{e}} J{\'{e}}gou and
-               Edouard Grave},
-  title     = {Loss in Translation: Learning Bilingual Word Mapping with a Retrieval
-               Criterion},
-  pages     = {2979--2984},
-  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
 @inproceedings{DBLP:conf/emnlp/ChenC18,
  author    = {Xilun Chen and
               Claire Cardie},
@@ -9615,15 +9475,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2018}
 }
-@inproceedings{DBLP:conf/naacl/MohiuddinJ19,
-  author    = {Tasnim Mohiuddin and
-               Shafiq R. Joty},
-  title     = {Revisiting Adversarial Autoencoder for Unsupervised Word Translation
-               with Cycle Consistency and Improved Training},
-  pages     = {3857--3867},
-  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/emnlp/TaitelbaumCG19,
  author    = {Hagai Taitelbaum and
               Gal Chechik and
@@ -9675,7 +9526,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2020}
 }
-
 @article{hartmann2018empirical,
  title={Empirical observations on the instability of aligning word vector spaces with GANs},
  author={Hartmann, Mareike and Kementchedjhieva, Yova and S{\o}gaard, Anders},
@@ -9699,7 +9549,6 @@ author    = {Zhuang Liu and
  pages     = {6031--6041},
  year      = {2019}
 }
-
 @inproceedings{DBLP:conf/emnlp/HartmannKS18,
  author    = {Mareike Hartmann and
               Yova Kementchedjhieva and
@@ -9710,17 +9559,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2018}
 }
-
-@inproceedings{DBLP:conf/emnlp/VulicGRK19,
-  author    = {Ivan Vulic and
-               Goran Glavas and
-               Roi Reichart and
-               Anna Korhonen},
-  title     = {Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?},
-  pages     = {4406--4417},
-  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/emnlp/JoulinBMJG18,
  author    = {Armand Joulin and
               Piotr Bojanowski and
@@ -9766,36 +9604,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2016}
 }
-@inproceedings{DBLP:conf/naacl/FiratCB16,
-  author    = {Orhan Firat and
-               Kyunghyun Cho and
-               Yoshua Bengio},
-  title     = {Multi-Way, Multilingual Neural Machine Translation with a Shared Attention
-               Mechanism},
-  pages     = {866--875},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2016}
-}
-@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
-  author    = {Melvin Johnson and
-               Mike Schuster and
-               Quoc V. Le and
-               Maxim Krikun and
-               Yonghui Wu and
-               Zhifeng Chen and
-               Nikhil Thorat and
-               Fernanda B. Vi{\'{e}}gas and
-               Martin Wattenberg and
-               Greg Corrado and
-               Macduff Hughes and
-               Jeffrey Dean},
-  title     = {Google's Multilingual Neural Machine Translation System: Enabling
-               Zero-Shot Translation},
-  journal   = {Trans. Assoc. Comput. Linguistics},
-  volume    = {5},
-  pages     = {339--351},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/emnlp/KimPPKN19,
  author    = {Yunsu Kim and
               Petre Petrov and
@@ -9877,16 +9685,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2007}
 }
-@article{DBLP:journals/mt/WuW07,
-  author    = {Hua Wu and
-               Haifeng Wang},
-  title     = {Pivot language approach for phrase-based statistical machine translation},
-  journal   = {Mach. Transl.},
-  volume    = {21},
-  number    = {3},
-  pages     = {165--181},
-  year      = {2007}
-}
 @inproceedings{DBLP:conf/acl/WuW09,
  author    = {Hua Wu and
               Haifeng Wang},
@@ -9987,17 +9785,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2015}
 }
-@article{DBLP:journals/tacl/LeeCH17,
-  author    = {Jason Lee and
-               Kyunghyun Cho and
-               Thomas Hofmann},
-  title     = {Fully Character-Level Neural Machine Translation without Explicit
-               Segmentation},
-  journal   = {Trans. Assoc. Comput. Linguistics},
-  volume    = {5},
-  pages     = {365--378},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/lrec/RiktersPK18,
  author    = {Matiss Rikters and
               Marcis Pinnis and
@@ -10017,26 +9804,6 @@ author    = {Zhuang Liu and
  pages     = {1345--1359},
  year      = {2010}
 }
-@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
-  author    = {Melvin Johnson and
-               Mike Schuster and
-               Quoc V. Le and
-               Maxim Krikun and
-               Yonghui Wu and
-               Zhifeng Chen and
-               Nikhil Thorat and
-               Fernanda B. Vi{\'{e}}gas and
-               Martin Wattenberg and
-               Greg Corrado and
-               Macduff Hughes and
-               Jeffrey Dean},
-  title     = {Google's Multilingual Neural Machine Translation System: Enabling
-               Zero-Shot Translation},
-  journal   = {Trans. Assoc. Comput. Linguistics},
-  volume    = {5},
-  pages     = {339--351},
-  year      = {2017}
-}
 @book{2009Handbook,
  title={Handbook Of Research On Machine Learning Applications and Trends: Algorithms, Methods and Techniques - 2 Volumes},
  author={ Olivas, Emilio Soria  and  Guerrero, Jose David Martin  and  Sober, Marcelino Martinez  and  Benedito, Jose Rafael Magdalena  and  Lopez, Antonio Jose Serrano },
@@ -10122,35 +9889,6 @@ author    = {Zhuang Liu and
  pages={1--38},
  year={2020}
 }
-@inproceedings{DBLP:conf/emnlp/VulicGRK19,
-  author    = {Ivan Vulic and
-               Goran Glavas and
-               Roi Reichart and
-               Anna Korhonen},
-  title     = {Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?},
-  pages     = {4406--4417},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
-@article{DBLP:journals/corr/MikolovLS13,
-  author    = {Tomas Mikolov and
-               Quoc V. Le and
-               Ilya Sutskever},
-  title     = {Exploiting Similarities among Languages for Machine Translation},
-  journal   = {CoRR},
-  volume    = {abs/1309.4168},
-  year      = {2013}
-}
-@article{DBLP:journals/corr/MikolovLS13,
-  author    = {Tomas Mikolov and
-               Quoc V. Le and
-               Ilya Sutskever},
-  title     = {Exploiting Similarities among Languages for Machine Translation},
-  journal   = {CoRR},
-  volume    = {abs/1309.4168},
-  year      = {2013}
-}
-
 @inproceedings{DBLP:conf/emnlp/XuYOW18,
  author    = {Ruochen Xu and
               Yiming Yang and
@@ -10161,17 +9899,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2018}
 }
-@inproceedings{DBLP:conf/iclr/LampleCRDJ18,
-  author    = {Guillaume Lample and
-               Alexis Conneau and
-               Marc'Aurelio Ranzato and
-               Ludovic Denoyer and
-               Herv{\'{e}} J{\'{e}}gou},
-  title     = {Word translation without parallel data},
-  publisher = {International Conference on Learning Representations},
-  year      = {2018}
-}
-
 @inproceedings{DBLP:conf/emnlp/ZhangLLS17,
  author    = {Meng Zhang and
               Yang Liu and
@@ -10183,17 +9910,6 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2017}
 }
-@inproceedings{DBLP:conf/naacl/MohiuddinJ19,
-  author    = {Tasnim Mohiuddin and
-               Shafiq R. Joty},
-  title     = {Revisiting Adversarial Autoencoder for Unsupervised Word Translation
-               with Cycle Consistency and Improved Training},
-  pages     = {3857--3867},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
-
-
 @inproceedings{DBLP:conf/emnlp/ArtetxeLA18,
  author    = {Mikel Artetxe and
               Gorka Labaka and
@@ -10203,7 +9919,6 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2018}
 }
-
 @article{DBLP:journals/tacl/LeeCH17,
  author    = {Jason Lee and
               Kyunghyun Cho and
@@ -10231,29 +9946,9 @@ author    = {Zhuang Liu and
               Alexander H. Waibel},
  title     = {Toward Multilingual Neural Machine Translation with Universal Encoder
               and Decoder},
-  journal   = {CoRR},
-  volume    = {abs/1611.04798},
-  year      = {2016}
-}
-@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
-  author    = {Melvin Johnson and
-               Mike Schuster and
-               Quoc V. Le and
-               Maxim Krikun and
-               Yonghui Wu and
-               Zhifeng Chen and
-               Nikhil Thorat and
-               Fernanda B. Vi{\'{e}}gas and
-               Martin Wattenberg and
-               Greg Corrado and
-               Macduff Hughes and
-               Jeffrey Dean},
-  title     = {Google's Multilingual Neural Machine Translation System: Enabling
-               Zero-Shot Translation},
-  journal   = {Transactions of the Association for Computational Linguistics},
-  volume    = {5},
-  pages     = {339--351},
-  year      = {2017}
+  journal   = {CoRR},
+  volume    = {abs/1611.04798},
+  year      = {2016}
 }
 @inproceedings{DBLP:conf/coling/BlackwoodBW18,
  author    = {Graeme W. Blackwood and
@@ -10318,13 +10013,6 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2019}
 }
-
-@inproceedings{2019Consistency,
-  title={Consistency by Agreement in Zero-Shot Neural Machine Translation},
-  author={Al-Shedivat, Maruan  and  Parikh, Ankur },
-  publisher={Proceedings of the 2019 Conference of the North},
-  year={2019},
-}
 @article{DBLP:journals/corr/abs-1903-07091,
  author    = {Naveen Arivazhagan and
               Ankur Bapna and
@@ -10421,15 +10109,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2009}
 }
-@inproceedings{DBLP:conf/eacl/LapataSM17,
-  author    = {Jonathan Mallinson and
-               Rico Sennrich and
-               Mirella Lapata},
-  title     = {Paraphrasing Revisited with Neural Machine Translation},
-  pages     = {881--893},
-  publisher = {European Association of Computational Linguistics},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/aclnmt/ImamuraFS18,
  author    = {Kenji Imamura and
               Atsushi Fujita and
@@ -10451,21 +10130,6 @@ author    = {Zhuang Liu and
  pages     = {1096--1103},
  publisher = {International Conference on Machine Learning}
 }
-@article{DBLP:journals/ipm/FarhanTAJATT20,
-  author    = {Wael Farhan and
-               Bashar Talafha and
-               Analle Abuammar and
-               Ruba Jaikat and
-               Mahmoud Al-Ayyoub and
-               Ahmad Bisher Tarakji and
-               Anas Toma},
-  title     = {Unsupervised dialectal neural machine translation},
-  journal   = {Inform Process Manag},
-  volume    = {57},
-  number    = {3},
-  pages     = {102181},
-  year      = {2020}
-}
 @inproceedings{DBLP:conf/iclr/LampleCDR18,
  author    = {Guillaume Lample and
               Alexis Conneau and
@@ -10521,13 +10185,6 @@ author    = {Zhuang Liu and
  publisher = {European Association of Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{yasuda2008method,
-  title={Method for building sentence-aligned corpus from wikipedia},
-  author={Yasuda, Keiji and Sumita, Eiichiro},
-  publisher={2008 AAAI Workshop on Wikipedia and Artificial Intelligence},
-  pages={263--268},
-  year={2008}
-}
 @article{2005Improving,
  title={Improving Machine Translation Performance by Exploiting Non-Parallel Corpora},
  author={ Munteanu, Ds  and  Marcu, D },
@@ -10698,54 +10355,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:conf/naacl/PetersNIGCLZ18,
-  author    = {Matthew E. Peters and
-               Mark Neumann and
-               Mohit Iyyer and
-               Matt Gardner and
-               Christopher Clark and
-               Kenton Lee and
-               Luke Zettlemoyer},
-  title     = {Deep Contextualized Word Representations},
-  pages     = {2227--2237},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
-@inproceedings{DBLP:conf/naacl/PetersNIGCLZ18,
-  author    = {Matthew E. Peters and
-               Mark Neumann and
-               Mohit Iyyer and
-               Matt Gardner and
-               Christopher Clark and
-               Kenton Lee and
-               Luke Zettlemoyer},
-  title     = {Deep Contextualized Word Representations},
-  pages     = {2227--2237},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2018}
-}
-@inproceedings{DBLP:conf/naacl/PetersNIGCLZ18,
-  author    = {Matthew E. Peters and
-               Mark Neumann and
-               Mohit Iyyer and
-               Matt Gardner and
-               Christopher Clark and
-               Kenton Lee and
-               Luke Zettlemoyer},
-  title     = {Deep Contextualized Word Representations},
-  pages     = {2227--2237},
-  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
-  year      = {2018}
-}
-@inproceedings{DBLP:conf/emnlp/ClinchantJN19,
-  author    = {St{\'{e}}phane Clinchant and
-               Kweon Woo Jung and
-               Vassilina Nikoulina},
-  title     = {On the use of {BERT} for Neural Machine Translation},
-  pages     = {108--117},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/emnlp/ImamuraS19,
  author    = {Kenji Imamura and
               Eiichiro Sumita},
@@ -10773,34 +10382,6 @@ author    = {Zhuang Liu and
  volume    = {abs/1908.06259},
  year      = {2019}
 }
-@inproceedings{DBLP:conf/aaai/YangW0Z00020,
-  author    = {Jiacheng Yang and
-               Mingxuan Wang and
-               Hao Zhou and
-               Chengqi Zhao and
-               Weinan Zhang and
-               Yong Yu and
-               Lei Li},
-  title     = {Towards Making the Most of {BERT} in Neural Machine Translation},
-  pages     = {9378--9385},
-  publisher = {AAAI Conference on Artificial Intelligence},
-  year      = {2020}
-}
-@inproceedings{DBLP:conf/acl/LewisLGGMLSZ20,
-  author    = {Mike Lewis and
-               Yinhan Liu and
-               Naman Goyal and
-               Marjan Ghazvininejad and
-               Abdelrahman Mohamed and
-               Omer Levy and
-               Veselin Stoyanov and
-               Luke Zettlemoyer},
-  title     = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
-               Generation, Translation, and Comprehension},
-  pages     = {7871--7880},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2020}
-}
 @inproceedings{DBLP:conf/emnlp/QiYGLDCZ020,
  author    = {Weizhen Qi and
               Yu Yan and
@@ -10941,13 +10522,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2013}
 }
-@article{joty2015using,
-  title={Using joint models for domain adaptation in statistical machine translation},
-  author={Joty, Nadir Durrani Hassan Sajjad Shafiq and Vogel, Ahmed Abdelali Stephan},
-  journal={Proceedings of MT Summit XV},
-  pages={117},
-  year={2015}
-}
 @article{imamura2016multi,
  title={Multi-domain adaptation for statistical machine translation based on feature augmentation},
  author={Imamura, Kenji and Sumita, Eiichiro},
@@ -11025,17 +10599,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2010}
 }
-@inproceedings{DBLP:conf/acl/DuhNST13,
-  author    = {Kevin Duh and
-               Graham Neubig and
-               Katsuhito Sudoh and
-               Hajime Tsukada},
-  title     = {Adaptation Data Selection using Neural Language Models: Experiments
-               in Machine Translation},
-  pages     = {678--683},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2013}
-}
 @inproceedings{DBLP:conf/coling/HoangS14,
  author    = {Cuong Hoang and
               Khalil Sima'an},
@@ -11110,33 +10673,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2012}
 }
-@inproceedings{DBLP:conf/wmt/FosterK07,
-  author    = {George F. Foster and
-               Roland Kuhn},
-  title     = {Mixture-Model Adaptation for {SMT}},
-  pages     = {128--135},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2007}
-}
-@inproceedings{DBLP:conf/emnlp/MatsoukasRZ09,
-  author    = {Spyros Matsoukas and
-               Antti-Veikko I. Rosti and
-               Bing Zhang},
-  title     = {Discriminative Corpus Weight Estimation for Machine Translation},
-  pages     = {708--717},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2009}
-}
-@inproceedings{DBLP:conf/emnlp/FosterGK10,
-  author    = {George F. Foster and
-               Cyril Goutte and
-               Roland Kuhn},
-  title     = {Discriminative Instance Weighting for Domain Adaptation in Statistical
-               Machine Translation},
-  pages     = {451--459},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2010}
-}
 @inproceedings{DBLP:conf/wmt/ShahBS10,
  author    = {Kashif Shah and
               Lo{\"{\i}}c Barrault and
@@ -11152,24 +10688,6 @@ author    = {Zhuang Liu and
  publisher={International Workshop on Spoken Language Translation},
  year={2011}
 }
-@inproceedings{DBLP:conf/lrec/EckVW04,
-  author    = {Matthias Eck and
-               Stephan Vogel and
-               Alex Waibel},
-  title     = {Language Model Adaptation for Statistical Machine Translation Based
-               on Information Retrieval},
-  publisher = {European Language Resources Association},
-  year      = {2004}
-}
-@inproceedings{DBLP:conf/coling/ZhaoEV04,
-  author    = {Bing Zhao and
-               Matthias Eck and
-               Stephan Vogel},
-  title     = {Language Model Adaptation for Statistical Machine Translation via
-               Structured Query Models},
-  publisher = {International Conference on Computational Linguistics},
-  year      = {2004}
-}
 @article{moore2010intelligent,
  title = {Intelligent selection of language model training data},
  author = {Moore, Robert C and Lewis, Will},
@@ -11311,12 +10829,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{2019Non,
-  title={Non-Parametric Adaptation for Neural Machine Translation},
-  author={Bapna, Ankur  and  Firat, Orhan },
-  booktitle={Conference of the North},
-  year={2019},
-}
 @inproceedings{britz2017effective,
  title={Effective domain mixing for neural machine translation},
  author={Britz, Denny and Le, Quoc and Pryzant, Reid},
@@ -11472,17 +10984,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2019}
 }
-@article{DBLP:journals/corr/abs-1906-03129,
-  author    = {Shen Yan and
-               Leonard Dahlmann and
-               Pavel Petrushkov and
-               Sanjika Hewavitharana and
-               Shahram Khadivi},
-  title     = {Word-based Domain Adaptation for Neural Machine Translation},
-  journal   = {CoRR},
-  volume    = {abs/1906.03129},
-  year      = {2019}
-}
 @inproceedings{DBLP:conf/emnlp/WeesBM17,
  author    = {Marlies van der Wees and
               Arianna Bisazza and
@@ -11514,15 +11015,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:conf/emnlp/DomhanH17,
-  author    = {Tobias Domhan and
-               Felix Hieber},
-  title     = {Using Target-side Monolingual Data for Neural Machine Translation
-               through Multi-task Learning},
-  pages     = {1500--1505},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2017}
-}
 @inproceedings{DBLP:conf/naacl/BapnaF19,
  author    = {Ankur Bapna and
               Orhan Firat},
@@ -11531,8 +11023,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2019}
 }
-
-
 @article{DBLP:journals/corr/abs-2010-11125,
  author    = {Angela Fan and
               Shruti Bhosale and
@@ -11570,7 +11060,6 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2020}
 }
-
 @inproceedings{DBLP:conf/emnlp/ZhuH07,
  author    = {Jingbo Zhu and
               Eduard H. Hovy},
@@ -11604,8 +11093,6 @@ author    = {Zhuang Liu and
  publisher = {AAAI Conference on Artificial Intelligence},
  year      = {2018}
 }
-
-
 @inproceedings{DBLP:conf/wmt/SunJXHWW19,
  author    = {Meng Sun and
               Bojian Jiang and
@@ -11618,8 +11105,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-
-
 @inproceedings{DBLP:conf/acl/SuHC19,
  author    = {Shang-Yu Su and
               Chao-Wei Huang and
@@ -11629,8 +11114,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-
-
 @article{DBLP:journals/ejasmp/RadzikowskiNWY19,
  author    = {Kacper Radzikowski and
               Robert Nowak and
@@ -11670,6 +11153,155 @@ author    = {Zhuang Liu and
  pages     = {170248--170260},
  year      = {2020}
 }
+@inproceedings{DBLP:conf/acl/MarieRF20,
+  author    = {Benjamin Marie and
+               Raphael Rubino and
+               Atsushi Fujita},
+  title     = {Tagged Back-translation Revisited: Why Does It Really Work?},
+  pages     = {5990--5997},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2020}
+}
+@inproceedings{DBLP:conf/nips/YangDYCSL19,
+  author    = {Zhilin Yang and
+               Zihang Dai and
+               Yiming Yang and
+               Jaime G. Carbonell and
+               Ruslan Salakhutdinov and
+               Quoc V. Le},
+  title     = {{XLNet:} Generalized Autoregressive Pretraining for Language Understanding},
+  pages     = {5754--5764},
+  year      = {2019}
+}
+@article{lewis2019bart,
+  title={{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
+  author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Ves and Zettlemoyer, Luke},
+  journal={arXiv preprint arXiv:1910.13461},
+  year={2019}
+}
+@inproceedings{DBLP:conf/iclr/LanCGGSS20,
+  author    = {Zhenzhong Lan and
+               Mingda Chen and
+               Sebastian Goodman and
+               Kevin Gimpel and
+               Piyush Sharma and
+               Radu Soricut},
+  title     = {{ALBERT:} {A} Lite {BERT} for Self-supervised Learning of Language
+               Representations},
+  publisher = {International Conference on Learning Representations},
+  year      = {2020}
+}
+@inproceedings{DBLP:conf/acl/ZhangHLJSL19,
+  author    = {Zhengyan Zhang and
+               Xu Han and
+               Zhiyuan Liu and
+               Xin Jiang and
+               Maosong Sun and
+               Qun Liu},
+  title     = {{ERNIE:} Enhanced Language Representation with Informative Entities},
+  pages     = {1441--1451},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2019}
+}
+@inproceedings{DBLP:conf/emnlp/HuangLDGSJZ19,
+  author    = {Haoyang Huang and
+               Yaobo Liang and
+               Nan Duan and
+               Ming Gong and
+               Linjun Shou and
+               Daxin Jiang and
+               Ming Zhou},
+  title     = {Unicoder: {A} Universal Language Encoder by Pre-training with Multiple
+               Cross-lingual Tasks},
+  pages     = {2485--2494},
+  publisher = {Conference on Empirical Methods in Natural Language Processing},
+  year      = {2019}
+}
+@inproceedings{DBLP:conf/iccv/SunMV0S19,
+  author    = {Chen Sun and
+               Austin Myers and
+               Carl Vondrick and
+               Kevin Murphy and
+               Cordelia Schmid},
+  title     = {VideoBERT: {A} Joint Model for Video and Language Representation Learning},
+  pages     = {7463--7472},
+  publisher = {International Conference on Computer Vision},
+  year      = {2019}
+}
+@article{DBLP:journals/corr/abs-2010-12831,
+  author    = {Liunian Harold Li and
+               Haoxuan You and
+               Zhecan Wang and
+               Alireza Zareian and
+               Shih-Fu Chang and
+               Kai-Wei Chang},
+  title     = {Weakly-supervised VisualBERT: Pre-training without Parallel Images
+               and Captions},
+  journal   = {CoRR},
+  volume    = {abs/2010.12831},
+  year      = {2020}
+}
+@inproceedings{DBLP:conf/nips/LuBPL19,
+  author    = {Jiasen Lu and
+               Dhruv Batra and
+               Devi Parikh and
+               Stefan Lee},
+  title     = {ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations
+               for Vision-and-Language Tasks},
+  publisher = {Annual Conference and Workshop on Neural Information Processing Systems},
+  pages     = {13--23},
+  year      = {2019}
+}
+@inproceedings{DBLP:conf/interspeech/ChuangLLL20,
+  author    = {Yung-Sung Chuang and
+               Chi-Liang Liu and
+               Hung-yi Lee and
+               Lin-Shan Lee},
+  title     = {SpeechBERT: An Audio-and-Text Jointly Learned Language Model for End-to-End
+               Spoken Question Answering},
+  pages     = {4168--4172},
+  publisher = {Annual Conference of the International Speech Communication Association},
+  year      = {2020}
+}
+@inproceedings{DBLP:conf/rep4nlp/PetersRS19,
+  author    = {Matthew E. Peters and
+               Sebastian Ruder and
+               Noah A. Smith},
+  title     = {To Tune or Not to Tune? Adapting Pretrained Representations to Diverse
+               Tasks},
+  pages     = {7--14},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2019}
+}
+@inproceedings{DBLP:conf/cncl/SunQXH19,
+  author    = {Chi Sun and
+               Xipeng Qiu and
+               Yige Xu and
+               Xuanjing Huang},
+  title     = {How to Fine-Tune {BERT} for Text Classification?},
+  volume    = {11856},
+  pages     = {194--206},
+  publisher = {Springer},
+  year      = {2019}
+}
+@inproceedings{shen2020q,
+  title={{Q-BERT:} Hessian Based Ultra Low Precision Quantization of {BERT}},
+  author={Shen, Sheng and Dong, Zhen and Ye, Jiayu and Ma, Linjian and Yao, Zhewei and Gholami, Amir and Mahoney, Michael W and Keutzer, Kurt},
+  booktitle={AAAI Conference on Artificial Intelligence},
+  pages={8815--8821},
+  year={2020}
+}
+@article{DBLP:journals/corr/abs-1910-01108,
+  author    = {Victor Sanh and
+               Lysandre Debut and
+               Julien Chaumond and
+               Thomas Wolf},
+  title     = {DistilBERT, a distilled version of {BERT:} smaller, faster, cheaper
+               and lighter},
+  journal   = {CoRR},
+  volume    = {abs/1910.01108},
+  year      = {2019}
+}
 %%%%% chapter 16------------------------------------------------------
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%