Merge branch 'caorunzhe' into 'master'

Caorunzhe See merge request !810

Commit b2f1d7cc authored Jan 07, 2021 by 曹润柘
--- a/Chapter16/Figures/figure-data-based-domain-adaptation-approach.tex
+++ b/Chapter16/Figures/figure-data-based-domain-adaptation-approach.tex
@@ -110,13 +110,13 @@
 \node [rectangle,rounded corners=1pt,fill=cyan!10] [fit = (w4-3) (new_-3)] (box2) {};
 \end{pgfonlayer}

-\node[word,draw=orange!50,dotted,very thick,inner sep=2.5pt] (realdata-3) at ([xshift=-4.5em,yshift=-2em]box1.south) {真实数据};
-\node[word,draw=cyan!50,dotted,very thick,inner sep=2.5pt] (fake-3) at ([xshift=1em,yshift=-2em]box2.south) {伪数据};
-\node[word,draw,dotted,very thick,inner sep=2.5pt] (monodata-3) at ([xshift=-0.5em,yshift=2em]monolingual-3.north) {单语数据};
+\node[word,draw=orange!50,dotted,very thick,inner sep=2.5pt] (realdata-3) at ([xshift=-3.5em,yshift=-2em]box1.south) {真实数据};
+\node[word,draw=cyan!50,dotted,very thick,inner sep=2.5pt] (fake-3) at ([xshift=0em,yshift=-2em]box2.south) {伪数据};
+\node[word,draw,dotted,very thick,inner sep=2.5pt] (monodata-3) at ([xshift=0em,yshift=2em]monolingual-3.north) {单语数据};

-\draw[->,dotted,very thick] ([yshift=0.0em]monolingual-3.north)-- ([yshift=-0.2em,xshift=0.45em]monodata-3.south);
-\draw[->,dotted,very thick,cyan] (box2.south) -- ([xshift=-1em,yshift=0.2em]fake-3.north);
-\draw[->,dotted,very thick,orange] ([xshift=-3.5em]box1.south) -- ([xshift=1em,yshift=0.2em]realdata-3.north);
+\draw[->,dotted,very thick] ([yshift=0.0em]monolingual-3.north)-- ([yshift=-0.2em,xshift=0.0em]monodata-3.south);
+\draw[->,dotted,very thick,cyan] (box2.south) -- ([xshift=-0em,yshift=0.2em]fake-3.north);
+\draw[->,dotted,very thick,orange] ([xshift=-3.5em]box1.south) -- ([xshift=0em,yshift=0.2em]realdata-3.north);

 \end{scope}
 \end{tikzpicture}
--- a/Chapter16/Figures/figure-multitask-learning-in-machine-translation-1.tex
+++ b/Chapter16/Figures/figure-multitask-learning-in-machine-translation-1.tex
@@ -6,32 +6,33 @@



-\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
+\node [anchor=center] (node1-1) at (0,0) {\small{$y$}};
 \node[anchor=north,rec,fill=blue!20](node1-2) at ([yshift=-2.0em]node1-1.south) {\small{解码器}};
 \node[anchor=north,rec,fill=red!20](node1-3) at ([yshift=-2em]node1-2.south) {\small{编码器}};
-\node[anchor=east](node1-5) at ([xshift=-2em]node1-2.west) {\small{$y$}};
+\node[anchor=east](node1-5) at ([xshift=-2em]node1-2.west) {\small{$y_{<}$}};
 \node[anchor=north](node1-4) at ([yshift=-2em]node1-3.south) {\small{$x$}};
 \draw [->,thick](node1-4.north)--(node1-3.south);
 \draw [->,thick](node1-5.east)--(node1-2.west);
 \draw [->,thick](node1-3.north)--(node1-2.south);
 \draw [->,thick](node1-2.north)--(node1-1.south);

-\node [anchor=center] (node2-1) at ([xshift=12.0em]node1-1.east) {\small{$y'$}};
+\node [anchor=center] (node2-1) at ([xshift=12.0em]node1-1.east) {\small{$y$}};
 \node[anchor=north,rec,fill=blue!20](node2-2) at ([yshift=-2.0em]node2-1.south) {\small{解码器}};
 \node[anchor=north,rec,fill=red!20](node2-3) at ([yshift=-2em]node2-2.south) {\small{编码器}};
-\node[anchor=east](node2-5) at ([xshift=-2em]node2-2.west) {\small{$y$}};
+\node[anchor=east](node2-5) at ([xshift=-2em]node2-2.west) {\small{$y_{<}$}};
 \node[anchor=north](node2-4) at ([yshift=-2em]node2-3.south) {\small{$x$}};
 \node[anchor=west,rec,fill=yellow!20](node2-6) at ([xshift=3.0em]node2-3.east) {\small{解码器}};
-\node[anchor=south](node2-7) at ([yshift=2em]node2-6.north) {\small{$x'$}};
+\node[anchor=south](node2-7) at ([yshift=2em]node2-6.north) {\small{$\hat{x}$}};

 \draw [->,thick](node2-4.north)--(node2-3.south);
 \draw [->,thick](node2-5.east)--(node2-2.west);
-\draw [->,thick](node2-3.north)--(node2-2.south)node[pos=0.5,left,font=\scriptsize]{翻译};
-\draw [->,thick](node2-2.north)--(node2-1.south);
-\draw [->,thick](node2-3.east)--(node2-6.west)node[pos=0.5,above,font=\scriptsize]{重排序};
-\draw [->,thick](node2-6.north)--(node2-7.south);
+\draw [->,thick](node2-3.north)--(node2-2.south);
+\draw [->,thick](node2-2.north)--(node2-1.south)node[pos=0.5,left,font=\scriptsize]{翻译};
+\draw [->,thick](node2-3.east)--(node2-6.west);
+\draw [->,thick](node2-6.north)--(node2-7.south)node[pos=0.5,left,font=\scriptsize]{调整语序};

 \node [anchor=east] (node1) at ([xshift=-2.0em]node1-1.west) {\small{$x,y$：双语数据}};
+\node [anchor=south] (node2) at ([xshift=1.96em]node1.north) {\small{$y_{<}$：目标语言文本数据}};

 \node [anchor=north](pos1) at ([yshift=0em]node1-4.south) {\small{(a)单任务学习}};
 \node [anchor=west](pos2) at ([xshift=10.0em]pos1.east) {\small{(b)多任务学习}};

--- a/Chapter16/Figures/figure-multitask-learning-in-machine-translation-2.tex
+++ b/Chapter16/Figures/figure-multitask-learning-in-machine-translation-2.tex
 \begin{tikzpicture}
 \begin{scope}
-\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
-\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{Softmax}};
+\node [anchor=center] (node1-1) at (0,0) {\small{$y$}};

-\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{解码器}};
+\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node1-3) at ([yshift=-2.0em]node1-1.south) {\small{解码器}};
 \node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{语言模型}};

+\node [anchor=west] (node3-1) at ([xshift=4.0em]node3-3.east) {\small{$z$}};

-\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{Softmax}};
-\node [anchor=north] (node3-1) at ([yshift=3.0em]node3-2.north) {\small{$z'$}};

-
-\node[anchor=north](node3-41) at ([xshift=-0.6em,yshift=-2em]node3-3.south) {\small{$y$}};
-\node[anchor=north](node3-42) at ([xshift=0.6em,yshift=-2em]node3-3.south) {\small{$z$}};
+\node[anchor=north](node3-41) at ([yshift=-2em]node3-3.south) {\small{$y_{<}+z_{<}$}};

 \node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{编码器}};
 \node[anchor=north](node2-2) at ([yshift=-2em]node2-1.south) {\small{$x$}};



-\node [rectangle,rounded corners,draw=red,line width=0.2mm,densely dashed,inner sep=0.4em] [fit = (node3-2) (node3-3)] (inputshadow) {};
-\draw [->,thick](node1-3.north)--(node1-2);
-\draw [->,thick](node1-2.north)--(node1-1);
+\node [rectangle,rounded corners,draw=red,line width=0.2mm,densely dashed,inner sep=0.4em] [fit = (node3-1) (node3-3)] (inputshadow) {};
+\draw [->,thick](node1-3.north)--(node1-1)node[pos=0.5,left,font=\scriptsize]{Softmax};
 \draw [->,thick](node2-2.north)--(node2-1);
 \draw[->,thick](node2-1.east)--(node1-3.west);

-\draw [->,thick](node3-41.north)--([xshift=-0.6em]node3-3.south);
-\draw [->,thick](node3-42.north)--([xshift=0.6em]node3-3.south);
+\draw [->,thick](node3-41.north)--(node3-3.south);
 \draw [->,thick](node3-3.north)--(node1-3.south);
-\draw [->,thick](node3-2.north)--(node3-1);
-\draw[->,thick](node3-3.east)--(node3-2.west);
+\draw[->,thick](node3-3.east)--(node3-1.west)node[pos=0.5,above,font=\scriptsize]{Softmax};



-\node [anchor=east] (node2-1-1) at ([xshift=-12.0em,yshift=-4.25em]node1-1.west) {\small{$y'$}};
-\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{Softmax}};
-\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{解码器}};
+\node [anchor=east] (node2-1-1) at ([xshift=-12.0em,yshift=-4.25em]node1-1.west) {\small{$y$}};
+\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node2-1-3) at ([yshift=-2.0em]node2-1-1.south) {\small{解码器}};
 \node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{编码器}};
 \node[anchor=north](node2-2-2) at ([yshift=-2em]node2-2-1.south) {\small{$x$}};
-\node[anchor=north](node2-2-3) at ([yshift=-2em]node2-1-3.south) {\small{$y$}};
+\node[anchor=north](node2-2-3) at ([yshift=-2em]node2-1-3.south) {\small{$y_{<}$}};

-\draw [->,thick](node2-1-2.north)--(node2-1-1);
 \draw [->,thick](node2-2-2.north)--(node2-2-1);
 \draw[->,thick](node2-2-1.east)--(node2-1-3.west);
-\draw [->,thick](node2-1-3.north)--(node2-1-2.south);
+\draw [->,thick](node2-1-3.north)--(node2-1-1)node[pos=0.5,left,font=\scriptsize]{Softmax};
 \draw [->,thick](node2-2-3.north)--(node2-1-3);

-\node [anchor=east] (node1) at ([xshift=-2.0em,yshift=4em]node2-1-1.west) {\small{$x,y$：双语数据}};
+\node [anchor=east] (node1) at ([xshift=-2.0em,yshift=3em]node2-1-1.west) {\small{$x,y$：双语数据}};
+\node [anchor=south] (node3) at ([xshift=1.96em]node1.north) {\small{$y_{<}$：目标语言文本数据}};
 \node [anchor=north] (node2) at ([xshift=0.45em]node1.south) {\small{$z$}：单语数据};

 \node [anchor=north](pos1) at ([yshift=-3.5em]node3-3.south) {\small{(b)多任务学习}};

--- a/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
+++ b/Chapter16/Figures/figure-parameter-initialization-method-diagram.tex
@@ -5,35 +5,33 @@
 	\tikzstyle{node}=[rounded corners=4pt,draw,minimum height=3em,drop shadow,font=\footnotesize]

 \node[node,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder1) at (0,0){\small 编码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder2) at ([xshift=4em,yshift=0em]encoder1.east){\small 编码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!30,line width=0.6pt] (encoder3) at ([xshift=3em]encoder2.east){\small 编码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!30,line width=0.6pt] (encoder2) at ([xshift=7em,yshift=0em]encoder1.east){\small 编码器};
+
+
+\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder1) at ([yshift=-2em]encoder1.south){\small 解码器};
+\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!30,line width=0.6pt] (decoder2) at ([xshift=7em,yshift=0em]decoder1.east){\small 解码器};

-\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder1) at ([yshift=-3em]encoder1.south){\small 解码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
-\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!30,line width=0.6pt] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};

 \node[anchor=north,font=\scriptsize,fill=yellow!20,drop shadow,draw] (w1) at ([yshift=-1.6em]decoder1.south){知识 \ 就是 \ 力量 \ 。 \ <eos>};
-\node[anchor=north,font=\scriptsize,fill=green!20,drop shadow,draw] (w3) at ([yshift=-1.6em]decoder3.south){El conocimiento es poder . <eos>};
+\node[anchor=north,font=\scriptsize,fill=green!20,drop shadow,draw] (w3) at ([yshift=-1.6em]decoder2.south){El conocimiento es poder . <eos>};
 \node[anchor=south,font=\scriptsize,fill=orange!20,drop shadow,draw] (w2) at ([yshift=1.6em]encoder1.north){Knowledge \ is \ power \ . };
-\node[anchor=south,font=\scriptsize,fill=orange!20,drop shadow,draw] (w4) at ([yshift=1.6em]encoder3.north){Knowledge \ is \ power \ . };
+\node[anchor=south,font=\scriptsize,fill=orange!20,drop shadow,draw] (w4) at ([yshift=1.6em]encoder2.north){Knowledge \ is \ power \ . };


 \draw[->,thick] (decoder1.-90) -- (w1.north);
-\draw[->,thick] (decoder3.-90) -- (w3.north);
+\draw[->,thick] (decoder2.-90) -- (w3.north);
 \draw[->,thick] (w2.-90) -- (encoder1.90);
-\draw[->,thick] (w4.-90) -- (encoder3.90);
+\draw[->,thick] (w4.-90) -- (encoder2.90);
+
+\draw[->,thick](encoder1.south)--(decoder1.north);
+\draw[->,thick](encoder2.south)--(decoder2.north);

-\node [anchor=north,single arrow,minimum height=2.2em,fill=blue!20,rotate=-90] (arrow1) at ([yshift=-1.4em,xshift=0.4em]encoder1.south) {};
-\node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow2) at ([yshift=-1.4em,xshift=0.4em]encoder2.south) {};
-\node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow3) at ([yshift=-1.4em,xshift=0.4em]encoder3.south) {};

 \node[anchor=south,yshift=3.4em] at (encoder1.north){\small\bfnew{父模型}};
-\node[anchor=south,yshift=3.4em] at (encoder3.north){\small\bfnew{子模型}};
+\node[anchor=south,yshift=3.4em] at (encoder2.north){\small\bfnew{子模型}};

-\draw[->,dash pattern=on 3pt off 2pt,thick] ([yshift=0em]encoder1.0) -- node[above,font=\scriptsize]{参数复用}(encoder2.180);
-\draw[->,dash pattern=on 3pt off 2pt,thick] (encoder2.0) -- node[above,font=\scriptsize]{微调}(encoder3.180);
-\draw[->,dash pattern=on 3pt off 2pt,thick] ([yshift=0em]decoder1.0) -- node[above,font=\scriptsize]{参数复用}(decoder2.180);
-\draw[->,dash pattern=on 3pt off 2pt,thick] (decoder2.0) -- node[above,font=\scriptsize]{微调}(decoder3.180);
+\draw[->,dash pattern=on 3pt off 2pt,thick] ([yshift=0em]encoder1.0) -- node[above,font=\scriptsize]{参数复用\&微调}(encoder2.180);
+\draw[->,dash pattern=on 3pt off 2pt,thick] ([yshift=0em]decoder1.0) -- node[above,font=\scriptsize]{参数复用\&微调}(decoder2.180);


 \end{tikzpicture}

--- a/Chapter16/chapter16.tex
+++ b/Chapter16/chapter16.tex
@@ -235,7 +235,7 @@

 \parinterval 在训练一个神经网络的时候，如果过分地关注单个训练目标，可能使模型忽略掉其他可能有帮助的信息，这些信息可能来自于一些其他相关的任务\upcite{DBLP:journals/corr/Ruder17a}。通过联合多个独立但相关的任务共同学习，任务之间相互``促进''，就是多任务学习\upcite{DBLP:journals/corr/Ruder17a,DBLP:books/sp/98/Caruana98,liu2019multi}。多任务学习的常用做法是，针对多个相关的任务，共享模型的部分参数来学习不同任务之间相似的特征，并通过特定的模块来学习每个任务独立的特征（见\chapterfifteen）。常用的策略是对底层的模型参数进行共享，顶层的模型参数用于独立学习各个不同的任务。

-\parinterval 在神经机器翻译中，应用多任务学习的主要策略是将翻译任务作为主任务，同时设置一些仅使用单语数据的子任务，通过这些子任务来捕捉单语数据中的语言知识\upcite{DBLP:conf/emnlp/DomhanH17,DBLP:conf/emnlp/ZhangZ16,DBLP:journals/corr/LuongLSVK15}。一种多任务学习的方法是利用源语言单语数据，通过单个编码器对源语言数据进行建模，再分别使用两个解码器来学习源语言排序和翻译任务。源语言排序任务是指利用预排序规则对源语言句子中词的顺序进行调整\upcite{DBLP:conf/emnlp/WangCK07}，可以通过单语数据来构造训练数据，从而使编码器被训练得更加充分\upcite{DBLP:conf/emnlp/ZhangZ16}，如图\ref{fig:16-7}所示。
+\parinterval 在神经机器翻译中，应用多任务学习的主要策略是将翻译任务作为主任务，同时设置一些仅使用单语数据的子任务，通过这些子任务来捕捉单语数据中的语言知识\upcite{DBLP:conf/emnlp/DomhanH17,DBLP:conf/emnlp/ZhangZ16,DBLP:journals/corr/LuongLSVK15}。一种多任务学习的方法是利用源语言单语数据，通过单个编码器对源语言数据进行建模，再分别使用两个解码器来学习源语言排序和翻译任务。源语言排序任务是指利用预排序规则对源语言句子中词的顺序进行调整\upcite{DBLP:conf/emnlp/WangCK07}，可以通过单语数据来构造训练数据，从而使编码器被训练得更加充分\upcite{DBLP:conf/emnlp/ZhangZ16}，如图\ref{fig:16-7}所示，图中$y_{<}$表示当前时刻之前的译文，$x_{<}$表示源语言句子中词的顺序调整后的句子。
 %----------------------------------------------
 \begin{figure}[htp]
    \centering
@@ -245,7 +245,7 @@
 \end{figure}
 %----------------------------------------------

-\parinterval 虽然神经机器翻译模型可以看作一种语言生成模型，但生成过程中却依赖于源语言信息，因此无法直接利用目标语言单语数据进行多任务学习。针对这个问题，可以对原有翻译模型结构进行修改，在解码器底层增加一个语言模型子层，这个子层用于学习语言模型任务，与编码器端是完全独立的，如图\ref{fig:16-8}所示\upcite{DBLP:conf/emnlp/DomhanH17}。在训练过程中，分别将双语数据和单语数据送入翻译模型和语言模型进行计算，双语数据训练产生的梯度用于对整个模型进行参数更新，而单语数据产生的梯度只对语言模型子层进行参数更新。
+\parinterval 虽然神经机器翻译模型可以看作一种语言生成模型，但生成过程中却依赖于源语言信息，因此无法直接利用目标语言单语数据进行多任务学习。针对这个问题，可以对原有翻译模型结构进行修改，在解码器底层增加一个语言模型子层，这个子层用于学习语言模型任务，与编码器端是完全独立的，如图\ref{fig:16-8}所示\upcite{DBLP:conf/emnlp/DomhanH17}，图中$y_{<}$表示当前时刻之前的译文，$z_{<}$表示当前时刻之前的单语数据。在训练过程中，分别将双语数据和单语数据送入翻译模型和语言模型进行计算，双语数据训练产生的梯度用于对整个模型进行参数更新，而单语数据产生的梯度只对语言模型子层进行参数更新。

 %----------------------------------------------
 \begin{figure}[htp]

--- a/Chapter18/chapter18.tex
+++ b/Chapter18/chapter18.tex
@@ -148,7 +148,7 @@
 \parinterval 交互式机器翻译体现了一种用户的行为“干预”机器翻译结果的思想。实际上，在机器翻译出现错误时，人们总是希望用一种直接有效的方式“改变”译文，最短时间内达到改善翻译质量的目的。比如，如果机器翻译系统可以输出多个候选译文，用户可以在其中挑选最好的译文进行输出。也就是，人干预了译文候选的排序过程。另一个例子是{\small\bfnew{翻译记忆}}\index{翻译记忆}（Translation Memory\index{Translation Memory}）。翻译记忆记录了高质量的源语言-目标语言句对，有时也可以被看作是一种先验知识或“记忆”。因此，当进行机器翻译时，使用翻译记忆指导翻译过程也可以被看作是一种干预手段\upcite{DBLP:conf/acl/WangZS13,DBLP:conf/aaai/XiaHLS19}。


-\parinterval 虽然干预机器翻译系统的方式很多，最常用的还是对源语言特定片段翻译的干预，以期望最终句子的译文满足某些约束。这个问题也被称作{\small\bfnew{基于约束的翻译}}\index{基于约束的翻译} （Constraint-based Translation\index{Constraint-based Translation}）。比如，在翻译网页时，需要保持译文中的网页标签与源文一致。另一个典型例子是术语翻译。在实际应用中，经常会遇到公司名称、品牌名称、产品名称等专有名词和行业术语，以及不同含义的缩写，比如，对于“小牛翻译”这个专有名词，不同的机器翻译系统给出的结果不一样:“Maverick translation”、“Calf translation”、“The mavericks translation”…… 而它正确的翻译应该为“NiuTrans”。 对于这些类似的特殊词汇，机器翻译引擎很难翻译得准确。一方面，因为模型大多是在通用数据集上训练出来的，并不能保证数据集能涵盖所有的语言现象。另一方面，即使是这些术语在训练数据中出现，它们通常也是低频的，模型不容易捕捉它们的规律。为了保证翻译的准确性，对术语翻译进行干预是十分有必要的，这对领域适应等问题的求解也是非常有意义的。
+\parinterval 虽然干预机器翻译系统的方式很多，最常用的还是对源语言特定片段翻译的干预，以期望最终句子的译文满足某些约束。这个问题也被称作{\small\bfnew{基于约束的翻译}}\index{基于约束的翻译} （Constraint-based Translation\index{Constraint-based Translation}）。比如，在翻译网页时，需要保持译文中的网页标签与源文一致。另一个典型例子是术语翻译。在实际应用中，经常会遇到公司名称、品牌名称、产品名称等专有名词和行业术语，以及不同含义的缩写，比如，对于“小牛翻译”这个专有名词，不同的机器翻译系统给出的结果不一样:“Maverick translation”、“Calf translation”、“The mavericks translation”等等，而它正确的翻译应该为“NiuTrans”。 对于这些类似的特殊词汇，机器翻译引擎很难翻译得准确。一方面，因为模型大多是在通用数据集上训练出来的，并不能保证数据集能涵盖所有的语言现象。另一方面，即使是这些术语在训练数据中出现，它们通常也是低频的，模型不容易捕捉它们的规律。为了保证翻译的准确性，对术语翻译进行干预是十分有必要的，这对领域适应等问题的求解也是非常有意义的。

 \parinterval 就{\small\bfnew 词汇约束翻译}\index{词汇约束翻译}（Lexically Constrained Translation）\index{Lexically Constrained Translation}而言，在不干预的情况下让模型直接翻译出正确术语是很难的，因为术语的译文很可能是未登录词，因此必须人为提供额外的术语词典，那么我们的目标就是让模型的翻译输出遵守用户提供的术语约束。这个过程如图\ref{fig:18-3}所示。
 %----------------------------------------------