Commit 4c3a48d1 by 孟霞

Merge branch 'caorunzhe' into 'mengxia'

Caorunzhe

See merge request !554
parents 29861056 d4c2adbd
......@@ -376,7 +376,7 @@ NMT & 21.7 & 18.7 & -13.7 \\
% NEW SECTION 10.3
%----------------------------------------------------------------------------------------
\sectionnewpage
\section{基于循环神经网络的翻译模型}
\section{基于循环神经网络的翻译建模}
\parinterval 早期神经机器翻译的进展主要来自两个方面:1)使用循环神经网络对单词序列进行建模;2)注意力机制的使用。表\ref{tab:10-6}列出了2013-2015年间有代表性的部分研究工作。从这些工作的内容上看,当时的研究重点还是如何有效地使用循环神经网络进行翻译建模以及使用注意力机制捕捉双语单词序列间的对应关系。
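\parinterval 为了更直观地理解注意力机制是如何在循环神经网络翻译建模中为源语言各位置分配权重的,下面给出一个基于NumPy的Python示意片段,演示“打分—归一化—加权求和”的基本流程。需要说明的是,其中的函数名、变量名与维度设置均为笔者举例的假设,仅作示意,并非书中任何系统的真实实现。
\begin{verbatim}
import numpy as np

def additive_attention(dec_state, enc_states, W_q, W_k, v):
    """加性注意力:对解码器状态与每个编码器状态打分,再做softmax归一化。"""
    # dec_state: (d,)  enc_states: (m, d)
    scores = np.tanh(enc_states @ W_k + dec_state @ W_q) @ v   # 每个源语言位置一个分数,形状 (m,)
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()                          # 注意力权重,和为1
    context = weights @ enc_states                             # 上下文向量,形状 (d,)
    return weights, context

# 随机构造一个源语言长度为5、隐藏层维度为4的例子
rng = np.random.default_rng(0)
m, d = 5, 4
enc_states = rng.normal(size=(m, d))
dec_state = rng.normal(size=(d,))
W_q, W_k, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d,))
weights, context = additive_attention(dec_state, enc_states, W_q, W_k, v)
print(weights, context)
\end{verbatim}
在真实的神经机器翻译系统中,这些打分参数会与编码器、解码器的参数一起通过梯度下降学习得到。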
......
......@@ -231,7 +231,7 @@
% NEW SECTION
%----------------------------------------------------------------------------------------
\section{基于卷积神经网络的机器翻译模型}
\section{基于卷积神经网络的翻译建模}
\parinterval 正如之前所讲,卷积神经网络可以用于序列建模,同时具有并行性高和易于学习的特点,一个很自然的想法就是将其用作神经机器翻译模型中的特征提取器。因此,在神经机器翻译被提出之初,研究人员就已经开始利用卷积神经网络对句子进行特征提取。比较经典的模型是使用卷积神经网络作为源语言句子的编码器,使用循环神经网络作为目标语译文生成的解码器\upcite{kalchbrenner-blunsom-2013-recurrent,Gehring2017ACE}。之后也有研究人员提出完全基于卷积神经网络的翻译模型(ConvS2S)\upcite{DBLP:journals/corr/GehringAGYD17},或者针对卷积层进行改进,提出效率更高、性能更好的模型\upcite{Kaiser2018DepthwiseSC,Wu2019PayLA}。本节将基于ConvS2S模型,阐述如何使用卷积神经网络搭建端到端神经机器翻译模型。
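\parinterval 为了说明卷积神经网络是如何作为特征提取器对句子进行编码的,下面给出一个示意性的Python片段:它对词嵌入序列做一次窗口宽度为$k$的一维卷积,并配合ConvS2S中使用的门控线性单元(GLU)与残差连接。代码中的维度、参数取值等均为演示用的假设,并非ConvS2S的真实实现。
\begin{verbatim}
import numpy as np

def glu(x):
    """门控线性单元:特征沿最后一维对半切分,一半经sigmoid后作为门控。"""
    a, b = np.split(x, 2, axis=-1)
    return a * (1.0 / (1.0 + np.exp(-b)))

def conv_encoder_layer(X, W, b, k=3):
    """一层卷积编码:对每个位置取宽度为k的词窗口,线性变换后接GLU与残差连接。
    X: (m, d) 词嵌入序列;W: (k*d, 2*d);b: (2*d,)"""
    m, d = X.shape
    pad = k // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))
    H = np.stack([Xp[i:i + k].reshape(-1) @ W + b for i in range(m)])  # (m, 2d)
    return glu(H) + X                                                  # 输出仍为 (m, d)

rng = np.random.default_rng(0)
m, d, k = 6, 8, 3
X = rng.normal(size=(m, d))
W = rng.normal(size=(k * d, 2 * d)) * 0.1
b = np.zeros(2 * d)
print(conv_encoder_layer(X, W, b, k).shape)   # (6, 8)
\end{verbatim}
实际的ConvS2S模型会堆叠多层这样的卷积块,并在解码端配合注意力机制使用。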
......
......@@ -23,6 +23,10 @@
\chapter{神经机器翻译结构优化}
模型结构的设计是机器翻译系统研发中最重要的部分。在神经机器翻译中,虽然系统研发人员脱离了繁琐的特征工程,但是神经网络结构的设计仍然非常重要。无论是像循环神经网络、Transformer这样的整体架构的设计,还是注意力机制等局部结构的设计,都对机器翻译性能有着很大的影响。
本章主要讨论神经机器翻译中若干结构优化的方向,包括:注意力机制的改进、网络连接优化及深层网络建模、基于树结构的模型、神经网络结构自动搜索等。这些内容可以指导神经机器翻译系统的深入优化,其中涉及的一些模型和方法也可以应用于其他自然语言处理任务。
%----------------------------------------------------------------------------------------
% NEW SECTION
%----------------------------------------------------------------------------------------
......@@ -584,7 +588,7 @@ a = \funp{P}(\cdot|\mathbi{x};a)
\vspace{0.5em}
\item 设计搜索空间:理论上来说网络结构搜索应在所有潜在的模型结构所组成的空间中进行搜索(图\ref{fig:15-16})。在这种情况下如果不对候选模型结构进行限制的话,搜索空间会十分巨大。因此,在实际的结构搜索过程中往往会针对特定任务设计一个搜索空间,这个搜索空间是全体结构空间的一个子集,之后的搜索过程将在这个子空间中进行。如图\ref{fig:15-16}例子中的搜索空间所示,该空间由循环神经网络构成,其中候选的模型包括人工设计的LSTM、GRU等模型结构,也包括其他潜在的循环神经网络结构。
\vspace{0.5em}
\item 选择搜索策略:在设计好搜索空间之后,结构搜索的过程将选择一种合适的策略对搜索空间进行探索,找到最适用于当前任务的模型结构。不同于模型参数的学习,模型结构之间本身不存在直接可计算的关联,所以很难通过传统的最优化算法对其进行学习。因此,搜索策略往往选择采用遗传算法或强化学习等方法间接对模型结构进行设计或优化\upcite{DBLP:conf/icml/SoLL19,DBLP:conf/aaai/RealAHL19,DBLP:conf/icml/RealMSSSTLK17,DBLP:conf/iclr/ElskenMH19,DBLP:conf/iclr/ZophL17,DBLP:conf/cvpr/ZophVSL18,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/BakerGNR17,DBLP:conf/cvpr/TanCPVSHL19,DBLP:conf/iclr/LiuSVFK18}。不过近些年来也有研究人员开始尝试将模型结构建模为超网络中的参数,这样即可使用基于梯度的方式直接对最优结构进行搜索\upcite{DBLP:conf/nips/LuoTQCL18,DBLP:conf/iclr/LiuSY19,DBLP:conf/iclr/CaiZH19,DBLP:conf/cvpr/LiuCSAHY019,DBLP:conf/cvpr/WuDZWSWTVJK19,DBLP:conf/iclr/XieZLL19,DBLP:conf/uai/LiT19,DBLP:conf/cvpr/DongY19,DBLP:conf/iclr/XuX0CQ0X20,DBLP:conf/iclr/ZelaESMBH20,DBLP:conf/iclr/MeiLLJYYY20}
\item 选择搜索策略:在设计好搜索空间之后,结构搜索的过程将选择一种合适的策略对搜索空间进行探索,找到最适用于当前任务的模型结构。不同于模型参数的学习,模型结构之间本身不存在直接可计算的关联,所以很难通过传统的最优化算法对其进行学习。因此,搜索策略往往选择采用遗传算法或强化学习等方法间接对模型结构进行设计或优化\upcite{DBLP:conf/icml/SoLL19,DBLP:conf/aaai/RealAHL19,DBLP:conf/icml/RealMSSSTLK17,DBLP:conf/iclr/ElskenMH19,DBLP:conf/iclr/ZophL17,DBLP:conf/cvpr/ZophVSL18,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/BakerGNR17,DBLP:conf/cvpr/TanCPVSHL19,DBLP:conf/iclr/LiuSVFK18}。不过近些年来也有研究人员开始尝试将模型结构建模为超网络中的参数,这样即可使用基于梯度的方式直接对最优结构进行搜索\upcite{DBLP:conf/nips/LuoTQCL18,DBLP:conf/iclr/LiuSY19,DBLP:conf/iclr/CaiZH19,DBLP:conf/cvpr/LiuCSAHY019,DBLP:conf/cvpr/WuDZWSWTVJK19,DBLP:conf/iclr/XieZLL19,DBLP:conf/uai/LiT19,DBLP:conf/cvpr/DongY19,DBLP:conf/iclr/XuX0CQ0X20,DBLP:conf/iclr/ZelaESMBH20,DBLP:conf/iclr/MeiLLJYYY20}(搜索空间、搜索策略与性能评估这三个环节如何配合,可参考下文的示意代码)
\vspace{0.5em}
\item 进行性能评估:在搜索到模型结构之后需要对这种模型结构的性能进行验证,确定当前时刻找到的模型结构性能优劣。但是对于结构搜索任务来说,在搜索的过程中将产生大量中间模型结构,如果直接对所有可能的结构进行评价,其时间代价是难以接受的。因此在结构搜索任务中也有很多研究人员尝试如何快速获取模型性能(绝对性能或相对性能)\upcite{DBLP:conf/nips/LuoTQCL18,DBLP:journals/jmlr/LiJDRT17,DBLP:conf/eccv/LiuZNSHLFYHM18}
\vspace{0.5em}
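\parinterval 上面提到的搜索空间、搜索策略和性能评估三个环节可以用如下的Python示意代码串联起来。这里仅用随机采样代表搜索策略、用一个人为构造的打分函数代表性能评估,以便展示整体流程;其中的候选结构集合、函数名等均为举例假设,并非某个真实系统的接口。
\begin{verbatim}
import random

# 搜索空间:每个候选结构由若干离散选择构成(示例用的设定)
SEARCH_SPACE = {
    "cell":   ["lstm", "gru", "conv", "self-attention"],
    "layers": [2, 4, 6],
    "hidden": [256, 512, 1024],
}

def sample_architecture(space):
    """搜索策略的最简形式:随机采样;实际中可替换为进化算法、强化学习或基于梯度的方法。"""
    return {name: random.choice(options) for name, options in space.items()}

def estimate_performance(arch):
    """性能评估:这里用一个人为构造的打分函数代替真实的训练与校验,仅作占位。"""
    score = arch["layers"] * 0.5 + arch["hidden"] / 1024.0
    score += {"lstm": 0.2, "gru": 0.1, "conv": 0.3, "self-attention": 0.4}[arch["cell"]]
    return score

random.seed(0)
best_arch, best_score = None, float("-inf")
for _ in range(20):                      # 反复“采样—评估”,保留当前最优结构
    arch = sample_architecture(SEARCH_SPACE)
    score = estimate_performance(arch)
    if score > best_score:
        best_arch, best_score = arch, score
print(best_arch, best_score)
\end{verbatim}
在实际系统中,estimate\_performance对应的是对候选结构进行训练并在校验集上评估,其代价远高于这里的示意。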
......@@ -648,7 +652,7 @@ a = \funp{P}(\cdot|\mathbi{x};a)
\begin{itemize}
\vspace{0.5em}
\item 整体框架:如图\ref{fig:15-17}所示,不同任务下不同结构往往会表现出不同的建模能力,而类似的结构在结构空间中又相对集中,因此在搜索空间的设计中,整体框架部分一般根据不同任务特点选择已经得到验证的经验性结构,通过这种方式能够快速定位到更有潜力的搜索空间。如对于图像任务来说,一般会将卷积神经网络设计为候选搜索空间\upcite{DBLP:conf/iclr/ElskenMH19,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/LiuSY19,DBLP:conf/eccv/LiuZNSHLFYHM18,DBLP:conf/icml/CaiYZHY18},而对于包括机器翻译在内的自然语言处理任务而言,则会更倾向于使用循环神经网络或基于自注意力机制的Transformer模型附近的结构空间作为搜索空间\upcite{DBLP:conf/icml/SoLL19,DBLP:conf/iclr/ZophL17,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/LiuSY19,DBLP:journals/taslp/FanTXQLL20,DBLP:conf/ijcai/ChenLQWLDDHLZ20,DBLP:conf/acl/WangWLCZGH20}。此外,也可以拓展搜索空间以覆盖更多网络结构\upcite{DBLP:conf/acl/LiHZXJXZLL20}
\item 整体框架:如图\ref{fig:15-17}所示,不同任务下不同结构往往会表现出不同的建模能力,而类似的结构在结构空间中又相对集中,因此在搜索空间的设计中,整体框架部分一般根据不同任务特点选择已经得到验证的经验性结构,通过这种方式能够快速定位到更有潜力的搜索空间。如对于图像任务来说,一般会将卷积神经网络设计为候选搜索空间\upcite{DBLP:conf/iclr/ElskenMH19,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/LiuSY19,DBLP:conf/eccv/LiuZNSHLFYHM18,DBLP:conf/icml/CaiYZHY18},而对于包括机器翻译在内的自然语言处理任务而言,则会更倾向于使用循环神经网络或基于自注意力机制的Transformer模型附近的结构空间作为搜索空间\upcite{DBLP:conf/icml/SoLL19,DBLP:conf/iclr/ZophL17,DBLP:conf/icml/PhamGZLD18,DBLP:conf/iclr/LiuSY19,DBLP:journals/taslp/FanTXQLL20,DBLP:conf/ijcai/ChenLQWLDDHLZ20,DBLP:conf/acl/WangWLCZGH20}。此外,也可以拓展搜索空间以覆盖更多网络结构\upcite{DBLP:conf/acl/LiHZXJXZLL20}
\vspace{0.5em}
\item 内部结构:由于算力限制,网络结构搜索的任务通常使用经验性的架构作为模型的整体框架,之后通过对搜索到的内部结构进行堆叠得到完整的模型结构。内部结构的设计需要考虑搜索过程中的最小搜索单元以及搜索单元之间的连接方式。最小搜索单元指的是在结构搜索过程中可被选择的最小独立计算单元(或被称为搜索算子、操作)。在不同搜索空间的设计中,最小搜索单元的颗粒度各有不同:相对较小的搜索粒度主要包括诸如矩阵乘法、张量缩放等基本数学运算\upcite{DBLP:journals/corr/abs-2003-03384};中等粒度的搜索单元包括常见的激活函数,如ReLU、Tanh等\upcite{DBLP:conf/iclr/LiuSY19,DBLP:conf/acl/LiHZXJXZLL20,Chollet2017XceptionDL};也有研究人员倾向于选择较大颗粒度的局部结构作为搜索单元,如注意力机制、层标准化等人工设计的经验性结构\upcite{DBLP:conf/icml/SoLL19,DBLP:conf/nips/LuoTQCL18,DBLP:journals/taslp/FanTXQLL20}。不过,对于搜索颗粒度的问题,目前还缺乏有效的方法针对不同任务进行自动优化(不同颗粒度的候选单元如何组织成搜索空间,可参考下面的示意代码)。
\vspace{0.5em}
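\parinterval 下面用一个简短的Python示意片段展示“整体框架+不同颗粒度的最小搜索单元”是如何组织成一个搜索空间的。其中的框架名称、候选算子集合与连接方式均为举例假设,仅用于说明搜索空间的组织方式,并非某一具体系统的真实定义。
\begin{verbatim}
# 不同颗粒度的最小搜索单元(搜索算子)示例
FINE_GRAINED   = ["matmul", "add", "scale"]                 # 基本数学运算
MEDIUM_GRAINED = ["relu", "tanh", "sigmoid"]                # 常见激活函数
COARSE_GRAINED = ["self-attention", "ffn", "layer-norm"]    # 人工设计的经验性局部结构

def build_search_space(framework="transformer", granularity="coarse"):
    """整体框架决定骨架,内部结构由给定颗粒度的候选算子集合填充。"""
    units = {"fine": FINE_GRAINED,
             "medium": MEDIUM_GRAINED,
             "coarse": COARSE_GRAINED}[granularity]
    return {"framework": framework,                         # 经验性整体框架
            "candidate_units": units,                       # 可被搜索的最小单元
            "connections": ["residual", "dense", "none"]}   # 搜索单元之间的连接方式

print(build_search_space("transformer", "coarse"))
\end{verbatim}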
......@@ -666,7 +670,7 @@ a = \funp{P}(\cdot|\mathbi{x};a)
\begin{itemize}
\vspace{0.5em}
\item 进化算法{\red 检查这些词是不是第一次提到}:最初主要通过进化算法对神经网络中的模型结构以及权重参数进行优化\upcite{DBLP:conf/icga/MillerTH89,DBLP:journals/tnn/AngelineSP94,stanley2002evolving,DBLP:journals/alife/StanleyDG09}。而随着最优化算法的发展,近年来对于网络参数的学习更多地采用梯度下降法的方式,不过使用进化算法对模型结构进行优化却依旧被沿用至今\upcite{DBLP:conf/aaai/RealAHL19,DBLP:conf/icml/RealMSSSTLK17,DBLP:conf/iclr/ElskenMH19,DBLP:conf/ijcai/SuganumaSN18,Real2019AgingEF,DBLP:conf/iclr/LiuSVFK18,DBLP:conf/iccv/XieY17}。目前主流的方式主要是将模型结构看做是遗传算法中种群的个体,通过使用轮盘赌或锦标赛等抽取方式对种群中的结构进行取样作为亲本,之后通过亲本模型的突变产生新的模型结构,最终对这些新的模型结构进行适应度评估{\red (见XXX节)},根据模型结构在校验集上性能表现确定是否能够将其加入种群,整个过程如图\ref{fig:15-19}所示。对于进化算法中结构的突变主要指的是对模型中局部结构的改变,如增加跨层连接、替换局部操作等。
\item 进化算法{\red 检查这些词是不是第一次提到}:最初主要通过进化算法对神经网络中的模型结构以及权重参数进行优化\upcite{DBLP:conf/icga/MillerTH89,DBLP:journals/tnn/AngelineSP94,stanley2002evolving,DBLP:journals/alife/StanleyDG09}。而随着最优化算法的发展,近年来对于网络参数的学习更多地采用梯度下降法的方式,不过使用进化算法对模型结构进行优化却依旧被沿用至今\upcite{DBLP:conf/aaai/RealAHL19,DBLP:conf/icml/RealMSSSTLK17,DBLP:conf/iclr/ElskenMH19,DBLP:conf/ijcai/SuganumaSN18,Real2019AgingEF,DBLP:conf/iclr/LiuSVFK18,DBLP:conf/iccv/XieY17}。目前主流的方式主要是将模型结构看做是遗传算法中种群的个体,通过使用轮盘赌或锦标赛等抽取方式对种群中的结构进行取样作为亲本,之后通过亲本模型的突变产生新的模型结构,最终对这些新的模型结构进行适应度评估{\red (见XXX节)},根据模型结构在校验集上的性能表现确定是否能够将其加入种群,整个过程如图\ref{fig:15-19}所示。对于进化算法中结构的突变主要指的是对模型中局部结构的改变,如增加跨层连接、替换局部操作等(一个简化的流程示意代码见下文)。
%----------------------------------------------
\begin{figure}[htp]
......
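\parinterval 结合上面的描述,基于进化算法的结构搜索可以用如下的Python示意代码概括:维护一个由候选结构组成的种群,通过锦标赛选择亲本、对亲本进行突变产生新结构,并依据适应度决定其能否进入种群。这里的候选操作集合与适应度函数都是人为构造的占位假设,真实场景中适应度对应模型在校验集上的性能。
\begin{verbatim}
import random

CELLS = ["lstm", "gru", "conv", "self-attention"]

def fitness(arch):
    """适应度评估:真实场景中对应校验集上的性能,这里用人为打分代替。"""
    return len(set(arch)) + 0.1 * arch.count("self-attention")

def mutate(arch):
    """突变:随机替换结构中的一个局部操作(也可设计为增加跨层连接等)。"""
    child = list(arch)
    child[random.randrange(len(child))] = random.choice(CELLS)
    return child

random.seed(0)
population = [[random.choice(CELLS) for _ in range(4)] for _ in range(8)]  # 初始种群
for _ in range(30):
    parent = max(random.sample(population, 3), key=fitness)   # 锦标赛选择亲本
    child = mutate(parent)                                     # 突变产生新结构
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    if fitness(child) > fitness(population[worst]):            # 依据适应度决定是否加入种群
        population[worst] = child
print(max(population, key=fitness))
\end{verbatim}
这种“选择—突变—评估”的循环即对应图\ref{fig:15-19}所示的整体流程。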
......@@ -41,7 +41,7 @@
\node [anchor=west,fill=red!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-1) at ([xshift=2.0em,yshift=1.6em]node3-2.east){\scriptsize{英语}};
\node [anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-2) at (node4-1.south){\scriptsize{英语}};
\node [anchor=west,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-3) at (node4-1.east){\scriptsize{汉语}};
\node [anchor=west,fill=yellow!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-3) at (node4-1.east){\scriptsize{汉语}};
\node [anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-4) at (node4-3.south){\scriptsize{汉语}};
......
\begin{tikzpicture}
%%%%%%%%词典推断------------------------------------------------------------
\begin{scope}
\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
\draw [-,thick] (-0.7,1.0)--(-0.7,-1.0);
\node [anchor=center](c1) at (-0.1,0){\tiny{$\mathbi{Y}$}};
\node [anchor=center](c2) at (-0.3,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}};
\draw [-,red](0.65,-0.65)--(0.60,-0.62)--(0.66,-0.58)--(0.6,-0.55)--(0.63,-0.52)--(0.6,-0.5);
\draw [-,red](1.65,-0.65)--(1.60,-0.68)--(1.64,-0.72)--(1.56,-0.72)--(1.60,-0.76)--(1.55,-0.8);
\draw [-,red](1.5,0.1)--(1.53,0.08)--(1.49,0.04)--(1.58,0.03)--(1.54,-0.01)--(1.6,-0.05);
\end{scope}
%%%%%%%%X映射到Y空间------------------------------------------------------------
\begin{scope}[xshift=-8.0em]
\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
\draw [-,thick] (-0.7,1.0)--(-0.7,-1.0);
\node [anchor=center](c1) at (-0.1,0){\tiny{$\mathbi{Y}$}};
\node [anchor=center](c2) at (-0.3,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}};
%%%%%%一堆红色的球
\node [anchor=center,red!70](cr4) at (0.15,-0.6){\Large{$\cdot$}};
\node [anchor=center,red!70](cr5) at (0.3,-0.6){\Large{$\cdot$}};
\node [anchor=center,red!70](cr6) at (0.5,-0.55){\Large{$\cdot$}};
\node [anchor=center,red!70](cr7) at (0.35,-0.4){\Large{$\cdot$}};
\node [anchor=center,red!70](cr8) at (0.4,-0.7){\Large{$\cdot$}};
\node [anchor=center,red!70](cr8) at (0.55,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr9) at (0.9,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr10) at (0.9,-0.5){\Large{$\cdot$}};
\node [anchor=center,red!70](cr11) at (1.4,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr12) at (1.45,-0.3){\Large{$\cdot$}};
\node [anchor=center,red!70](cr13) at (1.35,0.3){\Large{$\cdot$}};
\node [anchor=center,red!70](cr14) at (1.2,0.4){\Large{$\cdot$}};
\node [anchor=center,red!70](cr15) at (1.6,0.45){\Large{$\cdot$}};
%%%%%%一堆蓝色的球
\node [anchor=center,ublue](cb4) at (0.1,-0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb5) at (0.3,-0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb6) at (0.5,-0.25){\Large{$\cdot$}};
\node [anchor=center,ublue](cb7) at (0.4,-0.1){\Large{$\cdot$}};
\node [anchor=center,ublue](cb8) at (0.35,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb9) at (0.45,-0.6){\Large{$\cdot$}};
\node [anchor=center,ublue](cb10) at (0.85,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb11) at (1.45,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb12) at (1.3,-0.85){\Large{$\cdot$}};
\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb15) at (1.6,0.2){\Large{$\cdot$}};
\end{scope}
%%%%%%%%X、Y词嵌入空间------------------------------------------------------------
\begin{scope}[xshift=-16em]
\draw [-,ublue,line width=0.5pt] (0,0)..controls (0.3,0.2) and (0.5,0)..(0.7,-0.2)..controls (0.8,-0.3) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.4) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.78)..controls (0.5,-0.74) and (0.4,-0.7)..(0.2,-0.5)..controls(0.1,-0.4) and (-0.15,-0.1)..(0,0) ;
\node [anchor=center](x1) at (-1.45,0.2){\tiny{$\mathbi{X}$}};
\node [anchor=center](y1) at (1.1,0.1){\tiny{$\mathbi{Y}$}};
\node [anchor=center,ublue](cb1) at (0.6,-0.5){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}};
%%%%%%一堆蓝色的球
\node [anchor=center,ublue](cb4) at (0.1,-0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb5) at (0.3,-0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb6) at (0.5,-0.25){\Large{$\cdot$}};
\node [anchor=center,ublue](cb7) at (0.4,-0.1){\Large{$\cdot$}};
\node [anchor=center,ublue](cb8) at (0.35,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb9) at (0.45,-0.6){\Large{$\cdot$}};
\node [anchor=center,ublue](cb10) at (0.85,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb11) at (1.45,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb12) at (1.3,-0.85){\Large{$\cdot$}};
\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb15) at (1.6,0.2){\Large{$\cdot$}};
\node [anchor=center](rw1) at (-0.5,0.45){\tiny{cat}};
\node [anchor=center](rw2) at (0.05,0.4){\tiny{feline}};
\node [anchor=center](rw3) at (-1.17,-0.07){\tiny{car}};
\node [anchor=center](rw4) at (-0.7,-0.65){\tiny{deep}};
\node [anchor=center](bw1) at (0.2,-0.1){\tiny{felin}};
\node [anchor=center](bw2) at (0.75,-0.65){\tiny{katze}};
\node [anchor=center](bw3) at (1.55,-0.65){\tiny{auto}};
\node [anchor=center](bw4) at (1.6,-0.2){\tiny{tief}};
\node [anchor=center](de1) at (0.3,-1.5) {\small{(a) $\mathbi{X}$和$\mathbi{Y}$词嵌入空间}};
\node [anchor=center](de2) at (3.9,-1.5) {\small{(b) $\mathbi{X}$映射到$\mathbi{Y}$空间}};
\node [anchor=center](de3) at (7,-1.5) {\small{(c) 词典推断}};
\node [anchor=center](de4) at (10.1,-1.5) {\small{(d) 微调结果}};
\end{scope}
\begin{scope}[xshift=-14.5em,yshift=0.8em,rotate=-150]
\draw [-,red!70,line width=0.5pt] (0.04,-0.5) .. controls (0,-0.4) and (0.4,-0.1)..(0.7,-0.3)..controls (0.9,-0.45) and (1.1,-0.4)..(1.2,-0.3)..controls (1.3,-0.2) and (1.2,0.1).. (1.0,0.3)..controls (0.8,0.5) and (1.0,0.6)..(1.2,0.67)..controls (1.5,0.78) and (1.8,0.5)..(1.9,0.2)..controls(2.1,-0.3) and (2,-0.5)..(1.8,-0.75)..controls (1.5,-1.1) and (1.2,-1.0)..(0.4,-0.8)..controls (0.3,-0.77) and (0.14,-0.755)..(0.04,-0.5);
\node [anchor=center,red!70](cr1) at (0.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr2) at (1.65,-0.65){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr3) at (1.5,0.1){\scriptsize{$\bullet$}};
%%%%%%一堆红色的球
\node [anchor=center,red!70](cr4) at (0.15,-0.6){\Large{$\cdot$}};
\node [anchor=center,red!70](cr5) at (0.3,-0.6){\Large{$\cdot$}};
\node [anchor=center,red!70](cr6) at (0.5,-0.55){\Large{$\cdot$}};
\node [anchor=center,red!70](cr7) at (0.35,-0.4){\Large{$\cdot$}};
\node [anchor=center,red!70](cr8) at (0.4,-0.7){\Large{$\cdot$}};
\node [anchor=center,red!70](cr8) at (0.55,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr9) at (0.9,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr10) at (0.9,-0.5){\Large{$\cdot$}};
\node [anchor=center,red!70](cr11) at (1.4,-0.8){\Large{$\cdot$}};
\node [anchor=center,red!70](cr12) at (1.45,-0.3){\Large{$\cdot$}};
\node [anchor=center,red!70](cr13) at (1.35,0.3){\Large{$\cdot$}};
\node [anchor=center,red!70](cr14) at (1.2,0.4){\Large{$\cdot$}};
\node [anchor=center,red!70](cr15) at (1.6,0.45){\Large{$\cdot$}};
\end{scope}
%%%%%%%%%%%微调结果------------------------------------------------------------
\begin{scope}[xshift=8.2em]
\draw [-,red!70,line width=0.5pt] (0,0.4688)..controls (0.3,0.45) and (0.5,0.2)..(0.7,-0.25)..controls (0.8,-0.45) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.42) and (1.3,-0.12)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.13,0.4) and (1.18,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.03,0.0) and (2.08,-0.1)..(2.07,-0.5)..controls (2.04,-1.1) and (1.5,-1.16)..(0.6,-0.91)..controls (0.05,-0.71) and (-0.2,-0.53)..(-0.25,-0.45)..controls (-0.55,0.0) and (-0.5,0.501)..(0,0.4688);
\draw [-,ublue,line width=0.5pt] (0,0.5)..controls (0.3,0.5) and (0.5,0.2)..(0.7,-0.25)..controls (0.8,-0.45) and (0.9,-0.4)..(1.1,-0.4)..controls (1.3,-0.40) and (1.3,-0.1)..(1.28,0)..controls (1.26,0.1) and (1.25,0.2)..(1.2,0.3)..controls (1.15,0.4)and (1.2,0.5)..(1.6,0.55)..controls (1.7,0.56) and (1.78,0.5)..(1.85,0.35)..controls (2.0,0.0) and (2.05,-0.1)..(2.05,-0.5)..controls (2.04,-1.1) and (1.5,-1.1)..(0.6,-0.91)..controls (0.0,-0.75) and (-0.2,-0.53)..(-0.25,-0.45)..controls (-0.5,0.0) and (-0.5,0.501)..(0,0.5);
\draw [-,thick] (-0.8,1.0)--(-0.8,-1.0);
\node [anchor=center](c1) at (0.1,0.6){\tiny{$\mathbi{Y}$}};
\node [anchor=center](c2) at (-0.45,-0.7){\tiny{$\mathbi{W}\cdot \mathbi{X}$}};
\node [anchor=center,red!70](cr1) at (0.2,-0.35){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr2) at (1.58,-0.78){\scriptsize{$\bullet$}};
\node [anchor=center,red!70](cr3) at (1.6,0){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb1) at (0.2,-0.3){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb2) at (1.55,-0.8){\scriptsize{$\bullet$}};
\node [anchor=center,ublue](cb3) at (1.6,-0.05){\scriptsize{$\bullet$}};
%%%%%%一堆红色的球
\node [anchor=center,red!70](cb4) at (-0.35,0.16){\Large{$\cdot$}};
\node [anchor=center,red!70](cb5) at (-0.03,0.37){\Large{$\cdot$}};
\node [anchor=center,red!70](cb6) at (-0.03,0.12){\Large{$\cdot$}};
\node [anchor=center,red!70](cb7) at (0.37,0.02){\Large{$\cdot$}};
\node [anchor=center,red!70](cb8) at (-0.18,-0.18){\Large{$\cdot$}};
\node [anchor=center,red!70](cb9) at (0.65,-0.43){\Large{$\cdot$}};
\node [anchor=center,red!70](cb10) at (0.32,-0.68){\Large{$\cdot$}};
\node [anchor=center,red!70](cb11) at (0.82,-0.73){\Large{$\cdot$}};
\node [anchor=center,red!70](cb12) at (1.23,-0.85){\Large{$\cdot$}};
\node [anchor=center,red!70](cb13) at (1.8,-0.47){\Large{$\cdot$}};
\node [anchor=center,red!70](cb14) at (1.75,0.23){\Large{$\cdot$}};
\node [anchor=center,red!70](cb15) at (1.38,-0.44){\Large{$\cdot$}};
\node [anchor=center,red!70](cb16) at (1.42,0.26){\Large{$\cdot$}};
%%%%%%一堆蓝色的球
\node [anchor=center,ublue](cb4) at (-0.35,0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb5) at (0,0.4){\Large{$\cdot$}};
\node [anchor=center,ublue](cb6) at (0,0.15){\Large{$\cdot$}};
\node [anchor=center,ublue](cb7) at (0.4,0.05){\Large{$\cdot$}};
\node [anchor=center,ublue](cb8) at (-0.15,-0.15){\Large{$\cdot$}};
\node [anchor=center,ublue](cb9) at (0.65,-0.4){\Large{$\cdot$}};
\node [anchor=center,ublue](cb10) at (0.3,-0.65){\Large{$\cdot$}};
\node [anchor=center,ublue](cb11) at (0.8,-0.7){\Large{$\cdot$}};
\node [anchor=center,ublue](cb12) at (1.2,-0.85){\Large{$\cdot$}};
\node [anchor=center,ublue](cb13) at (1.8,-0.5){\Large{$\cdot$}};
\node [anchor=center,ublue](cb14) at (1.75,0.2){\Large{$\cdot$}};
\node [anchor=center,ublue](cb15) at (1.4,-0.45){\Large{$\cdot$}};
\node [anchor=center,ublue](cb16) at (1.45,0.3){\Large{$\cdot$}};
\node [anchor=center](rw1) at (0.22,-0.45){\tiny{cat}};
\node [anchor=center](rw2) at (0.20,-0.15){\tiny{katze}};
\end{scope}
\end{tikzpicture}
\ No newline at end of file
......@@ -4,26 +4,26 @@
\begin{tikzpicture}
\tikzstyle{node}=[rounded corners=2pt,draw,minimum width=5em,minimum height=2em,drop shadow,font=\footnotesize]
\node[node,fill=blue!20] (nmt1) at (0,0){NMT系统1};
\node[node,anchor=west,fill=yellow!20] (nmt2) at ([xshift=1em]nmt1.east){NMT系统2};
\node[node,anchor=west,fill=red!20] (nmt3) at ([xshift=1em]nmt2.east){NMT系统3};
\node[node,fill=blue!20,line width=0.6pt] (nmt1) at (0,0){NMT系统1};
\node[node,anchor=west,fill=yellow!20,line width=0.6pt] (nmt2) at ([xshift=1em]nmt1.east){NMT系统2};
\node[node,anchor=west,fill=red!20,line width=0.6pt] (nmt3) at ([xshift=1em]nmt2.east){NMT系统3};
\node[node,anchor=south,fill=blue!20] (n1) at ([yshift=2.4em]nmt1.north){我不悦};
\node[node,anchor=west,fill=yellow!20] (n2) at ([xshift=1em]n1.east){我不开心};
\node[node,anchor=west,fill=red!20] (n3) at ([xshift=1em]n2.east){吾怀忳忳};
\node[node,anchor=south,fill=blue!20,line width=0.6pt] (n1) at ([yshift=2.4em]nmt1.north){我不悦};
\node[node,anchor=west,fill=yellow!20,line width=0.6pt] (n2) at ([xshift=1em]n1.east){我不开心};
\node[node,anchor=west,fill=red!20,line width=0.6pt] (n3) at ([xshift=1em]n2.east){吾怀忳忳};
\node[node,anchor=south,fill=green!20,minimum height=1.6em] (task1) at ([yshift=2.6em]n2.north){不同任务};
\node[node,anchor=south,fill=green!20,minimum height=1.6em,line width=0.6pt] (task1) at ([yshift=2.6em]n2.north){不同任务};
\node[node,anchor=west,fill=green!20,minimum height=1.6em] (task2) at ([xshift=8em]task1.east){源任务};
\node[node,anchor=north,minimum height=3.2em,fill=orange!20] (n4) at ([yshift=-2em]task2.south){};
\node[draw,anchor=north,cylinder,shape border rotate=90,minimum width=3em,aspect=0.4,fill=orange!20] (kd) at ([yshift=-1.7em]n4.south){\footnotesize 知识};
\node[node,anchor=west,fill=green!20,minimum height=1.6em,line width=0.6pt] (task2) at ([xshift=8em]task1.east){源任务};
\node[node,anchor=north,minimum height=3.2em,fill=orange!20,line width=0.6pt] (n4) at ([yshift=-2em]task2.south){};
\node[draw,anchor=north,cylinder,shape border rotate=90,minimum width=3em,aspect=0.4,fill=orange!20,line width=0.6pt] (kd) at ([yshift=-1.7em]n4.south){\footnotesize 知识};
\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=blue!20] at ([yshift=-2.35em]task2.south){我不悦};
\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=yellow!20] at ([yshift=-3.75em]task2.south){我不开心};
\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=blue!20,line width=0.6pt] at ([yshift=-2.35em]task2.south){我不悦};
\node[draw,minimum width=4em,font=\scriptsize,anchor=north,inner ysep=2pt,fill=yellow!20,line width=0.6pt] at ([yshift=-3.75em]task2.south){我不开心};
\node[node,anchor=west,fill=green!20,minimum height=1.6em] (task3) at ([xshift=3em]task2.east){目标任务};
\node[node,anchor=north,fill=red!20] (n5) at ([yshift=-2.5em]task3.south){吾怀忳忳};
\node[node,anchor=north,fill=red!20] (sys) at ([yshift=-2.5em]n5.south){学习系统};
\node[node,anchor=west,fill=green!20,minimum height=1.6em,line width=0.6pt] (task3) at ([xshift=3em]task2.east){目标任务};
\node[node,anchor=north,fill=red!20,line width=0.6pt] (n5) at ([yshift=-2.5em]task3.south){吾怀忳忳};
\node[node,anchor=north,fill=red!20,line width=0.6pt] (sys) at ([yshift=-2.5em]n5.south){学习系统};
\draw[->,thick] ([yshift=-0.2em,xshift=-0.7em]task1.-145) -- node[left,font=\scriptsize,yshift=0.2em]{书面语}([yshift=0.2em]n1.90);
\draw[->,thick] ([yshift=-0.2em]task1.-90) -- node[right,font=\scriptsize,yshift=0.2em,xshift=-0.2em]{口语}([yshift=0.2em]n2.90);
......
......@@ -3,11 +3,11 @@
%-------------------------------------------------------------------------
\begin{tikzpicture}
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (x) at (0,0) {$\seq{x}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (x) at (0,0) {$\seq{x}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!15] (p) at (0,-2.4) {$\seq{p}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!20,line width=0.6pt] (p) at (0,-2.4) {$\seq{p}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (y) at (2.4,-1.2) {$\seq{y}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (y) at (2.4,-1.2) {$\seq{y}$};
\draw[-,dashed,thick,black!50] (x.-90) -- (p.90);
\draw[-,dashed,thick,black!50] (p.0) -- (y.-135);
......
......@@ -3,12 +3,12 @@
%-------------------------------------------------------------------------
\begin{tikzpicture}
\tikzstyle{lan}=[font=\footnotesize,inner ysep=2pt,minimum height=1em]
\node[minimum height=3em,minimum width=8em,fill=orange!20,draw,rounded corners=2pt,align=center] (sys) at (0,0){多语言 \\ 单模型系统};
\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt] (en) at (-3em,4em){英语};
\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt] (fr) at (3em,4em){法语};
\node[minimum height=3em,minimum width=8em,fill=orange!20,draw,rounded corners=2pt,align=center,line width=0.6pt] (sys) at (0,0){多语言 \\ 单模型系统};
\node[draw,font=\footnotesize,minimum width=4em,fill=red!20,rounded corners=1pt,line width=0.6pt] (en) at (-3em,4em){英语};
\node[draw,font=\footnotesize,minimum width=4em,fill=red!20,rounded corners=1pt,line width=0.6pt] (fr) at (3em,4em){法语};
\node[minimum width=4em] at (6.6em,4em){$\dots$};
\node[draw,font=\footnotesize,minimum width=4em,fill=yellow!20,rounded corners=1pt] (de) at (-3em,-4em){德语};
\node[draw,font=\footnotesize,minimum width=4em,fill=yellow!20,rounded corners=1pt] (sp) at (3em,-4em){西班牙语};
\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt,line width=0.6pt] (de) at (-3em,-4em){德语};
\node[draw,font=\footnotesize,minimum width=4em,fill=blue!20,rounded corners=1pt,line width=0.6pt] (sp) at (3em,-4em){西班牙语};
\node[minimum width=4em] at (6.6em,-4em){$\dots$};
\draw[->,thick] (en.-90) -- ([xshift=-1em]sys.90);
......@@ -18,27 +18,21 @@
\node[font=\footnotesize] (train) at (11em,7em) {\small\bfnew{训练阶段:}};
\node[anchor=north,font=\footnotesize] (pair1) at ([yshift=-1em,xshift=1em]train.south) {双语句对1:};
\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box1) at ([yshift=.7em,xshift=0.4em]pair1.east) {};
\node[anchor=west,lan] at ([yshift=.7em,xshift=0.4em]pair1.east) {英语:{\color{red}<spanish>} \ hello};
\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box2) at ([yshift=-.7em,xshift=0.4em]pair1.east) {};
\node[anchor=west,lan] at ([yshift=-.7em,xshift=0.4em]pair1.east) {西班牙语:hola};
\node[anchor=west,lan](train1) at ([yshift=.7em,xshift=0.4em]pair1.east) {英语:{\color{red}<spanish>} \ hello};
\node[anchor=west,lan](train2) at ([yshift=-.7em,xshift=0.4em]pair1.east) {西班牙语:hola};
\node[anchor=north,font=\footnotesize] (pair2) at ([yshift=-4.5em,xshift=1em]train.south) {双语句对2:};
\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box3) at ([yshift=.7em,xshift=0.4em]pair2.east) {};
\node[anchor=west,lan] at ([yshift=.7em,xshift=0.4em]pair2.east) {法语:{\color{red}<german>} \ Bonjour};
\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box4) at ([yshift=-.7em,xshift=0.4em]pair2.east) {};
\node[anchor=west,lan] at ([yshift=-.7em,xshift=0.4em]pair2.east) {德语:Hallo};
\node[anchor=west,lan](train3) at ([yshift=.7em,xshift=0.4em]pair2.east) {法语:{\color{red}<german>} \ Bonjour};
\node[anchor=west,lan](train4) at ([yshift=-.7em,xshift=0.4em]pair2.east) {德语:Hallo};
\node[anchor=north,font=\footnotesize] (decode) at ([yshift=-8em]train.south) {\small\bfnew{解码阶段:}};
\node[anchor=north,font=\footnotesize] (input) at ([yshift=-0.6em]decode.south) {输入:};
\node[anchor=west,draw=blue!40,lan,minimum width=9.8em,fill=blue!20] (box5) at ([xshift=0.4em]input.east) {};
\node[anchor=west,lan] at ([xshift=0.4em]input.east) {英语:{\color{red}<german>} \ hello};
\node[anchor=north,font=\footnotesize] (output) at ([yshift=-2.6em]decode.south) {输出:};
\node[anchor=west,draw=yellow!40,lan,minimum width=9.8em,fill=yellow!20] (box6) at ([xshift=0.4em]output.east) {};
\node[anchor=west,lan] at ([xshift=0.4em]output.east) {德语:Hallo};
\node[anchor=north,lan,minimum width=9.8em] (box7) at ([yshift=-2em]box4.south) {};
\node[anchor=north,font=\footnotesize] (input) at ([xshift=2.13em,yshift=-0.6em]decode.south) {输入:};
\node[anchor=west,lan](decode2) at ([xshift=0.4em]input.east) {英语:{\color{red}<german>} \ hello};
\node[anchor=north,font=\footnotesize] (output) at ([xshift=2.13em,yshift=-2.6em]decode.south) {输出:};
\node[anchor=west,lan](decode3) at ([xshift=0.4em]output.east) {德语:Hallo};
\node[anchor=north,lan,minimum width=9.8em] (box7) at ([yshift=-4em]train3.south) {};
\begin{pgfonlayer}{background}
\node[fill=red!15,draw=red!30,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(train)(box4)]{};
\node[fill=green!20,,draw=green!40,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(decode)(box7)(box6)]{};
\node[fill=red!20,draw=black,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(train)(train4)(train1)(train2)(train3)]{};
\node[fill=blue!20,draw=black,rounded corners=2pt,inner ysep=6pt,line width=1pt][fit=(decode)(output)(decode2)(decode3)(box7)]{};
\end{pgfonlayer}
\end{tikzpicture}
......
%%% outline
%-------------------------------------------------------------------------
\begin{tikzpicture}
\tikzstyle{rec} = [line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em]
\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
\node[anchor=north,rec,fill=blue!20](node1-2) at ([yshift=-2.0em]node1-1.south) {\small{解码器}};
\node[anchor=north,rec,fill=red!20](node1-3) at ([yshift=-2em]node1-2.south) {\small{编码器}};
\node[anchor=east](node1-5) at ([xshift=-2em]node1-2.west) {\small{$y$}};
\node[anchor=north](node1-4) at ([yshift=-2em]node1-3.south) {\small{$x$}};
\draw [->,thick](node1-4.north)--(node1-3.south);
\draw [->,thick](node1-5.east)--(node1-2.west);
\draw [->,thick](node1-3.north)--(node1-2.south);
\draw [->,thick](node1-2.north)--(node1-1.south);
\node [anchor=center] (node2-1) at ([xshift=12.0em]node1-1.east) {\small{$y'$}};
\node[anchor=north,rec,fill=blue!20](node2-2) at ([yshift=-2.0em]node2-1.south) {\small{解码器}};
\node[anchor=north,rec,fill=red!20](node2-3) at ([yshift=-2em]node2-2.south) {\small{编码器}};
\node[anchor=east](node2-5) at ([xshift=-2em]node2-2.west) {\small{$y$}};
\node[anchor=north](node2-4) at ([yshift=-2em]node2-3.south) {\small{$x$}};
\node[anchor=west,rec,fill=yellow!20](node2-6) at ([xshift=3.0em]node2-3.east) {\small{解码器}};
\node[anchor=south](node2-7) at ([yshift=2em]node2-6.north) {\small{$x'$}};
\draw [->,thick](node2-4.north)--(node2-3.south);
\draw [->,thick](node2-5.east)--(node2-2.west);
\draw [->,thick](node2-3.north)--(node2-2.south)node[pos=0.5,left,font=\scriptsize]{翻译};
\draw [->,thick](node2-2.north)--(node2-1.south);
\draw [->,thick](node2-3.east)--(node2-6.west)node[pos=0.5,above,font=\scriptsize]{重排序};
\draw [->,thick](node2-6.north)--(node2-7.south);
\node [anchor=north](pos1) at ([yshift=0em]node1-4.south) {\small{(a)单任务学习}};
\node [anchor=west](pos2) at ([xshift=10.0em]pos1.east) {\small{(b)多任务学习}};
\end{tikzpicture}
\ No newline at end of file
\begin{tikzpicture}
\begin{scope}
\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{softmax}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{Softmax}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{Decoder}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{LM}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{解码器}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{语言模型}};
\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{softmax}};
\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{Softmax}};
\node [anchor=north] (node3-1) at ([yshift=3.0em]node3-2.north) {\small{$z'$}};
\node[anchor=north](node3-41) at ([xshift=-0.6em,yshift=-2em]node3-3.south) {\small{$y$}};
\node[anchor=north](node3-42) at ([xshift=0.6em,yshift=-2em]node3-3.south) {\small{$z$}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{Encoder}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{编码器}};
\node[anchor=north](node2-2) at ([yshift=-2em]node2-1.south) {\small{$x$}};
......@@ -34,9 +34,9 @@
\node [anchor=east] (node2-1-1) at ([xshift=-12.0em,yshift=-4.25em]node1-1.west) {\small{$y'$}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{softmax}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{Decoder}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{Encoder}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4.3em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{Softmax}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{解码器}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{编码器}};
\node[anchor=north](node2-2-2) at ([yshift=-2em]node2-2-1.south) {\small{$x$}};
\node[anchor=north](node2-2-3) at ([yshift=-2em]node2-1-3.south) {\small{$y$}};
......
\begin{tabular}{c c}
\begin{tikzpicture}
\begin{scope}
% ,minimum height =1em,minimum width=2em
\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
\tikzstyle{word} = [inner sep=3.5pt]
\node[circle,fill=red!20](data) at (0,0) {数据};
\node[circle,fill=blue!20](model) at ([xshift=5em]data.east) {模型};
\node[word] (init) at ([xshift=-5em]data.west){初始化};
\draw[->,thick] (init.east) -- ([xshift=-0.2em]data.west);
\draw [->,thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
\draw [->,thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};
\node[word] at ([xshift=-0.5em,yshift=-5em]data.south){(a)思路1};
\end{scope}
\end{tikzpicture}
&
\begin{tikzpicture}
\begin{scope}
% ,minimum height =1em,minimum width=2em
\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
\tikzstyle{word} = [inner sep=3.5pt]
\node[circle,fill=red!20](data) at (0,0) {数据};
\node[circle,fill=blue!20](model) at ([xshift=5em]data.east) {模型};
\node[word] (init) at ([xshift=5em]model.east){初始化};
\draw[->,thick] (init.west) -- ([xshift=0.2em]model.east);
\draw [->,thick] ([yshift=1pt]data.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]model.north) node[above,midway] {参数优化};
\draw [->,thick] ([yshift=1pt]model.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]data.south) node[below,midway] {数据优化};
\node[word] at ([xshift=-0.5em,yshift=-5em]model.south){(b)思路2};
\end{scope}
\end{tikzpicture}
\end{tabular}
\ No newline at end of file
......@@ -4,13 +4,13 @@
\begin{tikzpicture}
\tikzstyle{node}=[rounded corners=4pt,draw,minimum height=3em,drop shadow,font=\footnotesize]
\node[node,minimum width=6em,minimum height=2.4em,fill=blue!20] (encoder1) at (0,0){\small 编码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20] (encoder2) at ([xshift=4em,yshift=0em]encoder1.east){\small 编码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20] (encoder3) at ([xshift=3em]encoder2.east){\small 编码器};
\node[node,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder1) at (0,0){\small 编码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20,line width=0.6pt] (encoder2) at ([xshift=4em,yshift=0em]encoder1.east){\small 编码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!40,line width=0.6pt] (encoder3) at ([xshift=3em]encoder2.east){\small 编码器};
\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20] (decoder1) at ([yshift=-3em]encoder1.south){\small 解码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=red!20] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};
\node[node,anchor=north,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder1) at ([yshift=-3em]encoder1.south){\small 解码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!20,line width=0.6pt] (decoder2) at ([xshift=4em,yshift=0em]decoder1.east){\small 解码器};
\node[node,anchor=west,minimum width=6em,minimum height=2.4em,fill=blue!40,line width=0.6pt] (decoder3) at ([xshift=3em]decoder2.east){\small 解码器};
\node[anchor=north,font=\scriptsize,fill=yellow!20] (w1) at ([yshift=-1.6em]decoder1.south){知识 \ 就是 \ 力量 \ 。 \ <EOS>};
\node[anchor=north,font=\scriptsize,fill=green!20] (w3) at ([yshift=-1.6em]decoder3.south){Wissen \ ist \ Macht \ . \ <EOS>};
......@@ -24,7 +24,7 @@
\draw[->,thick] (w4.-90) -- (encoder3.90);
\node [anchor=north,single arrow,minimum height=2.2em,fill=blue!20,rotate=-90] (arrow1) at ([yshift=-1.4em,xshift=0.4em]encoder1.south) {};
\node [anchor=north,single arrow,minimum height=2.2em,fill=blue!20,rotate=-90] (arrow2) at ([yshift=-1.4em,xshift=0.4em]encoder2.south) {};
\node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow2) at ([yshift=-1.4em,xshift=0.4em]encoder2.south) {};
\node [anchor=north,single arrow,minimum height=2.2em,fill=red!20,rotate=-90] (arrow3) at ([yshift=-1.4em,xshift=0.4em]encoder3.south) {};
\node[anchor=south,yshift=3.4em] at (encoder1.north){\small\bfnew{父模型}};
......
......@@ -3,11 +3,11 @@
%-------------------------------------------------------------------------
\begin{tikzpicture}
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (x) at (0,0) {$\seq{x}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (x) at (0,0) {$\seq{x}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!15] (p) at (2,0) {$\seq{p}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=red!20,line width=0.6pt] (p) at (2,0) {$\seq{p}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20] (y) at (4,0) {$\seq{y}$};
\node[draw,circle,inner sep=2pt,minimum size=2em,fill=blue!20,line width=0.6pt] (y) at (4,0) {$\seq{y}$};
\draw[-,dashed,thick,black!50] (x.0) -- (p.180);
\draw[-,dashed,thick,black!50] (p.0) -- (y.180);
......
\begin{tikzpicture}
\tikzstyle{rec} = [,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4.3em,fill=blue!20]
\node [anchor=center](node1) at (0,0) {源语言};
\node [anchor=west,rec,fill=red!20](node2) at ([xshift=2.0em]node1.east){编码器};
\node [anchor=west,rec](node3) at ([xshift=3.0em,yshift=2.0em]node2.east){解码器};
\node [anchor=west,rec,fill=yellow!20](node4) at ([xshift=3.0em,yshift=-2.0em]node2.east){鉴别器};
\draw [->,thick](node1.east)--(node2.west);
\draw [->,thick](node2.east)--([xshift=1.5em]node2.east)--([xshift=1.5em,yshift=2.0em]node2.east)--(node3.west);
\draw [->,thick](node2.east)--([xshift=1.5em]node2.east)--([xshift=1.5em,yshift=-2.0em]node2.east)--(node4.west);
\node [anchor=west,minimum width=5.0em](node5) at ([xshift=2.0em]node3.east) {目标语言};
\node [anchor=west,minimum width=5.0em](node6) at ([xshift=2.0em]node4.east) {< 领域 >};
\draw [->,thick](node3.east)--(node5.west);
\draw [->,thick](node4.east)--(node6.west);
\end{tikzpicture}
\ No newline at end of file
......@@ -47,38 +47,38 @@
\node [anchor=south](pos2-2) at ([yshift=-0.5em]pos2.north){\scriptsize{词典}};
%circle1
\node[rec,anchor=center,rotate=60,fill=green!30](c1x1) at ([xshift=-7em,yshift=-1.4em]circle1.east){\tiny{1}};
\node[rec,anchor=center,rotate=60,fill=green!30](c1x2) at ([xshift=-4.5em,yshift=1.8em]circle1.east){\tiny{2}};
\node[rec,anchor=center,rotate=60,fill=green!30](c1x3) at ([xshift=-4em,yshift=-0.5em]circle1.east){\tiny{3}};
\node[rec,anchor=center,rotate=60,fill=green!30](c1x4) at ([xshift=-3.5em,yshift=-2.5em]circle1.east){\tiny{4}};
\node[rec,anchor=center,rotate=60,fill=green!30](c1x5) at ([xshift=-2em,yshift=1.0em]circle1.east){\tiny{5}};
\node[rec,anchor=center,rotate=60,fill=green!40](c1x1) at ([xshift=-7em,yshift=-1.4em]circle1.east){\tiny{1}};
\node[rec,anchor=center,rotate=60,fill=green!40](c1x2) at ([xshift=-4.5em,yshift=1.8em]circle1.east){\tiny{2}};
\node[rec,anchor=center,rotate=60,fill=green!40](c1x3) at ([xshift=-4em,yshift=-0.5em]circle1.east){\tiny{3}};
\node[rec,anchor=center,rotate=60,fill=green!40](c1x4) at ([xshift=-3.5em,yshift=-2.5em]circle1.east){\tiny{4}};
\node[rec,anchor=center,rotate=60,fill=green!40](c1x5) at ([xshift=-2em,yshift=1.0em]circle1.east){\tiny{5}};
%circle2
\node[cir,anchor=center,rotate=-30,fill=red!30] (c2a) at ([xshift=-5.3em,yshift=2.15em]circle2.east){\tiny{a}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c2b) at ([xshift=2.0em,yshift=-1.25em]c2a.east){\tiny{b}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c2c) at ([xshift=0.8em,yshift=-3.9em]c2a.south){\tiny{c}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c2x) at ([xshift=-0.3em,yshift=-1.9em]c2a.south){\tiny{x}};
\node[cir,anchor=west,rotate=-30,fill=red!30] (c2y) at ([xshift=1.15em,yshift=-2.85em]c2a.east){\tiny{y}};
\node[cir,anchor=center,rotate=-30,fill=red!40] (c2a) at ([xshift=-5.3em,yshift=2.15em]circle2.east){\tiny{a}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c2b) at ([xshift=2.0em,yshift=-1.25em]c2a.east){\tiny{b}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c2c) at ([xshift=0.8em,yshift=-3.9em]c2a.south){\tiny{c}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c2x) at ([xshift=-0.3em,yshift=-1.9em]c2a.south){\tiny{x}};
\node[cir,anchor=west,rotate=-30,fill=red!40] (c2y) at ([xshift=1.15em,yshift=-2.85em]c2a.east){\tiny{y}};
%circle3
\node[rec,anchor=center,rotate=-30,fill=green!30] (c3x1) at ([xshift=-6.7em,yshift=1.75em]circle3.east){\tiny{1}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x2) at ([xshift=4.7em,yshift=-0.95em]c3x1.east){\tiny{2}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x3) at ([xshift=2.6em,yshift=-2.4em]c3x1.south){\tiny{3}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c3x4) at ([xshift=0.35em,yshift=-2.7em]c3x1.south){\tiny{4}};
\node[rec,anchor=west,rotate=-30,fill=green!30] (c3x5) at ([xshift=2.35em,yshift=-3.85em]c3x1.east){\tiny{5}};
\node[rec,anchor=center,rotate=-30,fill=green!40] (c3x1) at ([xshift=-6.7em,yshift=1.75em]circle3.east){\tiny{1}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x2) at ([xshift=4.7em,yshift=-0.95em]c3x1.east){\tiny{2}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x3) at ([xshift=2.6em,yshift=-2.4em]c3x1.south){\tiny{3}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c3x4) at ([xshift=0.35em,yshift=-2.7em]c3x1.south){\tiny{4}};
\node[rec,anchor=west,rotate=-30,fill=green!40] (c3x5) at ([xshift=2.35em,yshift=-3.85em]c3x1.east){\tiny{5}};
%circle4
\node[rec,anchor=center,rotate=-30,fill=green!30] (c4x1) at ([xshift=-6.7em,yshift=1.75em]circle4.east){\tiny{1}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x2) at ([xshift=4.7em,yshift=-0.95em]c4x1.east){\tiny{2}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x3) at ([xshift=2.6em,yshift=-2.4em]c4x1.south){\tiny{3}};
\node[rec,anchor=east,rotate=-30,fill=green!30] (c4x4) at ([xshift=0.35em,yshift=-2.7em]c4x1.south){\tiny{4}};
\node[rec,anchor=west,rotate=-30,fill=green!30] (c4x5) at ([xshift=2.35em,yshift=-3.85em]c4x1.east){\tiny{5}};
\node[rec,anchor=center,rotate=-30,fill=green!40] (c4x1) at ([xshift=-6.7em,yshift=1.75em]circle4.east){\tiny{1}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x2) at ([xshift=4.7em,yshift=-0.95em]c4x1.east){\tiny{2}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x3) at ([xshift=2.6em,yshift=-2.4em]c4x1.south){\tiny{3}};
\node[rec,anchor=east,rotate=-30,fill=green!40] (c4x4) at ([xshift=0.35em,yshift=-2.7em]c4x1.south){\tiny{4}};
\node[rec,anchor=west,rotate=-30,fill=green!40] (c4x5) at ([xshift=2.35em,yshift=-3.85em]c4x1.east){\tiny{5}};
\node[cir,anchor=center,rotate=-30,fill=red!30] (c4a) at ([xshift=-5.3em,yshift=2.15em]circle4.east){\tiny{a}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c4b) at ([xshift=2.0em,yshift=-1.25em]c4a.east){\tiny{b}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c4c) at ([xshift=0.8em,yshift=-3.9em]c4a.south){\tiny{c}};
\node[cir,anchor=east,rotate=-30,fill=red!30] (c4x) at ([xshift=-0.3em,yshift=-1.9em]c4a.south){\tiny{x}};
\node[cir,anchor=west,rotate=-30,fill=red!30] (c4y) at ([xshift=1.15em,yshift=-2.85em]c4a.east){\tiny{y}};
\node[cir,anchor=center,rotate=-30,fill=red!40] (c4a) at ([xshift=-5.3em,yshift=2.15em]circle4.east){\tiny{a}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c4b) at ([xshift=2.0em,yshift=-1.25em]c4a.east){\tiny{b}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c4c) at ([xshift=0.8em,yshift=-3.9em]c4a.south){\tiny{c}};
\node[cir,anchor=east,rotate=-30,fill=red!40] (c4x) at ([xshift=-0.3em,yshift=-1.9em]c4a.south){\tiny{x}};
\node[cir,anchor=west,rotate=-30,fill=red!40] (c4y) at ([xshift=1.15em,yshift=-2.85em]c4a.east){\tiny{y}};
\draw [color=red,line width=0.7pt,rotate=18] ([xshift=-5.1em,yshift=3.7em]circle4.east) ellipse (1.6em and 0.9em);
\draw [color=red,line width=0.7pt,rotate=-5] ([xshift=-2.8em,yshift=0.6em]circle4.east) ellipse (1.6em and 0.9em);
......
\begin{tikzpicture}
\begin{scope}
\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em,align=center,fill=blue!20]
\tikzstyle{word} = [inner sep=3.5pt]
\node[circle](center) at (0,0) {
\begin{tabular}{c | c}
$x\rightarrow y$ & $y\rightarrow x$ \\
模型 & 模型
\end{tabular}
};
\node[circle,fill=red!20] (left) at ([xshift=-9em]center.west) {$x\rightarrow y$ \\ 数据};
\node[circle,fill=red!20] (right) at ([xshift=9em]center.east) {$y\rightarrow x$ \\ 数据};
\node[word] (init) at ([yshift=6em]center.north){初始化};
\node[circle,fill=red!20] (down) at ([yshift=-8em]center.south) {$x,y$ \\ 数据};
\draw[->,thick] (init.south) -- ([yshift=0.2em]center.north);
\draw[->,thick] ([yshift=0.2em]down.north) -- ([yshift=-0.2em]center.south) node[pos=0.6,midway,align=left,xshift=-2.5em,yshift=0.5em] {语言模型\\目标函数};
\node [anchor=center] at ([yshift=2.0em,xshift=-2.5em]down.north){(模型优化)};
\draw[->,thick] ([yshift=1pt]left.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt,xshift=-2.2em]center.north) node[above,midway,align=center] {翻译模型目标函数\\(模型优化)};
\draw[->,thick] ([yshift=1pt,xshift=-1.8em]center.north) .. controls +(90:2em) and +(90:2em) .. ([yshift=1pt]right.north) node[above,pos=0.6,align=center] {回译\\(数据优化)};
\draw [->,thick] ([yshift=1pt]right.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt,xshift=2.2em]center.south) node[below,midway,align=center] {翻译模型目标函数\\(模型优化)};
\draw [->,thick] ([yshift=1pt,xshift=1.8em]center.south) .. controls +(-90:2em) and +(-90:2em) .. ([yshift=1pt]left.south) node[below,pos=0.6,align=center] {回译\\(数据优化)};
\end{scope}
\end{tikzpicture}
\begin{tikzpicture}
\tikzstyle{circle} = [draw,black,line width=0.6pt,inner sep=3.5pt,rounded corners=4pt,minimum width=2em]
\tikzstyle{word} = [inner sep=3.5pt]
\node [anchor=center] (node1-1) at (0,0) {\small{\seq{x}}};
\node [anchor=west] (node1-2) at ([xshift=0.8em]node1-1.east) {\small{\seq{y}}};
\node [anchor=north] (node1-3) at ([xshift=1.0em]node1-1.south) {\small{翻译模型$f$}};
\draw [->,line width=0.6pt](node1-1.east)--(node1-2.west);
\begin{pgfonlayer}{background}
{
\node[fill=blue!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=5em,drop shadow,rounded corners=2pt] [fit =(node1-1)(node1-2)(node1-3)] (remark1) {};
}
\end{pgfonlayer}
\node[anchor=north,circle,fill=red!20,minimum width=6.8em](node2) at ([xshift=-6.0em,yshift=-2.0em]remark1.south) {源语言句子$\seq{x}$};
\node[anchor=north,circle,fill=red!20,minimum width=6.8em](node2-2) at ([yshift=-0.2em]node2.south) {新生成句子$\seq{x'}$};
\draw [->,thick]([yshift=0.2em]node2.north).. controls (-1.93,-1.5) and (-2.0,-0.2)..([xshift=-0.2em]remark1.west);
\node[anchor=north,circle,fill=red!20](node3) at ([xshift=6.5em,yshift=-2.0em]remark1.south) {目标语言句子$\seq{y}$};
\draw [->,thick]([xshift=0.2em]remark1.east).. controls (2.9,-0.25) and (2.9,-0.7) ..([yshift=0.2em]node3.north);
\node [anchor=north] (node4-1) at ([xshift=-1.0em,yshift=-7.0em]remark1.south) {\small{\seq{y}}};
\node [anchor=west] (node4-2) at ([xshift=0.8em]node4-1.east) {\small{\seq{x}}};
\node [anchor=north] (node4-3) at ([xshift=1.0em]node4-1.south) {\small{翻译模型$g$}};
\draw [->,line width=0.6pt](node4-1.east)--(node4-2.west);
\begin{pgfonlayer}{background}
{
\node[fill=yellow!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=5em,drop shadow,rounded corners=2pt] [fit =(node4-1)(node4-2)(node4-3)] (remark2) {};
}
\end{pgfonlayer}
\draw [->,thick]([xshift=-0.2em]remark2.west).. controls (-0.8,-4.12) and (-1.95,-4.12)..([yshift=-0.2em]node2-2.south);
\draw [->,thick]([yshift=-0.2em]node3.south).. controls (2.9,-3) and (2.9,-4.1)..([xshift=0.2em]remark2.east);
\end{tikzpicture}
\ No newline at end of file
......@@ -32,7 +32,7 @@
%----------------------------------------------------------------------------------------
\sectionnewpage
\section{基于扭曲度的翻译模型}
\section{基于扭曲度的模型}
下面将介绍扭曲度在机器翻译中的定义及使用方法。这也带来了两个新的翻译模型\ \dash\ IBM模型2\upcite{DBLP:journals/coling/BrownPPM94}和HMM翻译模型\upcite{vogel1996hmm}
......@@ -161,7 +161,7 @@
%----------------------------------------------------------------------------------------
\sectionnewpage
\section{基于繁衍率的翻译模型}
\section{基于繁衍率的模型}
下面介绍翻译中的一对多问题,以及这个问题所带来的句子长度预测问题。
......
......@@ -8836,7 +8836,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/EdunovOAG18,
author = {Sergey Edunov and
Myle Ott and
......@@ -8959,15 +8958,6 @@ author = {Zhuang Liu and
volume = {abs/1706.05098},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/DomhanH17,
author = {Tobias Domhan and
Felix Hieber},
title = {Using Target-side Monolingual Data for Neural Machine Translation
through Multi-task Learning},
pages = {1500--1505},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/icml/XiaQCBYL17,
author = {Yingce Xia and
Tao Qin and
......@@ -9014,13 +9004,6 @@ author = {Zhuang Liu and
publisher = {The {MIT} Press},
year = {1999}
}
@inproceedings{lample2019cross,
author = {Alexis Conneau and
Guillaume Lample},
title = {Cross-lingual Language Model Pretraining},
pages = {7057--7067},
year = {2019}
}
@inproceedings{DBLP:conf/aclnmt/HoangKHC18,
author = {Cong Duy Vu Hoang and
Philipp Koehn and
......@@ -9042,15 +9025,6 @@ author = {Zhuang Liu and
publisher = {{PMLR}},
year = {2018}
}
@inproceedings{DBLP:conf/acl/FadaeeBM17a,
author = {Marzieh Fadaee and
Arianna Bisazza and
Christof Monz},
title = {Data Augmentation for Low-Resource Neural Machine Translation},
pages = {567--573},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{finding2006adafre,
author = {S. F. Adafre and Maarten de Rijke},
title = {Finding Similar Sentences across Multiple Languages in Wikipedia },
......@@ -9074,24 +9048,6 @@ author = {Zhuang Liu and
pages = {477--504},
year = {2005}
}
@inproceedings{DBLP:conf/naacl/SmithQT10,
author = {Jason R. Smith and
Chris Quirk and
Kristina Toutanova},
title = {Extracting Parallel Sentences from Comparable Corpora using Document
Level Alignment},
pages = {403--411},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2010}
}
@inproceedings{DBLP:conf/emnlp/ZhangZ16,
author = {Jiajun Zhang and
Chengqing Zong},
title = {Exploiting Source-side Monolingual Data in Neural Machine Translation},
pages = {1535--1545},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
}
@inproceedings{DBLP:conf/acl/XiaKAN19,
author = {Mengzhou Xia and
Xiang Kong and
......@@ -9102,17 +9058,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/WangPDN18,
author = {Xinyi Wang and
Hieu Pham and
Zihang Dai and
Graham Neubig},
title = {SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine
Translation},
pages = {856--861},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/acl/GaoZWXQCZL19,
author = {Fei Gao and
Jinhua Zhu and
......@@ -9127,17 +9072,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/WangLWLS19,
author = {Shuo Wang and
Yang Liu and
Chao Wang and
Huanbo Luan and
Maosong Sun},
title = {Improving Back-Translation with Uncertainty-based Confidence Estimation},
pages = {791--802},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/WuWXQLL19,
author = {Lijun Wu and
Yiren Wang and
......@@ -9176,7 +9110,6 @@ author = {Zhuang Liu and
journal = {Computer Science},
year = {2015},
}
@phdthesis{黄书剑0统计机器翻译中的词对齐研究,
title={统计机器翻译中的词对齐研究},
author={黄书剑},
......@@ -9199,16 +9132,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
}
@inproceedings{DBLP:conf/iclr/SmithTHH17,
author = {Samuel L. Smith and
David H. P. Turban and
Steven Hamblin and
Nils Y. Hammerla},
title = {Offline bilingual word vectors, orthogonal transformations and the
inverted softmax},
publisher = {International Conference on Learning Representations},
year = {2017}
}
@inproceedings{DBLP:conf/acl/ArtetxeLA17,
author = {Mikel Artetxe and
Gorka Labaka and
......@@ -9227,7 +9150,6 @@ author = {Zhuang Liu and
pages={1-10},
year={1966},
}
@inproceedings{DBLP:conf/iclr/LampleCRDJ18,
author = {Guillaume Lample and
Alexis Conneau and
......@@ -9248,16 +9170,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/XuYOW18,
author = {Ruochen Xu and
Yiming Yang and
Naoki Otani and
Yuexin Wu},
title = {Unsupervised Cross-lingual Transfer of Word Embedding Spaces},
pages = {2465--2474},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/Alvarez-MelisJ18,
author = {David Alvarez-Melis and
Tommi S. Jaakkola},
......@@ -9310,15 +9222,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/acl/SogaardVR18,
author = {Anders S{\o}gaard and
Sebastian Ruder and
Ivan Vulic},
title = {On the Limitations of Unsupervised Bilingual Dictionary Induction},
pages = {778--788},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@article{DBLP:journals/talip/MarieF20,
author = {Benjamin Marie and
Atsushi Fujita},
......@@ -9351,15 +9254,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/iclr/LampleCDR18,
author = {Guillaume Lample and
Alexis Conneau and
Ludovic Denoyer and
Marc'Aurelio Ranzato},
title = {Unsupervised Machine Translation Using Monolingual Corpora Only},
publisher = {International Conference on Learning Representations},
year = {2018}
}
@inproceedings{DBLP:conf/nips/ConneauL19,
author = {Alexis Conneau and
Guillaume Lample},
......@@ -9388,7 +9282,6 @@ author = {Zhuang Liu and
publisher={International Conference on Computational Linguistics},
year={2020}
}
@inproceedings{2018When,
title={When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?},
author={ Qi, Ye and Sachan, Devendra Singh and Felix, Matthieu and Padmanabhan, Sarguna Janani and Neubig, Graham },
......@@ -9404,16 +9297,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/ImamuraS19,
author = {Kenji Imamura and
Eiichiro Sumita},
title = {Recycling a Pre-trained {BERT} Encoder for Neural Machine Translation},
booktitle = {Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP
2019, Hong Kong, November 4, 2019},
pages = {23--31},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/aaai/YangW0Z00020,
author = {Jiacheng Yang and
Mingxuan Wang and
......@@ -9538,7 +9421,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@article{DBLP:journals/corr/abs-1811-01124,
author = {Jean Alaux and
Edouard Grave and
......@@ -9549,16 +9431,6 @@ author = {Zhuang Liu and
volume = {abs/1811.01124},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/XuYOW18,
author = {Ruochen Xu and
Yiming Yang and
Naoki Otani and
Yuexin Wu},
title = {Unsupervised Cross-lingual Transfer of Word Embedding Spaces},
pages = {2465--2474},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/DouZH18,
author = {Zi-Yi Dou and
Zhi-Hao Zhou and
......@@ -9595,18 +9467,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/JoulinBMJG18,
author = {Armand Joulin and
Piotr Bojanowski and
Tomas Mikolov and
Herv{\'{e}} J{\'{e}}gou and
Edouard Grave},
title = {Loss in Translation: Learning Bilingual Word Mapping with a Retrieval
Criterion},
pages = {2979--2984},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/ChenC18,
author = {Xilun Chen and
Claire Cardie},
......@@ -9615,15 +9475,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/naacl/MohiuddinJ19,
author = {Tasnim Mohiuddin and
Shafiq R. Joty},
title = {Revisiting Adversarial Autoencoder for Unsupervised Word Translation
with Cycle Consistency and Improved Training},
pages = {3857--3867},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/TaitelbaumCG19,
author = {Hagai Taitelbaum and
Gal Chechik and
......@@ -9675,7 +9526,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@article{hartmann2018empirical,
title={Empirical observations on the instability of aligning word vector spaces with GANs},
author={Hartmann, Mareike and Kementchedjhieva, Yova and S{\o}gaard, Anders},
......@@ -9699,7 +9549,6 @@ author = {Zhuang Liu and
pages = {6031--6041},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/HartmannKS18,
author = {Mareike Hartmann and
Yova Kementchedjhieva and
......@@ -9710,17 +9559,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/VulicGRK19,
author = {Ivan Vulic and
Goran Glavas and
Roi Reichart and
Anna Korhonen},
title = {Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?},
pages = {4406--4417},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/JoulinBMJG18,
author = {Armand Joulin and
Piotr Bojanowski and
......@@ -9766,36 +9604,6 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2016}
}
@inproceedings{DBLP:conf/naacl/FiratCB16,
author = {Orhan Firat and
Kyunghyun Cho and
Yoshua Bengio},
title = {Multi-Way, Multilingual Neural Machine Translation with a Shared Attention
Mechanism},
pages = {866--875},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
}
@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
author = {Melvin Johnson and
Mike Schuster and
Quoc V. Le and
Maxim Krikun and
Yonghui Wu and
Zhifeng Chen and
Nikhil Thorat and
Fernanda B. Vi{\'{e}}gas and
Martin Wattenberg and
Greg Corrado and
Macduff Hughes and
Jeffrey Dean},
title = {Google's Multilingual Neural Machine Translation System: Enabling
Zero-Shot Translation},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
pages = {339--351},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/KimPPKN19,
author = {Yunsu Kim and
Petre Petrov and
......@@ -9877,16 +9685,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2007}
}
@article{DBLP:journals/mt/WuW07,
author = {Hua Wu and
Haifeng Wang},
title = {Pivot language approach for phrase-based statistical machine translation},
journal = {Machine Translation},
volume = {21},
number = {3},
pages = {165--181},
year = {2007}
}
@inproceedings{DBLP:conf/acl/WuW09,
author = {Hua Wu and
Haifeng Wang},
......@@ -9987,17 +9785,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2015}
}
@article{DBLP:journals/tacl/LeeCH17,
author = {Jason Lee and
Kyunghyun Cho and
Thomas Hofmann},
title = {Fully Character-Level Neural Machine Translation without Explicit
Segmentation},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
pages = {365--378},
year = {2017}
}
@inproceedings{DBLP:conf/lrec/RiktersPK18,
author = {Matiss Rikters and
Marcis Pinnis and
......@@ -10017,26 +9804,6 @@ author = {Zhuang Liu and
pages = {1345--1359},
year = {2010}
}
@book{2009Handbook,
title={Handbook Of Research On Machine Learning Applications and Trends: Algorithms, Methods and Techniques - 2 Volumes},
author={ Olivas, Emilio Soria and Guerrero, Jose David Martin and Sober, Marcelino Martinez and Benedito, Jose Rafael Magdalena and Lopez, Antonio Jose Serrano },
......@@ -10122,35 +9889,6 @@ author = {Zhuang Liu and
pages={1--38},
year={2020}
}
@inproceedings{DBLP:conf/emnlp/VulicGRK19,
author = {Ivan Vulic and
Goran Glavas and
Roi Reichart and
Anna Korhonen},
title = {Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?},
pages = {4406--4417},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@article{DBLP:journals/corr/MikolovLS13,
author = {Tomas Mikolov and
Quoc V. Le and
Ilya Sutskever},
title = {Exploiting Similarities among Languages for Machine Translation},
journal = {CoRR},
volume = {abs/1309.4168},
year = {2013}
}
@inproceedings{DBLP:conf/emnlp/XuYOW18,
author = {Ruochen Xu and
Yiming Yang and
......@@ -10161,17 +9899,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/iclr/LampleCRDJ18,
author = {Guillaume Lample and
Alexis Conneau and
Marc'Aurelio Ranzato and
Ludovic Denoyer and
Herv{\'{e}} J{\'{e}}gou},
title = {Word translation without parallel data},
publisher = {International Conference on Learning Representations},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/ZhangLLS17,
author = {Meng Zhang and
Yang Liu and
......@@ -10183,17 +9910,6 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2017}
}
@inproceedings{DBLP:conf/naacl/MohiuddinJ19,
author = {Tasnim Mohiuddin and
Shafiq R. Joty},
title = {Revisiting Adversarial Autoencoder for Unsupervised Word Translation
with Cycle Consistency and Improved Training},
pages = {3857--3867},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/ArtetxeLA18,
author = {Mikel Artetxe and
Gorka Labaka and
......@@ -10203,7 +9919,6 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
}
@article{DBLP:journals/tacl/LeeCH17,
author = {Jason Lee and
Kyunghyun Cho and
......@@ -10231,29 +9946,9 @@ author = {Zhuang Liu and
Alexander H. Waibel},
title = {Toward Multilingual Neural Machine Translation with Universal Encoder
and Decoder},
journal = {CoRR},
volume = {abs/1611.04798},
year = {2016}
}
@inproceedings{DBLP:conf/coling/BlackwoodBW18,
author = {Graeme W. Blackwood and
......@@ -10318,13 +10013,6 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2019}
}
@inproceedings{2019Consistency,
title={Consistency by Agreement in Zero-Shot Neural Machine Translation},
author={Al-Shedivat, Maruan and Parikh, Ankur },
publisher={Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year={2019},
}
@article{DBLP:journals/corr/abs-1903-07091,
author = {Naveen Arivazhagan and
Ankur Bapna and
......@@ -10421,15 +10109,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2009}
}
@inproceedings{DBLP:conf/eacl/LapataSM17,
author = {Jonathan Mallinson and
Rico Sennrich and
Mirella Lapata},
title = {Paraphrasing Revisited with Neural Machine Translation},
pages = {881--893},
publisher = {European Association of Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/aclnmt/ImamuraFS18,
author = {Kenji Imamura and
Atsushi Fujita and
......@@ -10451,21 +10130,6 @@ author = {Zhuang Liu and
pages = {1096--1103},
publisher = {International Conference on Machine Learning}
}
@article{DBLP:journals/ipm/FarhanTAJATT20,
author = {Wael Farhan and
Bashar Talafha and
Analle Abuammar and
Ruba Jaikat and
Mahmoud Al-Ayyoub and
Ahmad Bisher Tarakji and
Anas Toma},
title = {Unsupervised dialectal neural machine translation},
journal = {Information Processing and Management},
volume = {57},
number = {3},
pages = {102181},
year = {2020}
}
@inproceedings{DBLP:conf/iclr/LampleCDR18,
author = {Guillaume Lample and
Alexis Conneau and
......@@ -10521,13 +10185,6 @@ author = {Zhuang Liu and
publisher = {European Association of Computational Linguistics},
year = {2017}
}
@inproceedings{yasuda2008method,
title={Method for building sentence-aligned corpus from wikipedia},
author={Yasuda, Keiji and Sumita, Eiichiro},
publisher={2008 AAAI Workshop on Wikipedia and Artificial Intelligence},
pages={263--268},
year={2008}
}
@article{2005Improving,
title={Improving Machine Translation Performance by Exploiting Non-Parallel Corpora},
author={Munteanu, Dragos Stefan and Marcu, Daniel},
......@@ -10698,54 +10355,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/naacl/PetersNIGCLZ18,
author = {Matthew E. Peters and
Mark Neumann and
Mohit Iyyer and
Matt Gardner and
Christopher Clark and
Kenton Lee and
Luke Zettlemoyer},
title = {Deep Contextualized Word Representations},
pages = {2227--2237},
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:conf/emnlp/ClinchantJN19,
author = {St{\'{e}}phane Clinchant and
Kweon Woo Jung and
Vassilina Nikoulina},
title = {On the use of {BERT} for Neural Machine Translation},
pages = {108--117},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/ImamuraS19,
author = {Kenji Imamura and
Eiichiro Sumita},
......@@ -10773,34 +10382,6 @@ author = {Zhuang Liu and
volume = {abs/1908.06259},
year = {2019}
}
@inproceedings{DBLP:conf/aaai/YangW0Z00020,
author = {Jiacheng Yang and
Mingxuan Wang and
Hao Zhou and
Chengqi Zhao and
Weinan Zhang and
Yong Yu and
Lei Li},
title = {Towards Making the Most of {BERT} in Neural Machine Translation},
pages = {9378--9385},
publisher = {AAAI Conference on Artificial Intelligence},
year = {2020}
}
@inproceedings{DBLP:conf/acl/LewisLGGMLSZ20,
author = {Mike Lewis and
Yinhan Liu and
Naman Goyal and
Marjan Ghazvininejad and
Abdelrahman Mohamed and
Omer Levy and
Veselin Stoyanov and
Luke Zettlemoyer},
title = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
Generation, Translation, and Comprehension},
pages = {7871--7880},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:conf/emnlp/QiYGLDCZ020,
author = {Weizhen Qi and
Yu Yan and
......@@ -10941,13 +10522,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2013}
}
@article{joty2015using,
title={Using joint models for domain adaptation in statistical machine translation},
author={Durrani, Nadir and Sajjad, Hassan and Joty, Shafiq and Abdelali, Ahmed and Vogel, Stephan},
journal={Proceedings of MT Summit XV},
pages={117},
year={2015}
}
@article{imamura2016multi,
title={Multi-domain adaptation for statistical machine translation based on feature augmentation},
author={Imamura, Kenji and Sumita, Eiichiro},
......@@ -11025,17 +10599,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2010}
}
@inproceedings{DBLP:conf/acl/DuhNST13,
author = {Kevin Duh and
Graham Neubig and
Katsuhito Sudoh and
Hajime Tsukada},
title = {Adaptation Data Selection using Neural Language Models: Experiments
in Machine Translation},
pages = {678--683},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2013}
}
@inproceedings{DBLP:conf/coling/HoangS14,
author = {Cuong Hoang and
Khalil Sima'an},
......@@ -11110,33 +10673,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2012}
}
@inproceedings{DBLP:conf/wmt/FosterK07,
author = {George F. Foster and
Roland Kuhn},
title = {Mixture-Model Adaptation for {SMT}},
pages = {128--135},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2007}
}
@inproceedings{DBLP:conf/emnlp/MatsoukasRZ09,
author = {Spyros Matsoukas and
Antti-Veikko I. Rosti and
Bing Zhang},
title = {Discriminative Corpus Weight Estimation for Machine Translation},
pages = {708--717},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2009}
}
@inproceedings{DBLP:conf/emnlp/FosterGK10,
author = {George F. Foster and
Cyril Goutte and
Roland Kuhn},
title = {Discriminative Instance Weighting for Domain Adaptation in Statistical
Machine Translation},
pages = {451--459},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2010}
}
@inproceedings{DBLP:conf/wmt/ShahBS10,
author = {Kashif Shah and
Lo{\"{\i}}c Barrault and
......@@ -11152,24 +10688,6 @@ author = {Zhuang Liu and
publisher={International Workshop on Spoken Language Translation},
year={2011}
}
@inproceedings{DBLP:conf/lrec/EckVW04,
author = {Matthias Eck and
Stephan Vogel and
Alex Waibel},
title = {Language Model Adaptation for Statistical Machine Translation Based
on Information Retrieval},
publisher = {European Language Resources Association},
year = {2004}
}
@inproceedings{DBLP:conf/coling/ZhaoEV04,
author = {Bing Zhao and
Matthias Eck and
Stephan Vogel},
title = {Language Model Adaptation for Statistical Machine Translation via
Structured Query Models},
publisher = {International Conference on Computational Linguistics},
year = {2004}
}
@article{moore2010intelligent,
title = {Intelligent selection of language model training data},
author = {Moore, Robert C and Lewis, Will},
......@@ -11311,12 +10829,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{2019Non,
title={Non-Parametric Adaptation for Neural Machine Translation},
author={Bapna, Ankur and Firat, Orhan },
booktitle={Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year={2019},
}
@inproceedings{britz2017effective,
title={Effective domain mixing for neural machine translation},
author={Britz, Denny and Le, Quoc and Pryzant, Reid},
......@@ -11472,17 +10984,6 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@article{DBLP:journals/corr/abs-1906-03129,
author = {Shen Yan and
Leonard Dahlmann and
Pavel Petrushkov and
Sanjika Hewavitharana and
Shahram Khadivi},
title = {Word-based Domain Adaptation for Neural Machine Translation},
journal = {CoRR},
volume = {abs/1906.03129},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/WeesBM17,
author = {Marlies van der Wees and
Arianna Bisazza and
......@@ -11514,15 +11015,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/DomhanH17,
author = {Tobias Domhan and
Felix Hieber},
title = {Using Target-side Monolingual Data for Neural Machine Translation
through Multi-task Learning},
pages = {1500--1505},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2017}
}
@inproceedings{DBLP:conf/naacl/BapnaF19,
author = {Ankur Bapna and
Orhan Firat},
......@@ -11531,8 +11023,6 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@article{DBLP:journals/corr/abs-2010-11125,
author = {Angela Fan and
Shruti Bhosale and
......@@ -11570,7 +11060,6 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2020}
}
@inproceedings{DBLP:conf/emnlp/ZhuH07,
author = {Jingbo Zhu and
Eduard H. Hovy},
......@@ -11604,8 +11093,6 @@ author = {Zhuang Liu and
publisher = {AAAI Conference on Artificial Intelligence},
year = {2018}
}
@inproceedings{DBLP:conf/wmt/SunJXHWW19,
author = {Meng Sun and
Bojian Jiang and
......@@ -11618,8 +11105,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/acl/SuHC19,
author = {Shang-Yu Su and
Chao-Wei Huang and
......@@ -11629,8 +11114,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@article{DBLP:journals/ejasmp/RadzikowskiNWY19,
author = {Kacper Radzikowski and
Robert Nowak and
......@@ -11670,6 +11153,155 @@ author = {Zhuang Liu and
pages = {170248--170260},
year = {2020}
}
@inproceedings{DBLP:conf/acl/MarieRF20,
author = {Benjamin Marie and
Raphael Rubino and
Atsushi Fujita},
title = {Tagged Back-translation Revisited: Why Does It Really Work?},
pages = {5990--5997},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:conf/nips/YangDYCSL19,
author = {Zhilin Yang and
Zihang Dai and
Yiming Yang and
Jaime G. Carbonell and
Ruslan Salakhutdinov and
Quoc V. Le},
title = {XLNet: Generalized Autoregressive Pretraining for Language Understanding},
pages = {5754--5764},
year = {2019}
}
@article{lewis2019bart,
title={{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Ves and Zettlemoyer, Luke},
journal={CoRR},
volume={abs/1910.13461},
year={2019}
}
@inproceedings{DBLP:conf/iclr/LanCGGSS20,
author = {Zhenzhong Lan and
Mingda Chen and
Sebastian Goodman and
Kevin Gimpel and
Piyush Sharma and
Radu Soricut},
title = {{ALBERT:} {A} Lite {BERT} for Self-supervised Learning of Language
Representations},
publisher = {International Conference on Learning Representations},
year = {2020}
}
@inproceedings{DBLP:conf/acl/ZhangHLJSL19,
author = {Zhengyan Zhang and
Xu Han and
Zhiyuan Liu and
Xin Jiang and
Maosong Sun and
Qun Liu},
title = {{ERNIE:} Enhanced Language Representation with Informative Entities},
pages = {1441--1451},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/emnlp/HuangLDGSJZ19,
author = {Haoyang Huang and
Yaobo Liang and
Nan Duan and
Ming Gong and
Linjun Shou and
Daxin Jiang and
Ming Zhou},
title = {Unicoder: {A} Universal Language Encoder by Pre-training with Multiple
Cross-lingual Tasks},
pages = {2485--2494},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2019}
}
@inproceedings{DBLP:conf/iccv/SunMV0S19,
author = {Chen Sun and
Austin Myers and
Carl Vondrick and
Kevin Murphy and
Cordelia Schmid},
title = {VideoBERT: {A} Joint Model for Video and Language Representation Learning},
pages = {7463--7472},
publisher = {International Conference on Computer Vision},
year = {2019}
}
@article{DBLP:journals/corr/abs-2010-12831,
author = {Liunian Harold Li and
Haoxuan You and
Zhecan Wang and
Alireza Zareian and
Shih-Fu Chang and
Kai-Wei Chang},
title = {Weakly-supervised VisualBERT: Pre-training without Parallel Images
and Captions},
journal = {CoRR},
volume = {abs/2010.12831},
year = {2020}
}
@inproceedings{DBLP:conf/nips/LuBPL19,
author = {Jiasen Lu and
Dhruv Batra and
Devi Parikh and
Stefan Lee},
title = {ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations
for Vision-and-Language Tasks},
publisher = {Annual Conference and Workshop on Neural Information Processing Systems},
pages = {13--23},
year = {2019}
}
@inproceedings{DBLP:conf/interspeech/ChuangLLL20,
author = {Yung-Sung Chuang and
Chi-Liang Liu and
Hung-yi Lee and
Lin-Shan Lee},
title = {SpeechBERT: An Audio-and-Text Jointly Learned Language Model for End-to-End
Spoken Question Answering},
pages = {4168--4172},
publisher = {Annual Conference of the International Speech Communication Association},
year = {2020}
}
@inproceedings{DBLP:conf/rep4nlp/PetersRS19,
author = {Matthew E. Peters and
Sebastian Ruder and
Noah A. Smith},
title = {To Tune or Not to Tune? Adapting Pretrained Representations to Diverse
Tasks},
pages = {7--14},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/cncl/SunQXH19,
author = {Chi Sun and
Xipeng Qiu and
Yige Xu and
Xuanjing Huang},
title = {How to Fine-Tune {BERT} for Text Classification?},
volume = {11856},
pages = {194--206},
publisher = {Springer},
year = {2019}
}
@inproceedings{shen2020q,
title={{Q-BERT:} Hessian Based Ultra Low Precision Quantization of {BERT}},
author={Shen, Sheng and Dong, Zhen and Ye, Jiayu and Ma, Linjian and Yao, Zhewei and Gholami, Amir and Mahoney, Michael W and Keutzer, Kurt},
booktitle={AAAI Conference on Artificial Intelligence},
pages={8815--8821},
year={2020}
}
@article{DBLP:journals/corr/abs-1910-01108,
author = {Victor Sanh and
Lysandre Debut and
Julien Chaumond and
Thomas Wolf},
title = {DistilBERT, a distilled version of {BERT:} smaller, faster, cheaper
and lighter},
journal = {CoRR},
volume = {abs/1910.01108},
year = {2019}
}
%%%%% chapter 16------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......