Commit c053ef8a by zengxin

Merge branch 'caorunzhe' into 'zengxin'

Caorunzhe

See merge request !432
parents 984ce1bd 583c3721
......@@ -567,7 +567,7 @@
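\parinterval Convolution is an efficient way to process grid-structured data, and it has achieved remarkable results in areas such as image and speech processing. This chapter introduced the concept of convolution and its properties, and discussed operations such as pooling and padding in detail. The translation models based on recurrent neural networks introduced earlier, once equipped with the attention mechanism, substantially outperform statistical machine translation models; however, the sequential way in which recurrent neural networks compute gives the network poor parallelism and makes training time-consuming. This chapter introduced a model paradigm with a high degree of parallelism, namely the encoder-decoder framework based on convolutional neural networks. On machine translation tasks it achieves performance comparable to the GNMT model based on recurrent neural networks while greatly shortening the training time. Beyond the basics, this chapter also covered extensions of the convolution operation, including depthwise convolution, pointwise convolution, lightweight convolution, and dynamic convolution. In addition to the content above, convolutional neural networks and their variants have many applications in other natural language processing tasks, such as text classification and named entity recognition.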
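\parinterval Unlike machine translation, text classification focuses on extracting features from a sequence and then predicting a class from the compressed feature representation. Convolutional neural networks can extract $n$-gram features from a sequence, so they can also be applied to text classification; the basic architecture consists of an input layer, convolutional layers, pooling layers, and a fully connected layer. Besides the TextCNN model introduced in this chapter\upcite{Kim2014ConvolutionalNN}, many studies have improved on it, for example by changing the input layer to introduce more features\upcite{DBLP:conf/acl/NguyenG15,DBLP:conf/aaai/LaiXLZ15}, by improving the convolutional layer\upcite{DBLP:conf/acl/ChenXLZ015,DBLP:conf/emnlp/LeiBJ15}, and by improving the pooling layer\upcite{Kalchbrenner2014ACN,DBLP:conf/acl/ChenXLZ015}. In named entity recognition, convolutional neural networks can likewise be used for feature extraction\upcite{DBLP:journals/jmlr/CollobertWBKKK11,DBLP:conf/cncl/ZhouZXQBX17}, or the more efficient dilated convolution can be used to model longer contexts\upcite{DBLP:conf/emnlp/StrubellVBM17}. In addition, some studies have tried using convolutional neural networks to extract character-level features\upcite{DBLP:conf/acl/MaH16,DBLP:conf/emnlp/LiDWCM17,DBLP:conf/acl-codeswitch/WangCK18}.
\parinterval Unlike machine translation, text classification focuses on extracting features from a sequence and then predicting a class from the compressed feature representation. Convolutional neural networks can extract $n$-gram features from a sequence, so they can also be applied to text classification; the basic architecture consists of an input layer, convolutional layers, pooling layers, and a fully connected layer. Besides the TextCNN model introduced in this chapter\upcite{Kim2014ConvolutionalNN}, many studies have improved on it, for example by changing the input layer to introduce more features\upcite{DBLP:conf/acl/NguyenG15,DBLP:conf/aaai/LaiXLZ15}, by improving the convolutional layer\upcite{DBLP:conf/acl/ChenXLZ015,DBLP:conf/emnlp/LeiBJ15}, and by improving the pooling layer\upcite{Kalchbrenner2014ACN,DBLP:conf/acl/ChenXLZ015}. In named entity recognition, convolutional neural networks can likewise be used for feature extraction\upcite{2011Natural,DBLP:conf/cncl/ZhouZXQBX17}, or the more efficient dilated convolution can be used to model longer contexts\upcite{DBLP:conf/emnlp/StrubellVBM17}. In addition, some studies have tried using convolutional neural networks to extract character-level features\upcite{DBLP:conf/acl/MaH16,DBLP:conf/emnlp/LiDWCM17,DBLP:conf/acl-codeswitch/WangCK18}.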
......
\begin{tikzpicture}
\begin{scope}
\node [anchor=center] (node1) at (4.9,1) {\small{Training:}};
\node [anchor=center] (node11) at (5.5,1) {};
\node [anchor=center] (node12) at (6.7,1) {};
\node [anchor=center] (node2) at (4.9,0.5) {\small{Inference:}};
\node [anchor=center] (node21) at (5.5,0.5) {};
\node [anchor=center] (node22) at (6.7,0.5) {};
\node [anchor=west,line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=blue!20,rounded corners=2pt] (node1-1) at (0,0) {\footnotesize{Bilingual data}};
\node [anchor=south,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=blue!20,rounded corners=2pt] (node1-2) at ([yshift=-5em]node1-1.south) {\footnotesize{Target-side pseudo data}};
\node [anchor=west,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=red!20,rounded corners=2pt] (node2-1) at ([xshift=-8.8em,yshift=-2.5em]node1-1.west) {\footnotesize{Backward NMT system}};
\node [anchor=west,line width=0.6pt,draw=black,minimum width=4.5em,minimum height=2.2em,fill=red!20,rounded corners=2pt] (node3-1) at ([xshift=3em,yshift=-2.5em]node1-1.east) {\footnotesize{Forward NMT system}};
\draw [->,line width=1pt](node1-1.west)--([xshift=3em]node2-1.north);
\draw [->,line width=1pt](node1-1.east)--([xshift=-3em]node3-1.north);
\draw [->,line width=1pt](node1-2.east)--([xshift=-3em]node3-1.south);
\draw [->,line width=1pt](node11.east)--(node12.west);
\draw [->,line width=1pt,dashed](node21.east)--(node22.west);
\draw [->,line width=1pt,dashed]([xshift=3em]node2-1.south)--(node1-2.west);
\end{scope}
\tikzstyle{bignode} = [line width=0.6pt,draw=black,minimum width=6.3em,minimum height=2.2em,fill=blue!20,rounded corners=2pt]
\tikzstyle{middlenode} = [line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=blue!20,rounded corners=2pt]
\node [anchor=center] (node1-1) at (0,0) {\scriptsize{Chinese}};
\node [anchor=west] (node1-2) at ([xshift=1.0em]node1-1.east) {\scriptsize{English}};
\node [anchor=north] (node1-3) at ([xshift=1.65em]node1-1.south) {\scriptsize{Backward translation model}};
\draw [->,line width=0.6pt](node1-1.east)--(node1-2.west);
\begin{pgfonlayer}{background}
{
\node[fill=red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em] [fit =(node1-1)(node1-2)(node1-3)] (remark1) {};
}
\end{pgfonlayer}
\node [anchor=north](node2-1) at ([xshift=-1.93em,yshift=-1.95em]remark1.south){\scriptsize{Chinese}};
\node [anchor=north](node2-1-2) at (node2-1.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node2-1)(node2-1-2)] (remark2-1) {};
}
\end{pgfonlayer}
\node [anchor=west](node2-2) at ([xshift=0.82em,yshift=0.68em]remark2-1.east){\scriptsize{English}};
\node [anchor=north](node2-2-2) at (node2-2.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node2-2)(node2-2-2)] (remark2-2) {};
}
\end{pgfonlayer}
\draw [->,line width=0.6pt]([yshift=-2.0em]remark1.south)--(remark1.south) node [pos=0.5,right] (pos1) {\scriptsize{Train}};
\node [anchor=west](node3-1) at ([xshift=5.0em,yshift=0.1em]node1-2.east){\scriptsize{Chinese}};
\node [anchor=north](node3-1-2) at (node3-1.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node3-1)(node3-1-2)] (remark3-1) {};
}
\end{pgfonlayer}
\node [anchor=north](node3-2) at ([yshift=-2.15em]remark3-1.south){\scriptsize{English}};
\node [anchor=north](node3-2-2) at (node3-2.south){\scriptsize{Pseudo data}};
\begin{pgfonlayer}{background}
{
\node[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node3-2)(node3-2-2)] (remark3-2) {};
}
\end{pgfonlayer}
\draw [->,line width=0.6pt](remark3-1.south)--(remark3-2.north) node [pos=0.5,right] (pos2) {\scriptsize{Translate}};
\begin{pgfonlayer}{background}
{
\node[rounded corners=2pt,inner sep=0.3em,draw=black,line width=0.6pt,dotted] [fit =(remark3-1)(remark3-2)] (remark2) {};
}
\end{pgfonlayer}
\draw [->,line width=0.6pt](remark1.east)--([yshift=2.40em]remark2.west) node [pos=0.5,above] (pos2) {\scriptsize{backward model}};
\node [anchor=south](pos2-2) at ([yshift=-0.5em]pos2.north){\scriptsize{Translate with}};
\draw[decorate,thick,decoration={brace,amplitude=5pt}] ([yshift=1.3em,xshift=1.5em]node3-1.east) -- ([yshift=-7.7em,xshift=1.5em]node3-1.east) node [pos=0.1,right,xshift=0.0em,yshift=0.0em] (label1) {\scriptsize{{Mix}}};
\node [anchor=west](node4-1) at ([xshift=3.5em,yshift=3.94em]node3-2.east){\scriptsize{English}};
\node [anchor=north](node4-1-2) at (node4-1.south){\scriptsize{Pseudo data}};
\begin{pgfonlayer}{background}
{
\node[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-1)(node4-1-2)] (remark4-1) {};
}
\end{pgfonlayer}
\node [anchor=north](node4-2) at ([yshift=-1.59em]node4-1.south){\scriptsize{English}};
\node [anchor=north](node4-2-2) at (node4-2.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-2)(node4-2-2)] (remark4-2) {};
}
\end{pgfonlayer}
\node [anchor=west](node4-3) at ([xshift=1.7em]node4-2.east){\scriptsize{Chinese}};
\node [anchor=north](node4-3-2) at (node4-3.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-3)(node4-3-2)] (remark4-3) {};
}
\end{pgfonlayer}
\node [anchor=west](node4-4) at ([xshift=1.7em]node4-1.east){\scriptsize{Chinese}};
\node [anchor=north](node4-4-2) at (node4-4.south){\scriptsize{Real data}};
\begin{pgfonlayer}{background}
{
\node[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em] [fit =(node4-4)(node4-4-2)] (remark4-3) {};
}
\end{pgfonlayer}
\node [anchor=center] (node5-1) at ([xshift=4.3em,yshift=-1.48em]node4-4.east) {\scriptsize{English}};
\node [anchor=west] (node5-2) at ([xshift=1.0em]node5-1.east) {\scriptsize{Chinese}};
\node [anchor=north] (node5-3) at ([xshift=1.65em]node5-1.south) {\scriptsize{Forward translation model}};
\draw [->,line width=0.6pt](node5-1.east)--(node5-2.west);
\begin{pgfonlayer}{background}
{
\node[fill=red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em] [fit =(node5-1)(node5-2)(node5-3)] (remark3) {};
}
\end{pgfonlayer}
\draw [->,line width=0.6pt]([xshift=-2em]remark3.west)--(remark3.west) node [pos=0.5,above] (pos3) {\scriptsize{Train}};
\end{tikzpicture}
\ No newline at end of file
......@@ -235,7 +235,7 @@
\end{pgfonlayer}
{\scriptsize
\node [anchor=center] (cy00-2) at ([xshift=6.7em,yshift=0.2em]pos4-212) {\tiny{TopK}};
\node [anchor=center] (cy00-2) at ([xshift=6.7em,yshift=0.2em]pos4-212) {\tiny{$n$-best}};
\node [anchor=center,minimum height=1.8em,minimum width=0.8em,fill=orange!30] (cy11-2) at ([xshift=0.0em,yshift=-1.8em]pos4-212) {};
\node [anchor=center,minimum height=1.5em,minimum width=0.8em,fill=blue!30] (cy12-2) at ([xshift=1.3em,yshift=-0.15em]cy11-2) {};
\node [anchor=center,minimum height=2.5em,minimum width=0.8em,fill=black!30] (cy13-2) at ([xshift=1.3em,yshift=0.5em]cy12-2) {};
......
......@@ -5,7 +5,7 @@
\node [rectangle,inner sep=2pt,font=\scriptsize] (top) at ([yshift=3em,xshift=0em]center.north) {
\begin{tabular}{c}
Translation model \\
$\textrm{P}(\mathbf t|\mathbf s)$
$\textrm{P}(\ \mathbi{y}|\ \mathbi{x})$
\end{tabular}
};
......@@ -24,7 +24,7 @@ The weather is \\so good today.
\node [rectangle,inner sep=2pt,font=\scriptsize] (down) at ([yshift=-3em,xshift=0em]center.south) {
\begin{tabular}{c}
Translation model \\
$\textrm{P}(\mathbf s|\mathbf t)$
$\textrm{P}(\ \mathbi{x}|\ \mathbi{y})$
\end{tabular}
};
......
\begin{tikzpicture}
\begin{scope}
\node [anchor=center] (node1) at (-2.3,0) {\small{$x,y$: bilingual data}};
\node [anchor=center] (node2) at (-2.1,-0.5) {\small{$z$}: monolingual data};
\node [anchor=center] (node1-1) at (0,0) {\small{$y'$}};
\node [anchor=center] (node3-1) at ([xshift=5.5em,yshift=-0.1em]node1-1.east) {\small{$z'$}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node1-2) at ([yshift=-3em]node1-1.south) {\small{softmax}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node3-2) at ([yshift=-3em]node3-1.south) {\small{softmax}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node1-3) at ([yshift=-4.0em]node1-2.south) {\small{Decoder}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=yellow!20](node3-3) at ([yshift=-4.0em]node3-2.south) {\small{LM}};
\node[anchor=south](node1-4) at ([xshift=-0.6em,yshift=-3em]node1-3.south) {\gray{\small{$y$}}};
\node[anchor=south](node3-41) at ([xshift=-0.6em,yshift=-3em]node3-3.south) {\small{$y$}};
\node[anchor=south](node3-42) at ([xshift=0.6em,yshift=-2.9em]node3-3.south) {\small{$z$}};
\node[anchor=west](node2-2) at ([xshift=-4.9em]node1-4.west) {\small{$x$}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1) at ([yshift=4em]node2-2.north) {\small{Encoder}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node1-3) at ([yshift=-2.0em]node1-2.south) {\small{Decoder}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=yellow!20](node3-3) at ([yshift=-2.0em]node1-3.south) {\small{LM}};
\node[anchor=west,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node3-2) at ([xshift=2em]node3-3.east) {\small{softmax}};
\node [anchor=north] (node3-1) at ([yshift=3.0em]node3-2.north) {\small{$z'$}};
\node[anchor=north](node3-41) at ([xshift=-0.6em,yshift=-2em]node3-3.south) {\small{$y$}};
\node[anchor=north](node3-42) at ([xshift=0.6em,yshift=-2em]node3-3.south) {\small{$z$}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1) at ([xshift=-2em]node1-3.west) {\small{Encoder}};
\node[anchor=north](node2-2) at ([yshift=-2em]node2-1.south) {\small{$x$}};
\node [rectangle,rounded corners,draw=red,line width=0.2mm,densely dashed,inner sep=0.4em] [fit = (node3-2) (node3-3)] (inputshadow) {};
\draw [->,thick,draw=gray](node1-4.north)--([xshift=-0.6em]node1-3.south);
\draw [->,thick](node1-3.north)--(node1-2);
\draw [->,thick](node1-2.north)--(node1-1);
\draw [->,thick](node2-2.north)--(node2-1);
......@@ -24,11 +27,31 @@
\draw [->,thick](node3-41.north)--([xshift=-0.6em]node3-3.south);
\draw [->,thick](node3-42.north)--([xshift=0.6em]node3-3.south);
\draw [->,thick]([xshift=0.6em]node3-3.north)--([xshift=0.6em]node3-2.south);
\draw [->,thick](node3-3.north)--(node1-3.south);
\draw [->,thick](node3-2.north)--(node3-1);
\draw[->,thick]([xshift=-0.6em]node3-3.north)--([xshift=-0.6em,yshift=0.6em]node3-3.north)--([xshift=-3em,yshift=0.6em]node3-3.north)--([xshift=-3em,yshift=-3em]node3-3.north)--([xshift=-5.6em,yshift=-3em]node3-3.north)--([xshift=0.6em]node1-3.south);
\draw[->,thick](node3-3.east)--(node3-2.west);
\node [anchor=east] (node2-1-1) at ([xshift=-12.0em,yshift=-4.25em]node1-1.west) {\small{$y'$}};
\node[anchor=south,line width=0.6pt,draw,rounded corners,minimum height=1.5em,minimum width=4em,fill=blue!20](node2-1-2) at ([yshift=-3em]node2-1-1.south) {\small{softmax}};
\node[anchor=north,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-1-3) at ([yshift=-2.0em]node2-1-2.south) {\small{Decoder}};
\node[anchor=east,line width=0.6pt,draw,rounded corners,minimum height=2.2em,minimum width=4em,fill=red!20](node2-2-1) at ([xshift=-2em]node2-1-3.west) {\small{Encoder}};
\node[anchor=north](node2-2-2) at ([yshift=-2em]node2-2-1.south) {\small{$x$}};
\node[anchor=north](node2-2-3) at ([yshift=-2em]node2-1-3.south) {\small{$y$}};
\draw [->,thick](node2-1-2.north)--(node2-1-1);
\draw [->,thick](node2-2-2.north)--(node2-2-1);
\draw[->,thick](node2-2-1.east)--(node2-1-3.west);
\draw [->,thick](node2-1-3.north)--(node2-1-2.south);
\draw [->,thick](node2-2-3.north)--(node2-1-3);
\node [anchor=east] (node1) at ([xshift=-2.0em,yshift=4em]node2-1-1.west) {\small{$x,y$: bilingual data}};
\node [anchor=north] (node2) at ([xshift=0.45em]node1.south) {\small{$z$}: monolingual data};
\node [anchor=north](pos1) at ([yshift=-3.5em]node3-3.south) {\small{(b) Multi-task learning}};
\node [anchor=east](pos2) at ([xshift=-10.0em]pos1.west) {\small{(a) Single-task learning}};
%\draw[->](node2-1.north)--([yshift=1em]node2-1.north)--([xshift=2.5em,yshift=1em]node2-1.north)--([xshift=2.5em,yshift=-0.4em]node2-1.north)--(node1-3.west);
\end{scope}
\end{tikzpicture}
\ No newline at end of file
......@@ -20,8 +20,8 @@
\node [anchor=west] (a15-2) at ([xshift=-4.25em]a15.west) {\tiny{$\cdots$}};
\node [anchor=east] (a13-3) at ([yshift=0.8em]a13-2.west) {\small{Unsupervised language}};
\node [anchor=north] (a13-4) at ([xshift=0em]a13-3.south) {\small{model hidden layer}};
\node [anchor=east] (a13-3) at ([yshift=0.8em]a13-2.west) {\small{Language model}};
\node [anchor=north] (a13-4) at ([xshift=0em]a13-3.south) {\small{hidden layer}};
\node [anchor=east] (a14-3) at ([yshift=0.8em]a14-2.west) {\small{Neural machine translation}};
\node [anchor=north] (a14-4) at ([xshift=0.5em]a14-3.south) {\small{model hidden layer}};
......
This source diff could not be displayed because it is too large. You can view the blob instead.
......@@ -139,14 +139,14 @@
%\include{Chapter6/chapter6}
%\include{Chapter7/chapter7}
%\include{Chapter8/chapter8}
\include{Chapter9/chapter9}
\include{Chapter10/chapter10}
\include{Chapter11/chapter11}
\include{Chapter12/chapter12}
%\include{Chapter9/chapter9}
%\include{Chapter10/chapter10}
%\include{Chapter11/chapter11}
%\include{Chapter12/chapter12}
%\include{Chapter13/chapter13}
%\include{Chapter14/chapter14}
%\include{Chapter15/chapter15}
%\include{Chapter16/chapter16}
\include{Chapter16/chapter16}
%\include{Chapter17/chapter17}
%\include{Chapter18/chapter18}
%\include{ChapterAppend/chapterappend}
......