Commit 30677034 by zengxin

Merge branch 'caorunzhe' into 'zengxin'

Caorunzhe

See merge request !844
parents 8f0e83e9 6d268143
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{tnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=ugreen!20] \tikzstyle{tnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=green!20]
\tikzstyle{pnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=yellow!20] \tikzstyle{pnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=yellow!30]
\tikzstyle{mnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=red!20] \tikzstyle{mnode} = [rectangle,inner sep=0em,minimum width=8em,minimum height=6.6em,rounded corners=5pt,fill=red!20]
\tikzstyle{wnode} = [inner sep=0em,minimum height=1.5em] \tikzstyle{wnode} = [inner sep=0em,minimum height=1.5em]
...@@ -19,7 +19,7 @@ ...@@ -19,7 +19,7 @@
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [rectangle,inner sep=0.7em,draw,ugreen!40,dashed,very thick,rounded corners=7pt] [fit = (n1) (n4)] (box1) {}; \node [rectangle,inner sep=0.7em,draw,ugreen!60,dashed,very thick,rounded corners=7pt] [fit = (n1) (n4)] (box1) {};
\end{pgfonlayer} \end{pgfonlayer}
\node [anchor=west,align=left,font=\footnotesize] (nt1) at ([xshift=0.1em,yshift=0em]n2.east) {统计词表和\\[0.5ex]词频}; \node [anchor=west,align=left,font=\footnotesize] (nt1) at ([xshift=0.1em,yshift=0em]n2.east) {统计词表和\\[0.5ex]词频};
...@@ -75,7 +75,7 @@ ...@@ -75,7 +75,7 @@
\node [anchor=east,ublue,align=left,font=\footnotesize] (l3) at ([xshift=-0.5em,yshift=0em]cd.west) {直至达到设定的符号合\\并表大小或无法合并}; \node [anchor=east,ublue,align=left,font=\footnotesize] (l3) at ([xshift=-0.5em,yshift=0em]cd.west) {直至达到设定的符号合\\并表大小或无法合并};
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [rectangle,inner sep=0.7em,draw,yellow!40,dashed,very thick,rounded corners=7pt] [fit = (n5) (n8) (l3) (cd)] (box2) {}; \node [rectangle,inner sep=0.7em,draw,orange!40,dashed,very thick,rounded corners=7pt] [fit = (n5) (n8) (l3) (cd)] (box2) {};
\end{pgfonlayer} \end{pgfonlayer}
%第五排 %第五排
......
...@@ -4,10 +4,10 @@ ...@@ -4,10 +4,10 @@
\tikzstyle{node}=[inner sep=0mm,minimum height=3em,minimum width=6em,rounded corners=5pt] \tikzstyle{node}=[inner sep=0mm,minimum height=3em,minimum width=6em,rounded corners=5pt]
\node[anchor=west,node,fill=ugreen!30] (n1) at (0,0) {训练集}; \node[anchor=west,node,fill=ugreen!15] (n1) at (0,0) {训练集};
\node[anchor=west,node,fill=yellow!30] (n2) at ([xshift=2em,yshift=0em]n1.east) {难度评估器}; \node[anchor=west,node,fill=yellow!15] (n2) at ([xshift=2em,yshift=0em]n1.east) {难度评估器};
\node[anchor=west,node,fill=red!30] (n3) at ([xshift=4em,yshift=0em]n2.east) {训练调度器}; \node[anchor=west,node,fill=red!15] (n3) at ([xshift=4em,yshift=0em]n2.east) {训练调度器};
\node[anchor=west,node,fill=blue!30] (n4) at ([xshift=4em,yshift=0em]n3.east) {模型训练器}; \node[anchor=west,node,fill=blue!15] (n4) at ([xshift=4em,yshift=0em]n3.east) {模型训练器};
\draw [->,very thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west); \draw [->,very thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west);
\draw [->,very thick] ([xshift=0em,yshift=0em]n2.east) -- ([xshift=0em,yshift=0em]n3.west); \draw [->,very thick] ([xshift=0em,yshift=0em]n2.east) -- ([xshift=0em,yshift=0em]n3.west);
...@@ -23,8 +23,8 @@ ...@@ -23,8 +23,8 @@
\draw [->,dotted,very thick] ([xshift=0em,yshift=0em]n4.north) -- ([xshift=0em,yshift=1em]n4.north) -- ([xshift=0em,yshift=1em]n3.north) -- (n3.north); \draw [->,dotted,very thick] ([xshift=0em,yshift=0em]n4.north) -- ([xshift=0em,yshift=1em]n4.north) -- ([xshift=0em,yshift=1em]n3.north) -- (n3.north);
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node[rectangle,inner sep=5pt,rounded corners=5pt,fill=gray!30] [fit = (n3) (n4) (n6) (n8) ] (g2) {}; \node[rectangle,inner sep=5pt,rounded corners=5pt,fill=gray!15] [fit = (n3) (n4) (n6) (n8) ] (g2) {};
\node[rectangle,inner sep=5pt,rounded corners=5pt,fill=orange!30] [fit = (n2) (n3) (n9) ] (g1) {}; \node[rectangle,inner sep=5pt,rounded corners=5pt,fill=orange!15] [fit = (n2) (n3) (n9) ] (g1) {};
\end{pgfonlayer} \end{pgfonlayer}
......
%------------------------------------------------------------ %------------------------------------------------------------
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{rnnnode} = [draw,inner sep=4pt,minimum width=2em,minimum height=2em,rounded corners=1pt,fill=yellow!20] \tikzstyle{rnnnode} = [draw,inner sep=4pt,minimum width=2em,minimum height=2em,rounded corners=1pt,fill=green!20]
\tikzstyle{snode} = [draw,inner sep=4pt,minimum width=2em,minimum height=2em,rounded corners=1pt,fill=red!20] \tikzstyle{snode} = [draw,inner sep=4pt,minimum width=2em,minimum height=2em,rounded corners=1pt,fill=red!20]
\tikzstyle{wode} = [inner sep=0pt,minimum width=2em,minimum height=2em,rounded corners=0pt] \tikzstyle{wode} = [inner sep=0pt,minimum width=2em,minimum height=2em,rounded corners=0pt]
......
...@@ -4,8 +4,8 @@ ...@@ -4,8 +4,8 @@
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{rnnnode} = [draw,inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt,fill=red!20] \tikzstyle{rnnnode} = [draw,inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt,fill=red!15]
\tikzstyle{snode} = [draw,inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt,fill=blue!20] \tikzstyle{snode} = [draw,inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt,fill=blue!15]
\tikzstyle{ynode} = [inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt] \tikzstyle{ynode} = [inner sep=2pt,minimum width=4em,minimum height=2em,rounded corners=1pt]
......
\begin{tikzpicture} \begin{tikzpicture}
\node[anchor=west,inner sep=0mm,minimum height=4em,minimum width=5.5em,rounded corners=15pt,align=left,draw,fill=red!20] (n1) at (0,0) {Decoder\\Encoder}; \node[anchor=west,inner sep=0mm,minimum height=4em,minimum width=5.5em,rounded corners=15pt,align=left,draw,fill=red!15] (n1) at (0,0) {Decoder\\Encoder};
\node[anchor=west,inner sep=0mm,minimum height=4em,minimum width=5.5em,rounded corners=15pt,align=left,draw,fill=green!20] (n2) at ([xshift=10em,yshift=0em]n1.east) {Decoder\\Encoder}; \node[anchor=west,inner sep=0mm,minimum height=4em,minimum width=5.5em,rounded corners=15pt,align=left,draw,fill=green!15] (n2) at ([xshift=10em,yshift=0em]n1.east) {Decoder\\Encoder};
\node[anchor=south,inner sep=0mm,font=\small] (a1) at ([xshift=0em,yshift=1em]n1.north) {演员$p$}; \node[anchor=south,inner sep=0mm,font=\small] (a1) at ([xshift=0em,yshift=1em]n1.north) {演员$p$};
......
...@@ -10,12 +10,12 @@ ...@@ -10,12 +10,12 @@
\tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize]; \tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize];
\begin{scope} \begin{scope}
\node [system,fill=orange!20,draw] (model3) at (0,0) {模型 $3$}; \node [system,fill=yellow!30,draw] (model3) at (0,0) {模型 $3$};
\node [system,fill=ugreen!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$}; \node [system,fill=green!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$};
\node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$}; \node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$};
\node [output,fill=orange!20,draw,anchor=west] (output3) at ([xshift=0.8cm]model3.east) {输出 $3$}; \node [output,fill=yellow!30,draw,anchor=west] (output3) at ([xshift=0.8cm]model3.east) {输出 $3$};
\node [output,fill=ugreen!20,draw,anchor=west] (output2) at ([xshift=0.8cm]model2.east) {输出 $2$}; \node [output,fill=green!20,draw,anchor=west] (output2) at ([xshift=0.8cm]model2.east) {输出 $2$};
\node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.8cm]model1.east) {输出 $1$}; \node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.8cm]model1.east) {输出 $1$};
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
...@@ -40,15 +40,15 @@ ...@@ -40,15 +40,15 @@
\tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize]; \tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize];
\begin{scope} \begin{scope}
\node [system,fill=orange!20,draw] (model3) at (0,0) {模型 $3$}; \node [system,fill=yellow!30,draw] (model3) at (0,0) {模型 $3$};
\node [system,fill=ugreen!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$}; \node [system,fill=green!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$};
\node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$}; \node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$};
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [draw,thick,dashed,inner sep=3pt,fit=(model3) (model2) (model1)] (ensemble) {}; \node [draw,thick,dashed,inner sep=3pt,fit=(model3) (model2) (model1)] (ensemble) {};
\end{pgfonlayer} \end{pgfonlayer}
\node [system,fill=ugreen!20,draw,right=1cm of ensemble] (model) {模型}; \node [system,fill=green!20,draw,right=1cm of ensemble] (model) {模型};
\node [output,fill=cocoabrown!20,draw,minimum width=1.2cm,anchor=west] (final) at ([xshift=0.8cm]model.east) {最终\\输出}; \node [output,fill=cocoabrown!20,draw,minimum width=1.2cm,anchor=west] (final) at ([xshift=0.8cm]model.east) {最终\\输出};
...@@ -68,12 +68,12 @@ ...@@ -68,12 +68,12 @@
\tikzstyle{dot} = [circle,fill=blue!40!white,minimum size=5pt,inner sep=0pt]; \tikzstyle{dot} = [circle,fill=blue!40!white,minimum size=5pt,inner sep=0pt];
\begin{scope} \begin{scope}
\node [system,fill=orange!20,draw] (model3) at (0,0) {模型 $3$}; \node [system,fill=yellow!30,draw] (model3) at (0,0) {模型 $3$};
\node [system,fill=ugreen!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$}; \node [system,fill=green!20,draw,anchor=south] (model2) at ([yshift=0.5cm]model3.north) {模型 $2$};
\node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$}; \node [system,fill=red!20,draw,anchor=south] (model1) at ([yshift=0.5cm]model2.north) {模型 $1$};
\node [output,fill=orange!20,draw,anchor=west] (output3) at ([xshift=0.8cm]model3.east) {输出 $3$}; \node [output,fill=yellow!30,draw,anchor=west] (output3) at ([xshift=0.8cm]model3.east) {输出 $3$};
\node [output,fill=ugreen!20,draw,anchor=west] (output2) at ([xshift=0.8cm]model2.east) {输出 $2$}; \node [output,fill=green!20,draw,anchor=west] (output2) at ([xshift=0.8cm]model2.east) {输出 $2$};
\node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.8cm]model1.east) {输出 $1$}; \node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.8cm]model1.east) {输出 $1$};
\draw [->,very thick] (model1) to (output1); \draw [->,very thick] (model1) to (output1);
......
...@@ -5,12 +5,12 @@ ...@@ -5,12 +5,12 @@
\tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize]; \tikzstyle{output} = [rectangle,thick,rounded corners=3pt,minimum width=1.2cm,align=center,font=\scriptsize];
\begin{scope}[local bounding box=MULTIPLE] \begin{scope}[local bounding box=MULTIPLE]
\node [system,fill=orange!20,draw] (engine3) at (0,0) {系统 $n$}; \node [system,fill=yellow!30,draw] (engine3) at (0,0) {系统 $n$};
\node [system,fill=ugreen!20,draw,anchor=south] (engine2) at ([yshift=0.6cm]engine3.north) {系统 $2$}; \node [system,fill=green!20,draw,anchor=south] (engine2) at ([yshift=0.6cm]engine3.north) {系统 $2$};
\node [system,fill=red!20,draw,anchor=south] (engine1) at ([yshift=0.3cm]engine2.north) {系统 $1$}; \node [system,fill=red!20,draw,anchor=south] (engine1) at ([yshift=0.3cm]engine2.north) {系统 $1$};
\node [output,fill=orange!20,draw,anchor=west] (output3) at ([xshift=0.5cm]engine3.east) {输出 $n$}; \node [output,fill=yellow!30,draw,anchor=west] (output3) at ([xshift=0.5cm]engine3.east) {输出 $n$};
\node [output,fill=ugreen!20,draw,anchor=west] (output2) at ([xshift=0.5cm]engine2.east) {输出 $2$}; \node [output,fill=green!20,draw,anchor=west] (output2) at ([xshift=0.5cm]engine2.east) {输出 $2$};
\node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.5cm]engine1.east) {输出 $1$}; \node [output,fill=red!20,draw,anchor=west] (output1) at ([xshift=0.5cm]engine1.east) {输出 $1$};
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east); \draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east);
...@@ -25,11 +25,11 @@ ...@@ -25,11 +25,11 @@
\end{scope} \end{scope}
\begin{scope}[local bounding box=SINGLE] \begin{scope}[local bounding box=SINGLE]
\node [output,fill=ugreen!20,draw,anchor=west] (output3) at ([xshift=4cm]output3.east) {输出 $n$}; \node [output,fill=green!20,draw,anchor=west] (output3) at ([xshift=4cm]output3.east) {输出 $n$};
\node [output,fill=ugreen!20,draw,anchor=west] (output2) at ([xshift=4cm]output2.east) {输出 $2$}; \node [output,fill=green!20,draw,anchor=west] (output2) at ([xshift=4cm]output2.east) {输出 $2$};
\node [output,fill=ugreen!20,draw,anchor=west] (output1) at ([xshift=4cm]output1.east) {输出 $1$}; \node [output,fill=green!20,draw,anchor=west] (output1) at ([xshift=4cm]output1.east) {输出 $1$};
\node [system,fill=ugreen!20,draw,anchor=east,align=center,inner sep=1.9pt] (engine) at ([xshift=-0.5cm]output2.west) {单系统}; \node [system,fill=green!20,draw,anchor=east,align=center,inner sep=1.9pt] (engine) at ([xshift=-0.5cm]output2.west) {单系统};
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east); \draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east);
......
\begin{tikzpicture} \begin{tikzpicture}
%左 %左
\node [anchor=west,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part1) at (0,0) {\scriptsize{预测模块}}; \node [anchor=west,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=red!15,align=center,text=black] (part1) at (0,0) {\small{预测模块}};
\node [anchor=south] (text) at ([xshift=0.5em,yshift=-3.5em]part1.south) {\scriptsize{源语言句子(编码器输出)}}; \node [anchor=south] (text) at ([xshift=0.5em,yshift=-3.5em]part1.south) {\scriptsize{源语言句子(编码器输出)}};
\node [anchor=east,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part2) at ([xshift=10em]part1.east) {\scriptsize{搜索模块}}; \node [anchor=east,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=green!15,align=center,text=black] (part2) at ([xshift=10em]part1.east) {\small{搜索模块}};
\node [anchor=south] (text1) at ([xshift=0.1em,yshift=2.2em]part1.north) {\scriptsize{译文中已经生成的单词}}; \node [anchor=south] (text1) at ([xshift=0.1em,yshift=2.2em]part1.north) {\scriptsize{译文中已经生成的单词}};
\node [anchor=south] (text2) at ([xshift=0.5em,yshift=2.2em]part2.north) {\scriptsize{预测当前位置的单词概率分布}}; \node [anchor=south] (text2) at ([xshift=0.5em,yshift=2.2em]part2.north) {\scriptsize{预测当前位置的单词概率分布}};
......
...@@ -8,10 +8,10 @@ ...@@ -8,10 +8,10 @@
\tikzstyle{po} = [font=\scriptsize,rounded corners=1pt, fill=gray!20, minimum width=1.8em,minimum height=1.5em,draw] \tikzstyle{po} = [font=\scriptsize,rounded corners=1pt, fill=gray!20, minimum width=1.8em,minimum height=1.5em,draw]
\tikzstyle{tgt} = [minimum height=1.6em,minimum width=5.2em,fill=black!10!yellow!30,font=\footnotesize,drop shadow={shadow xshift=0.15em,shadow yshift=-0.15em,}] \tikzstyle{tgt} = [minimum height=1.6em,minimum width=5.2em,fill=black!10!yellow!30,font=\footnotesize,drop shadow={shadow xshift=0.15em,shadow yshift=-0.15em,}]
\tikzstyle{p} = [fill=ugreen!15,minimum width=0.4em,inner sep=0pt] \tikzstyle{p} = [fill=ugreen!15,minimum width=0.4em,inner sep=0pt]
\node[ rounded corners=3pt, fill=red!20, drop shadow, minimum width=12em,minimum height=4em,draw] (encoder) at (0,0) {编码器}; \node[ rounded corners=3pt, thick,fill=red!20, drop shadow, minimum width=12em,minimum height=4em,draw] (encoder) at (0,0) {编码器};
\node[anchor=north,rounded corners=3pt, fill=yellow!20, drop shadow, minimum width=12em,minimum height=2em,draw] (lenpre) at([yshift=3em]encoder.north){长度预测器}; \node[anchor=north,rounded corners=3pt, thick,fill=yellow!20, drop shadow, minimum width=12em,minimum height=2em,draw] (lenpre) at([yshift=3em]encoder.north){长度预测器};
\node[anchor=north] (lable) at([xshift=3.5em,yshift=2.5em]lenpre.north){译文长度:3}; \node[anchor=north] (lable) at([xshift=3.5em,yshift=2.5em]lenpre.north){译文长度:3};
\node[anchor=west, rounded corners=3pt, fill=blue!20, drop shadow, minimum width=13em,minimum height=4em,draw] (decoder) at ([xshift=1cm]encoder.east) {解码器}; \node[anchor=west, rounded corners=3pt, thick,fill=blue!20, drop shadow, minimum width=13em,minimum height=4em,draw] (decoder) at ([xshift=1cm]encoder.east) {解码器};
\node[anchor=north,emb] (en1) at ([yshift=-1.3em,xshift=-4.5em]encoder.south) {${\mathbi e}$(干)}; \node[anchor=north,emb] (en1) at ([yshift=-1.3em,xshift=-4.5em]encoder.south) {${\mathbi e}$(干)};
\node[anchor=north,emb] (en2) at ([yshift=-1.3em,xshift=-1.5em]encoder.south) {${\mathbi e}$(得)}; \node[anchor=north,emb] (en2) at ([yshift=-1.3em,xshift=-1.5em]encoder.south) {${\mathbi e}$(得)};
......
...@@ -7,10 +7,10 @@ ...@@ -7,10 +7,10 @@
\tikzstyle{emb} = [font=\scriptsize,rounded corners=1pt, fill=orange!20, minimum width=1.8em,minimum height=1.5em,draw] \tikzstyle{emb} = [font=\scriptsize,rounded corners=1pt, fill=orange!20, minimum width=1.8em,minimum height=1.5em,draw]
\tikzstyle{po} = [font=\scriptsize,rounded corners=1pt, fill=gray!20, minimum width=1.8em,minimum height=1.5em,draw] \tikzstyle{po} = [font=\scriptsize,rounded corners=1pt, fill=gray!20, minimum width=1.8em,minimum height=1.5em,draw]
\begin{scope} \begin{scope}
\node[rounded corners=3pt, fill=red!20, drop shadow, minimum width=10em,minimum height=4em,draw] (encoder) at (0,0) {编码器}; \node[rounded corners=3pt, thick,fill=red!20, drop shadow, minimum width=10em,minimum height=4em,draw] (encoder) at (0,0) {编码器};
\node[anchor=north,rounded corners=3pt, fill=yellow!20, drop shadow, minimum width=10em,minimum height=2em,draw] (lenpre) at([yshift=3em]encoder.north){长度预测器}; \node[anchor=north,rounded corners=3pt, thick,fill=yellow!20, drop shadow, minimum width=10em,minimum height=2em,draw] (lenpre) at([yshift=3em]encoder.north){长度预测器};
\node[anchor=north] (lable) at([xshift=3.5em,yshift=2.5em]lenpre.north){译文长度:4}; \node[anchor=north] (lable) at([xshift=3.5em,yshift=2.5em]lenpre.north){译文长度:4};
\node[anchor=west, rounded corners=3pt, fill=blue!20, drop shadow, minimum width=16em,minimum height=4em,draw] (decoder) at ([xshift=1.8cm]encoder.east) {解码器}; \node[anchor=west, rounded corners=3pt, thick,fill=blue!20, drop shadow, minimum width=16em,minimum height=4em,draw] (decoder) at ([xshift=1.8cm]encoder.east) {解码器};
\node[anchor=north,emb] (en2) at ([yshift=-1.3em]encoder.south) {${\mathbi e}(x_2)$}; \node[anchor=north,emb] (en2) at ([yshift=-1.3em]encoder.south) {${\mathbi e}(x_2)$};
\node[anchor=north,emb] (en1) at ([yshift=-1.3em,xshift=-3em]encoder.south) {${\mathbi e}(x_1)$}; \node[anchor=north,emb] (en1) at ([yshift=-1.3em,xshift=-3em]encoder.south) {${\mathbi e}(x_1)$};
...@@ -61,7 +61,7 @@ ...@@ -61,7 +61,7 @@
\end{scope} \end{scope}
\begin{scope}[yshift=2.8in] \begin{scope}[yshift=2.8in]
\node[rounded corners=3pt, fill=red!20, drop shadow, minimum width=10em,minimum height=4em,draw] (encoder) at (0,0) {编码器}; \node[rounded corners=3pt, thick,fill=red!20, drop shadow, minimum width=10em,minimum height=4em,draw] (encoder) at (0,0) {编码器};
\node[anchor=west,minimum width=16em,minimum height=4em] (decoder) at ([xshift=1.8cm]encoder.east) {}; \node[anchor=west,minimum width=16em,minimum height=4em] (decoder) at ([xshift=1.8cm]encoder.east) {};
\node[anchor=north,emb] (en2) at ([yshift=-1.3em]encoder.south) {${\mathbi e}(x_2)$}; \node[anchor=north,emb] (en2) at ([yshift=-1.3em]encoder.south) {${\mathbi e}(x_2)$};
...@@ -122,7 +122,7 @@ ...@@ -122,7 +122,7 @@
\draw [->,very thick,dotted] ([xshift=-0.3em]out2.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de3.west); \draw [->,very thick,dotted] ([xshift=-0.3em]out2.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de3.west);
\draw [->,very thick,dotted] ([xshift=-0.3em]out3.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de4.west); \draw [->,very thick,dotted] ([xshift=-0.3em]out3.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de4.west);
\draw [->,very thick,dotted] ([xshift=-0.3em]out4.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de5.west); \draw [->,very thick,dotted] ([xshift=-0.3em]out4.east) .. controls +(east:0.5) and +(west:0.5) ..([xshift=0em]de5.west);
\node[anchor=west, rounded corners=3pt, fill=blue!20, drop shadow, minimum width=16em,minimum height=4em,draw] (decoder2) at ([xshift=1.8cm]encoder.east) {解码器}; \node[anchor=west, rounded corners=3pt, thick,fill=blue!20, drop shadow, minimum width=16em,minimum height=4em,draw] (decoder2) at ([xshift=1.8cm]encoder.east) {解码器};
\draw[->,line width=1pt] (encoder.east) -- (decoder.west); \draw[->,line width=1pt] (encoder.east) -- (decoder.west);
\end{scope} \end{scope}
......
...@@ -154,7 +154,7 @@ ...@@ -154,7 +154,7 @@
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
\item 长度惩罚因子。用译文长度来归一化翻译概率是最常用的方法:对于源语言句子$\seq{x}$和译文句子$\seq{y}$,模型得分$\textrm{score}(\seq{x},\seq{y})$的值会随着译文$\seq{y}$ 的长度增大而减小。为了避免此现象,可以引入一个长度惩罚函数$\textrm{lp}(\seq{y})$,并定义模型得分如公式\eqref{eq:14-12}所示: \item {\small\sffamily\bfseries{长度惩罚因子}}。用译文长度来归一化翻译概率是最常用的方法:对于源语言句子$\seq{x}$和译文句子$\seq{y}$,模型得分$\textrm{score}(\seq{x},\seq{y})$的值会随着译文$\seq{y}$ 的长度增大而减小。为了避免此现象,可以引入一个长度惩罚函数$\textrm{lp}(\seq{y})$,并定义模型得分如公式\eqref{eq:14-12}所示:
\begin{eqnarray} \begin{eqnarray}
\textrm{score}(\seq{x},\seq{y}) &=& \frac{\log \funp{P}(\seq{y}\vert\seq{x})}{\textrm{lp}(\seq{y})} \textrm{score}(\seq{x},\seq{y}) &=& \frac{\log \funp{P}(\seq{y}\vert\seq{x})}{\textrm{lp}(\seq{y})}
...@@ -179,7 +179,7 @@ ...@@ -179,7 +179,7 @@
\end{table} \end{table}
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
\vspace{0.5em} \vspace{0.5em}
\item 译文长度范围约束。为了让译文的长度落在合理的范围内,神经机器翻译的推断也会设置一个译文长度约束\upcite{Vaswani2018Tensor2TensorFN,KleinOpenNMT}。令$[a,b]$表示一个长度范围,可以定义: \item {\small\sffamily\bfseries{译文长度范围约束}}。为了让译文的长度落在合理的范围内,神经机器翻译的推断也会设置一个译文长度约束\upcite{Vaswani2018Tensor2TensorFN,KleinOpenNMT}。令$[a,b]$表示一个长度范围,可以定义:
\begin{eqnarray} \begin{eqnarray}
a &=& \omega_{\textrm{low}}\cdot |\seq{x}| \label{eq:14-3}\\ a &=& \omega_{\textrm{low}}\cdot |\seq{x}| \label{eq:14-3}\\
...@@ -188,7 +188,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4} ...@@ -188,7 +188,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4}
\vspace{0.5em} \vspace{0.5em}
\noindent 其中,$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$分别表示译文长度的下限和上限,比如,很多系统中设置为$\omega_{\textrm{low}}=1/2$和$\omega_{\textrm{high}}=2$,表示译文至少有源语言句子一半长,最多有源语言句子两倍长。$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$的设置对推断效率影响很大,$\omega_{\textrm{high}}$可以被看作是一个推断的终止条件,最理想的情况是$\omega_{\textrm{high}} \cdot |\seq{x}|$恰巧就等于最佳译文的长度,这时没有浪费任何计算资源。反过来的一种情况,$\omega_{\textrm{high}} \cdot |\seq{x}|$远大于最佳译文的长度,这时很多计算都是无用的。为了找到长度预测的准确率和召回率之间的平衡,一般需要大量的实验最终确定$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$。当然,利用统计模型预测$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$也是非常值得探索的方向,比如基于繁衍率的模型\upcite{Gu2017NonAutoregressiveNM,Feng2016ImprovingAM} \noindent 其中,$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$分别表示译文长度的下限和上限,比如,很多系统中设置为$\omega_{\textrm{low}}=1/2$和$\omega_{\textrm{high}}=2$,表示译文至少有源语言句子一半长,最多有源语言句子两倍长。$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$的设置对推断效率影响很大,$\omega_{\textrm{high}}$可以被看作是一个推断的终止条件,最理想的情况是$\omega_{\textrm{high}} \cdot |\seq{x}|$恰巧就等于最佳译文的长度,这时没有浪费任何计算资源。反过来的一种情况,$\omega_{\textrm{high}} \cdot |\seq{x}|$远大于最佳译文的长度,这时很多计算都是无用的。为了找到长度预测的准确率和召回率之间的平衡,一般需要大量的实验最终确定$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$。当然,利用统计模型预测$\omega_{\textrm{low}}$和$\omega_{\textrm{high}}$也是非常值得探索的方向,比如基于繁衍率的模型\upcite{Gu2017NonAutoregressiveNM,Feng2016ImprovingAM}
\vspace{0.5em} \vspace{0.5em}
\item 覆盖度模型。译文长度过长或过短的问题,本质上对应着 {\small\sffamily\bfseries{过翻译}}\index{过翻译}(Over Translation)\index{Over Translation}和{\small\sffamily\bfseries{欠翻译}}\index{欠翻译}(Under Translation)\index{Under Translation}的问题\upcite{Yang2018OtemUtemOA}。这两种问题出现的原因主要在于:神经机器翻译没有对过翻译和欠翻译建模,即机器翻译覆盖度问题\upcite{TuModeling}。针对此问题,最常用的方法是在推断的过程中引入一个度量覆盖度的模型。比如,使用GNMT 覆盖度模型定义模型得分\upcite{Wu2016GooglesNM},如下: \item {\small\sffamily\bfseries{覆盖度模型}}。译文长度过长或过短的问题,本质上对应着 {\small\sffamily\bfseries{过翻译}}\index{过翻译}(Over Translation)\index{Over Translation}和{\small\sffamily\bfseries{欠翻译}}\index{欠翻译}(Under Translation)\index{Under Translation}的问题\upcite{Yang2018OtemUtemOA}。这两种问题出现的原因主要在于:神经机器翻译没有对过翻译和欠翻译建模,即机器翻译覆盖度问题\upcite{TuModeling}。针对此问题,最常用的方法是在推断的过程中引入一个度量覆盖度的模型。比如,使用GNMT 覆盖度模型定义模型得分\upcite{Wu2016GooglesNM},如下:
\begin{eqnarray} \begin{eqnarray}
\textrm{score}(\seq{x},\seq{y}) &=& \frac{\log \funp{P}(\seq{y} | \seq{x})}{\textrm{lp}(\seq{y})} + \textrm{cp}(\seq{x},\seq{y}) \label {eq:14-5}\\ \textrm{score}(\seq{x},\seq{y}) &=& \frac{\log \funp{P}(\seq{y} | \seq{x})}{\textrm{lp}(\seq{y})} + \textrm{cp}(\seq{x},\seq{y}) \label {eq:14-5}\\
\textrm{cp}(\seq{x},\seq{y}) &=& \beta \cdot \sum_{i=1}^{|\seq{x}|} \log(\textrm{min} (\sum_{j}^{|\seq{y}|} a_{ij} , 1)) \textrm{cp}(\seq{x},\seq{y}) &=& \beta \cdot \sum_{i=1}^{|\seq{x}|} \log(\textrm{min} (\sum_{j}^{|\seq{y}|} a_{ij} , 1))
......
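Reviewer note: the three inference heuristics touched in the hunks above (length penalty, translation length range, coverage penalty) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the repository. The function names, the `alpha` and `beta` defaults, and the small clamp before the logarithm are assumptions. The GNMT-style form of lp(y) is also an assumption, since the definition of lp is not visible in this diff. The `w_low=0.5` and `w_high=2.0` defaults follow the values quoted in the text.

```python
import math


def length_penalty(tgt_len, alpha=0.6):
    # Assumed GNMT-style penalty: lp(y) = ((5 + |y|) / (5 + 1)) ** alpha.
    # The diff does not show how lp is defined; alpha is a hypothetical default.
    return ((5.0 + tgt_len) / 6.0) ** alpha


def length_bounds(src_len, w_low=0.5, w_high=2.0):
    # a = w_low * |x|, b = w_high * |x| (eq. 14-3 and 14-4 in the diff).
    return w_low * src_len, w_high * src_len


def coverage_penalty(attention, beta=0.2):
    # cp(x, y) = beta * sum_i log(min(sum_j a_ij, 1)).
    # attention[i][j] holds a_ij: one row per source position i,
    # one column per target position j. beta is a hypothetical default.
    total = 0.0
    for row in attention:
        covered = min(sum(row), 1.0)
        # Clamp to avoid log(0) for a source word with no attention at all
        # (this clamp is an addition for numerical safety, not in the formula).
        total += math.log(max(covered, 1e-12))
    return beta * total


def score(log_prob, attention, tgt_len, alpha=0.6, beta=0.2):
    # score(x, y) = log P(y|x) / lp(y) + cp(x, y) (eq. 14-5 in the diff).
    return log_prob / length_penalty(tgt_len, alpha) + coverage_penalty(attention, beta)
```

With fully covered source words the coverage term is zero, so the score reduces to the length-normalized log-probability. Any source word whose total attention falls below 1 contributes a negative term, which penalizes under-translation.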
...@@ -2,10 +2,10 @@ ...@@ -2,10 +2,10 @@
\begin{tikzpicture} \begin{tikzpicture}
\begin{scope} \begin{scope}
\tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=3em,rounded corners=5pt,fill=ugreen!20] \tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=3em,rounded corners=5pt,fill=green!20]
\tikzstyle{tnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=3em,rounded corners=5pt,fill=red!20] \tikzstyle{tnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=3em,rounded corners=5pt,fill=red!20]
\tikzstyle{fnoder}=[rectangle,inner sep=0mm,minimum height=2.4em,minimum width=6.8em,draw,dashed,very thick,rounded corners=5pt,red!40] \tikzstyle{fnoder}=[rectangle,inner sep=0mm,minimum height=2.4em,minimum width=6.8em,draw,dashed,very thick,rounded corners=5pt,red!40]
\tikzstyle{fnodeg}=[rectangle,inner sep=0mm,minimum height=2.4em,minimum width=6.8em,draw,dashed,very thick,rounded corners=5pt,ugreen!40] \tikzstyle{fnodeg}=[rectangle,inner sep=0mm,minimum height=2.4em,minimum width=6.8em,draw,dashed,very thick,rounded corners=5pt,green!40]
\node [anchor=south west,fnodeg] (f1) at (0,0) {}; \node [anchor=south west,fnodeg] (f1) at (0,0) {};
\node [anchor=west,hnode] (n1) at ([xshift=0.2em,yshift=0em]f1.west) {$\mathbi{h}_1^{\textrm{up}}$}; \node [anchor=west,hnode] (n1) at ([xshift=0.2em,yshift=0em]f1.west) {$\mathbi{h}_1^{\textrm{up}}$};
...@@ -24,24 +24,24 @@ ...@@ -24,24 +24,24 @@
\node [anchor=east,hnode] (n8) at ([xshift=-0.2em,yshift=0em]f4.east) {$\cdots$}; \node [anchor=east,hnode] (n8) at ([xshift=-0.2em,yshift=0em]f4.east) {$\cdots$};
\node [anchor=west,fnodeg] (f5) at ([xshift=0.6em,yshift=0em]f4.east) {}; \node [anchor=west,fnodeg] (f5) at ([xshift=0.6em,yshift=0em]f4.east) {};
\node [anchor=west,hnode] (n9) at ([xshift=0.2em,yshift=0em]f5.west) {$\mathbi{h}_n^{\textrm{up}}$}; \node [anchor=west,hnode] (n9) at ([xshift=0.2em,yshift=0em]f5.west) {$\mathbi{h}_m^{\textrm{up}}$};
\node [anchor=east,hnode] (n10) at ([xshift=-0.2em,yshift=0em]f5.east) {$\mathbi{h}_n^{\textrm{down}}$}; \node [anchor=east,hnode] (n10) at ([xshift=-0.2em,yshift=0em]f5.east) {$\mathbi{h}_m^{\textrm{down}}$};
\node [anchor=south,fnoder] (f6) at ([xshift=3.7em,yshift=1em]f1.north) {}; \node [anchor=south,fnoder] (f6) at ([xshift=3.7em,yshift=1em]f1.north) {};
\node [anchor=west,tnode] (n11) at ([xshift=0.2em,yshift=0em]f6.west) {$\mathbi{h}_{n+1}^{\textrm{up}}$}; \node [anchor=west,tnode] (n11) at ([xshift=0.2em,yshift=0em]f6.west) {$\mathbi{h}_{m+1}^{\textrm{up}}$};
\node [anchor=east,tnode] (n12) at ([xshift=-0.2em,yshift=0em]f6.east) {$\mathbi{h}_{n+1}^{\textrm{down}}$}; \node [anchor=east,tnode] (n12) at ([xshift=-0.2em,yshift=0em]f6.east) {$\mathbi{h}_{m+1}^{\textrm{down}}$};
\node [anchor=south,fnoder] (f7) at ([xshift=3.7em,yshift=1em]f6.north) {}; \node [anchor=south,fnoder] (f7) at ([xshift=3.7em,yshift=1em]f6.north) {};
\node [anchor=west,tnode] (n13) at ([xshift=0.2em,yshift=0em]f7.west) {$\mathbi{h}_{n+2}^{\textrm{up}}$}; \node [anchor=west,tnode] (n13) at ([xshift=0.2em,yshift=0em]f7.west) {$\mathbi{h}_{m+2}^{\textrm{up}}$};
\node [anchor=east,tnode] (n14) at ([xshift=-0.2em,yshift=0em]f7.east) {$\mathbi{h}_{n+2}^{\textrm{down}}$}; \node [anchor=east,tnode] (n14) at ([xshift=-0.2em,yshift=0em]f7.east) {$\mathbi{h}_{m+2}^{\textrm{down}}$};
\node [anchor=south,fnoder] (f8) at ([xshift=3.7em,yshift=1em]f7.north) {}; \node [anchor=south,fnoder] (f8) at ([xshift=3.7em,yshift=1em]f7.north) {};
\node [anchor=west,tnode] (n15) at ([xshift=0.2em,yshift=0em]f8.west) {$\cdots$}; \node [anchor=west,tnode] (n15) at ([xshift=0.2em,yshift=0em]f8.west) {$\cdots$};
\node [anchor=east,tnode] (n16) at ([xshift=-0.2em,yshift=0em]f8.east) {$\cdots$}; \node [anchor=east,tnode] (n16) at ([xshift=-0.2em,yshift=0em]f8.east) {$\cdots$};
\node [anchor=south,fnoder] (f9) at ([xshift=3.7em,yshift=1em]f8.north) {}; \node [anchor=south,fnoder] (f9) at ([xshift=3.7em,yshift=1em]f8.north) {};
\node [anchor=west,tnode] (n17) at ([xshift=0.2em,yshift=0em]f9.west) {$\mathbi{h}_{2n-1}^{\textrm{up}}$}; \node [anchor=west,tnode] (n17) at ([xshift=0.2em,yshift=0em]f9.west) {$\mathbi{h}_{2m-1}^{\textrm{up}}$};
\node [anchor=east,tnode] (n18) at ([xshift=-0.2em,yshift=0em]f9.east) {$\mathbi{h}_{2n-1}^{\textrm{down}}$}; \node [anchor=east,tnode] (n18) at ([xshift=-0.2em,yshift=0em]f9.east) {$\mathbi{h}_{2m-1}^{\textrm{down}}$};
\draw [->,thick] ([xshift=0em,yshift=0em]n11.east) -- ([xshift=0em,yshift=0em]n12.west); \draw [->,thick] ([xshift=0em,yshift=0em]n11.east) -- ([xshift=0em,yshift=0em]n12.west);
......
...@@ -39,42 +39,49 @@ ...@@ -39,42 +39,49 @@
\end{scope} \end{scope}
%right %right
\begin{scope}[xshift=14em] \begin{scope}[xshift=13em]
\foreach \x/\d in {1/2em, 2/8em, 3/14em}
\node[unit,fill=yellow!20] at (0,\d) (ln_\x) {层正则化};
\foreach \x/\d in {1/6em, 2/12em, 3/22em}
\node[draw,circle,minimum size=1em,inner sep=1pt] at (0,\d) (add_\x) {\scriptsize\bfnew{+}};
\node[unit,fill=red!20] at (0,16em) (conv_4) {卷积$1 \times 1$:2048}; \foreach \x/\d in {1/2em, 2/8em, 3/16em}
\node[unit,fill=red!20] at (0,20em) (conv_5) {卷积$1 \times 1$:512}; \node[unit,fill=yellow!20] at (0,\d) (ln_\x) {层正则化};
\node[unit,fill=blue!20] at (0,18em) (relu_3) {RELU}; \foreach \x/\d in {1/6em, 2/14em, 3/20em}
\node[unit,fill=cyan!20] at (0,4em) (conv_3) {Sep卷积$9 \times 1$:256}; \node[draw,circle,minimum size=1em,inner sep=1pt] at (0,\d) (add_\x) {\scriptsize\bfnew{+}};
\node[unit,fill=green!20] at (0,10em) (sa_1) {8头自注意力:512};
\node[unit,fill=red!20] at (0,4em) (glu_1) {门控线性单元:512};
\node[unit,fill=red!20] at (-3em,10em) (conv_1) {卷积$1 \times 1$:2048};
\node[unit,fill=cyan!20] at (3em,10em) (conv_2) {卷积$3 \times 1$:256};
\node[unit,fill=blue!20] at (-3em,12em) (relu_1) {RELU};
\node[unit,fill=blue!20] at (3em,12em) (relu_2) {RELU};
\node[unit,fill=cyan!20] at (0em,18em) (conv_3) {Sep卷积$9 \times 1$:256};
\draw[->,thick] ([yshift=-1.4em]ln_1.-90) -- ([yshift=-0.1em]ln_1.-90); \draw[->,thick] ([yshift=-1.4em]ln_1.-90) -- ([yshift=-0.1em]ln_1.-90);
\draw[->,thick] ([yshift=0.1em]ln_1.90) -- ([yshift=-0.1em]conv_3.-90); \draw[->,thick] ([yshift=0.1em]ln_1.90) -- ([yshift=-0.1em]glu_1.-90);
\draw[->,thick] ([yshift=0.1em]conv_3.90) -- ([yshift=-0.1em]add_1.-90); \draw[->,thick] ([yshift=0.1em]glu_1.90) -- ([yshift=-0.1em]add_1.-90);
\draw[->,thick] ([yshift=0.1em]add_1.90) -- ([yshift=-0.1em]ln_2.-90); \draw[->,thick] ([yshift=0.1em]add_1.90) -- ([yshift=-0.1em]ln_2.-90);
\draw[->,thick] ([yshift=0.1em]ln_2.90) -- ([yshift=-0.1em]sa_1.-90); \draw[->,thick] ([yshift=0.1em]ln_2.135) -- ([yshift=-0.1em]conv_1.-90);
\draw[->,thick] ([yshift=0.1em]sa_1.90) -- ([yshift=-0.1em]add_2.-90); \draw[->,thick] ([yshift=0.1em]ln_2.45) -- ([yshift=-0.1em]conv_2.-90);
\draw[->,thick] ([yshift=0.1em]conv_1.90) -- ([yshift=-0.1em]relu_1.-90);
\draw[->,thick] ([yshift=0.1em]conv_2.90) -- ([yshift=-0.1em]relu_2.-90);
\draw[->,thick] ([yshift=0.1em]relu_1.90) -- ([yshift=-0.1em]add_2.-135);
\draw[->,thick] ([yshift=0.1em]relu_2.90) -- ([yshift=-0.1em]add_2.-45);
\draw[->,thick] ([yshift=0.1em]add_2.90) -- ([yshift=-0.1em]ln_3.-90); \draw[->,thick] ([yshift=0.1em]add_2.90) -- ([yshift=-0.1em]ln_3.-90);
\draw[->,thick] ([yshift=0.1em]ln_3.90) -- ([yshift=-0.1em]conv_4.-90); \draw[->,thick] ([yshift=0.1em]ln_3.90) -- ([yshift=-0.1em]conv_3.-90);
\draw[->,thick] ([yshift=0.1em]conv_4.90) -- ([yshift=-0.1em]relu_3.-90); \draw[->,thick] ([yshift=0.1em]conv_3.90) -- ([yshift=-0.1em]add_3.-90);
\draw[->,thick] ([yshift=0.1em]relu_3.90) -- ([yshift=-0.1em]conv_5.-90);
\draw[->,thick] ([yshift=0.1em]conv_5.90) -- ([yshift=-0.1em]add_3.-90);
\draw[->,thick] ([yshift=0.1em]add_3.90) -- ([yshift=1em]add_3.90); \draw[->,thick] ([yshift=0.1em]add_3.90) -- ([yshift=1em]add_3.90);
\draw[->,thick] ([yshift=-0.8em]ln_1.-90) .. controls ([xshift=5em,yshift=-0.8em]ln_1.-90) and ([xshift=5em]add_1.0) .. (add_1.0); \draw[->,thick] ([yshift=-0.8em]ln_1.-90) .. controls ([xshift=5em,yshift=-0.8em]ln_1.-90) and ([xshift=5em]add_1.0) .. (add_1.0);
\draw[->,thick] (add_1.0) .. controls ([xshift=5em]add_1.0) and ([xshift=5em]add_2.0) .. (add_2.0); \draw[->,thick] (add_1.0) .. controls ([xshift=8em]add_1.0) and ([xshift=8em]add_3.0) .. (add_3.0);
\draw[->,thick] (add_2.0) .. controls ([xshift=5em]add_2.0) and ([xshift=5em]add_3.0) .. (add_3.0);
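\node[font=\scriptsize,align=center] at (0em, -1.5em){(b) Structure of several blocks in the Transformer \\ encoder after optimization by architecture search}; \node[font=\scriptsize,align=center] at (0em, -1.5em){(b) Structure of several blocks in the Transformer \\ encoder after optimization by architecture search};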
\node[minimum size=0.8em,inner sep=0pt,rounded corners=1pt,draw,fill=blue!20] (act) at (5.5em, 20em){}; \node[minimum size=0.8em,inner sep=0pt,rounded corners=1pt,draw,fill=blue!20] (act) at (8em, 20em){};
\node[anchor=west,font=\footnotesize] at ([xshift=0.1em]act.east){Activation function}; \node[anchor=west,font=\footnotesize] at ([xshift=0.1em]act.east){Activation function};
\node[anchor=north,minimum size=0.8em,inner sep=0pt,rounded corners=1pt,draw,fill=yellow!20] (nor) at ([yshift=-0.6em]act.south){}; \node[anchor=north,minimum size=0.8em,inner sep=0pt,rounded corners=1pt,draw,fill=yellow!20] (nor) at ([yshift=-0.6em]act.south){};
\node[anchor=west,font=\footnotesize] at ([xshift=0.1em]nor.east){Layer Norm}; \node[anchor=west,font=\footnotesize] at ([xshift=0.1em]nor.east){Layer Norm};
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
\begin{tikzpicture} \begin{tikzpicture}
\begin{scope} \begin{scope}
\tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=4.5em,rounded corners=5pt,fill=ugreen!30] \tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=4.5em,rounded corners=5pt,fill=green!30]
\tikzstyle{tnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=4.5em,rounded corners=5pt,fill=red!30] \tikzstyle{tnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=4.5em,rounded corners=5pt,fill=red!30]
\tikzstyle{wnode}=[inner sep=0mm,minimum height=1.4em,minimum width=4.4em] \tikzstyle{wnode}=[inner sep=0mm,minimum height=1.4em,minimum width=4.4em]
...@@ -10,12 +10,12 @@ ...@@ -10,12 +10,12 @@
\node [anchor=west,hnode] (n2) at ([xshift=1em,yshift=0em]n1.east) {$\mathbi{h}_2$}; \node [anchor=west,hnode] (n2) at ([xshift=1em,yshift=0em]n1.east) {$\mathbi{h}_2$};
\node [anchor=west,hnode] (n3) at ([xshift=1em,yshift=0em]n2.east) {$\mathbi{h}_3$}; \node [anchor=west,hnode] (n3) at ([xshift=1em,yshift=0em]n2.east) {$\mathbi{h}_3$};
\node [anchor=west,hnode] (n4) at ([xshift=1em,yshift=0em]n3.east) {$\cdots$}; \node [anchor=west,hnode] (n4) at ([xshift=1em,yshift=0em]n3.east) {$\cdots$};
\node [anchor=west,hnode] (n5) at ([xshift=1em,yshift=0em]n4.east) {$\mathbi{h}_n$}; \node [anchor=west,hnode] (n5) at ([xshift=1em,yshift=0em]n4.east) {$\mathbi{h}_m$};
\node [anchor=south,tnode] (t1) at ([xshift=2.8em,yshift=1em]n1.north) {$\mathbi{h}_{n+1}$}; \node [anchor=south,tnode] (t1) at ([xshift=2.8em,yshift=1em]n1.north) {$\mathbi{h}_{m+1}$};
\node [anchor=south,tnode] (t2) at ([xshift=2.8em,yshift=1em]t1.north) {$\mathbi{h}_{n+2}$}; \node [anchor=south,tnode] (t2) at ([xshift=2.8em,yshift=1em]t1.north) {$\mathbi{h}_{m+2}$};
\node [anchor=south,tnode] (t3) at ([xshift=2.8em,yshift=1em]t2.north) {$\cdots$}; \node [anchor=south,tnode] (t3) at ([xshift=2.8em,yshift=1em]t2.north) {$\cdots$};
\node [anchor=south,tnode] (t4) at ([xshift=2.8em,yshift=1em]t3.north) {$\mathbi{h}_{2n-1}$}; \node [anchor=south,tnode] (t4) at ([xshift=2.8em,yshift=1em]t3.north) {$\mathbi{h}_{2m-1}$};
\draw [->,thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west); \draw [->,thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n2.east) -- ([xshift=0em,yshift=0em]n3.west); \draw [->,thick] ([xshift=0em,yshift=0em]n2.east) -- ([xshift=0em,yshift=0em]n3.west);
......
...@@ -7,9 +7,9 @@ ...@@ -7,9 +7,9 @@
\tikzstyle{vlnode}=[rectangle,inner sep=0mm,minimum height=1em,minimum width=5em,rounded corners=2pt,draw] \tikzstyle{vlnode}=[rectangle,inner sep=0mm,minimum height=1em,minimum width=5em,rounded corners=2pt,draw]
\node [anchor=west,lnode] (n1) at (0, 0) {$\mathbi{g}^3$}; \node [anchor=west,lnode] (n1) at (0, 0) {$\mathbi{h}_3$};
\node [anchor=north west,lnode] (n2) at ([xshift=0em,yshift=-0.5em]n1.south west) {$\mathbi{g}^2$}; \node [anchor=north west,lnode] (n2) at ([xshift=0em,yshift=-0.5em]n1.south west) {$\mathbi{h}_2$};
\node [anchor=north west,lnode] (n3) at ([xshift=0em,yshift=-0.5em]n2.south west) {$\mathbi{g}^1$}; \node [anchor=north west,lnode] (n3) at ([xshift=0em,yshift=-0.5em]n2.south west) {$\mathbi{h}_1$};
\node [anchor=south] (d1) at ([xshift=0em,yshift=0.2em]n1.north) {1D}; \node [anchor=south] (d1) at ([xshift=0em,yshift=0.2em]n1.north) {1D};
......
...@@ -19,7 +19,7 @@ ...@@ -19,7 +19,7 @@
\node [anchor=west,encnode,draw=red!60!black!80,fill=red!20] (n7) at ([xshift=1em,yshift=0em]n6.east) {$\mathbi{h}_{L-1}$}; \node [anchor=west,encnode,draw=red!60!black!80,fill=red!20] (n7) at ([xshift=1em,yshift=0em]n6.east) {$\mathbi{h}_{L-1}$};
\node [anchor=north,rectangle,draw=teal!80, inner sep=0mm,minimum height=2em,minimum width=8em,fill=teal!17,rounded corners=5pt,thick] (n8) at ([xshift=3em,yshift=-1.2em]n4.south) {Weight aggregation $\mathbi{g}$}; \node [anchor=north,rectangle,draw=teal!80, inner sep=0mm,minimum height=2em,minimum width=8em,fill=teal!17,rounded corners=5pt,thick] (n8) at ([xshift=3em,yshift=-1.5em]n4.south) {Weight aggregation $\mathbi{g}$};
...@@ -61,11 +61,11 @@ ...@@ -61,11 +61,11 @@
\draw [->,thick] ([xshift=0em,yshift=0em]n5.east) -- ([xshift=0em,yshift=0em]n6.west); \draw [->,thick] ([xshift=0em,yshift=0em]n5.east) -- ([xshift=0em,yshift=0em]n6.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n6.east) -- ([xshift=0em,yshift=0em]n7.west); \draw [->,thick] ([xshift=0em,yshift=0em]n6.east) -- ([xshift=0em,yshift=0em]n7.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n2.south) -- ([xshift=0em,yshift=0em]n8.north); \draw [->,thick] ([xshift=0em,yshift=0em]n2.south)..controls +(south:1.5em) and +(north:1.5em)..([xshift=0em,yshift=0.1em]n8.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n3.south) -- ([xshift=0em,yshift=0em]n8.north); \draw [->,thick] ([xshift=0em,yshift=0em]n3.south)..controls +(south:0.9em) and +(north:1.6em)..([xshift=0em,yshift=0.1em]n8.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n4.south) -- ([xshift=0em,yshift=0em]n8.north); \draw [->,thick] ([xshift=0em,yshift=0em]n4.south)..controls +(south:0.8em) and +(north:1.4em)..([xshift=0em,yshift=0.1em]n8.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n5.south) -- ([xshift=0em,yshift=0em]n8.north); \draw [->,thick] ([xshift=0em,yshift=0em]n5.south)..controls +(south:0.8em) and +(north:1.4em)..([xshift=0em,yshift=0.1em]n8.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n7.south) -- ([xshift=0em,yshift=0em]n8.north); \draw [->,thick] ([xshift=0em,yshift=0em]n7.south)..controls +(south:1.5em) and +(north:1.5em)..([xshift=0em,yshift=0.1em]n8.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n10.east) -- ([xshift=0em,yshift=0em]n11.west); \draw [->,thick] ([xshift=0em,yshift=0em]n10.east) -- ([xshift=0em,yshift=0em]n11.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n11.east) -- ([xshift=0em,yshift=0em]n12.west); \draw [->,thick] ([xshift=0em,yshift=0em]n11.east) -- ([xshift=0em,yshift=0em]n12.west);
...@@ -74,10 +74,10 @@ ...@@ -74,10 +74,10 @@
\draw [->,thick] ([xshift=0em,yshift=0em]n14.east) -- ([xshift=0em,yshift=0em]n15.west); \draw [->,thick] ([xshift=0em,yshift=0em]n14.east) -- ([xshift=0em,yshift=0em]n15.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n15.east) -- ([xshift=0em,yshift=0em]n16.west); \draw [->,thick] ([xshift=0em,yshift=0em]n15.east) -- ([xshift=0em,yshift=0em]n16.west);
\draw [->,thick] ([xshift=0em,yshift=0em]n8.south) -- ([xshift=0em,yshift=0em]n11.north); \draw [->,thick] ([xshift=0em,yshift=0em]n8.south)..controls +(south:1.2em) and +(north:1.5em)..([xshift=0em,yshift=0em]n11.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n8.south) -- ([xshift=0em,yshift=0em]n12.north); \draw [->,thick] ([xshift=0em,yshift=0em]n8.south)..controls +(south:1.3em) and +(north:1.2em)..([xshift=0em,yshift=0em]n12.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n8.south) -- ([xshift=0em,yshift=0em]n13.north); \draw [->,thick] ([xshift=0em,yshift=0em]n8.south)..controls +(south:1.3em) and +(north:1.2em)..([xshift=0em,yshift=0em]n13.north);
\draw [->,thick] ([xshift=0em,yshift=0em]n8.south) -- ([xshift=0em,yshift=0em]n15.north); \draw [->,thick] ([xshift=0em,yshift=0em]n8.south)..controls +(south:1.5em) and +(north:1.5em)..([xshift=0em,yshift=0em]n15.north);
......
...@@ -3,9 +3,9 @@ ...@@ -3,9 +3,9 @@
\begin{center} \begin{center}
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{manode}=[rectangle,inner sep=0mm,minimum height=4em,minimum width=4em,rounded corners=5pt,thick,draw,fill=blue!20] \tikzstyle{manode}=[rectangle,inner sep=0mm,minimum height=4em,minimum width=4em,rounded corners=5pt,thick,draw,fill=blue!15]
\tikzstyle{ffnnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=6em,rounded corners=5pt,thick,fill=red!20,draw] \tikzstyle{ffnnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=6em,rounded corners=5pt,thick,fill=red!15,draw]
\tikzstyle{ebnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=10em,rounded corners=5pt,thick,fill=green!20,draw] \tikzstyle{ebnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=10em,rounded corners=5pt,thick,fill=ugreen!15,draw]
\begin{scope}[] \begin{scope}[]
...@@ -33,6 +33,11 @@ ...@@ -33,6 +33,11 @@
\draw[->,thick,rectangle,rounded corners=5pt] ([xshift=0em,yshift=0.5em]f1.north)--([xshift=-6em,yshift=0.5em]f1.north)--([xshift=-5.45em,yshift=0em]add1.west)--([xshift=0em,yshift=0em]add1.west); \draw[->,thick,rectangle,rounded corners=5pt] ([xshift=0em,yshift=0.5em]f1.north)--([xshift=-6em,yshift=0.5em]f1.north)--([xshift=-5.45em,yshift=0em]add1.west)--([xshift=0em,yshift=0em]add1.west);
\node [anchor=north,inner sep=0mm,minimum height=1.5em] (ip) at ([xshift=0em,yshift=-1em]f1.south){input};
\node [anchor=south,inner sep=0mm,minimum height=1.5em] (op) at ([xshift=0em,yshift=1em]f2.north){output};
\draw[->,thick] ([xshift=0em,yshift=0em]ip.north)--([xshift=0em,yshift=0em]f1.south);
\draw[->,thick] ([xshift=0em,yshift=0em]f2.north)--([xshift=0em,yshift=0em]op.south);
\end{scope} \end{scope}
\end{tikzpicture} \end{tikzpicture}
\end{center} \end{center}
\ No newline at end of file
...@@ -5,9 +5,9 @@ ...@@ -5,9 +5,9 @@
\begin{scope}[scale=0.36] \begin{scope}[scale=0.36]
\tikzstyle{every node}=[scale=0.36] \tikzstyle{every node}=[scale=0.36]
\node[draw=ublue,very thick,drop shadow,fill=white,minimum width=40em,minimum height=25em] (rec3) at (2.25,0){}; \node[draw=ublue,very thick,rounded corners=3pt,drop shadow,fill=white,minimum width=40em,minimum height=25em] (rec3) at (2.25,0){};
\node[draw=ublue,very thick,drop shadow,fill=white,minimum width=22em,minimum height=25em] (rec2) at (-12.4,0){}; \node[draw=ublue,very thick,rounded corners=3pt,drop shadow,fill=white,minimum width=22em,minimum height=25em] (rec2) at (-12.4,0){};
\node[draw=ublue,very thick,drop shadow,fill=white,minimum width=24em,minimum height=25em] (rec1) at (-24,0){}; \node[draw=ublue,very thick,rounded corners=3pt,drop shadow,fill=white,minimum width=24em,minimum height=25em] (rec1) at (-24,0){};
%left %left
\node[text=ublue] (label1) at (-26.4,4){\Huge\bfnew{Architecture space}}; \node[text=ublue] (label1) at (-26.4,4){\Huge\bfnew{Architecture space}};
......
\begin{tikzpicture} \begin{tikzpicture}
\begin{scope} \begin{scope}
\tikzstyle{cirnode}=[circle,minimum size=3.7em,draw] \tikzstyle{cirnode}=[circle,minimum size=3em,font=\footnotesize,draw]
\tikzstyle{recnode}=[rectangle,rounded corners=2pt,inner sep=0mm,minimum height=1.5em,minimum width=4em,draw] \tikzstyle{recnode}=[rectangle,rounded corners=2pt,inner sep=0mm,minimum height=1.8em,minimum width=6em]
\node [anchor=west,cirnode] (n1) at (0, 0) {$\mathbi{h}_{i-2}^l$}; \node [anchor=west,cirnode] (n1) at (0, 0) {$\mathbi{h}_{i-2}^l$};
\node [anchor=west,cirnode] (n2) at ([xshift=1em,yshift=0em]n1.east) {$\mathbi{h}_{i-1}^l$}; \node [anchor=west,cirnode] (n2) at ([xshift=1.2em,yshift=0em]n1.east) {$\mathbi{h}_{i-1}^l$};
\node [anchor=west,cirnode] (n3) at ([xshift=1em,yshift=0em]n2.east) {$\mathbi{h}_{i}^l$}; \node [anchor=west,cirnode] (n3) at ([xshift=1.2em,yshift=0em]n2.east) {$\mathbi{h}_{i}^l$};
\node [anchor=west,cirnode] (n4) at ([xshift=1em,yshift=0em]n3.east) {$\mathbi{h}_{i+1}^l$}; \node [anchor=west,cirnode] (n4) at ([xshift=1.2em,yshift=0em]n3.east) {$\mathbi{h}_{i+1}^l$};
\node [anchor=west,cirnode] (n5) at ([xshift=1em,yshift=0em]n4.east) {$\mathbi{h}_{i+2}^l$}; \node [anchor=west,cirnode] (n5) at ([xshift=1.2em,yshift=0em]n4.east) {$\mathbi{h}_{i+2}^l$};
\node [anchor=center,blue!30,minimum height=4.2em,minimum width=4.5em,very thick,draw] (c1) at ([xshift=0em,yshift=0em]n3.center) {}; \begin{pgfonlayer}{background}
\node [anchor=center,ugreen!30,minimum height=4.9em,minimum width=14.5em,very thick,draw] (c2) at ([xshift=0em,yshift=0em]n3.center) {}; \node [anchor=center,red!30,minimum height=4.5em,minimum width=21em,very thick,draw] (c3) at ([xshift=0em,yshift=0em]n3.center) {};
\node [anchor=center,red!30,minimum height=5.6em,minimum width=24.5em,very thick,draw] (c3) at ([xshift=0em,yshift=0em]n3.center) {}; \node [anchor=center,ugreen!30,minimum height=4em,minimum width=12.5em,very thick,draw] (c2) at ([xshift=0em,yshift=0em]n3.center) {};
\node [anchor=center,orange!30,minimum height=3.5em,minimum width=3.6em,very thick,draw] (c1) at ([xshift=0em,yshift=0em]n3.center) {};
\end{pgfonlayer}
\node [anchor=south,recnode] (r1) at ([xshift=0em,yshift=2.5em]n2.north) {$\textrm{head}_1$}; \node [anchor=south,recnode,fill=red!20] (r1) at ([xshift=-3.5em,yshift=2.5em]n2.north) {$\textrm{head}_1$};
\node [anchor=south,recnode] (r2) at ([xshift=0em,yshift=2.5em]n3.north) {$\textrm{head}_2$}; \node [anchor=south,recnode,fill=orange!20] (r2) at ([xshift=0em,yshift=2.5em]n3.north) {$\textrm{head}_2$};
\node [anchor=south,recnode] (r3) at ([xshift=0em,yshift=2.5em]n4.north) {$\textrm{head}_3$}; \node [anchor=south,recnode,fill=ugreen!20] (r3) at ([xshift=3.5em,yshift=2.5em]n4.north) {$\textrm{head}_3$};
\node [anchor=south,cirnode] (n6) at ([xshift=0em,yshift=1em]r2.north) {$\mathbi{h}_{i}^{l+1}$}; \node [anchor=south,cirnode] (n6) at ([xshift=0em,yshift=1em]r2.north) {$\mathbi{h}_{i}^{l+1}$};
\draw [->,very thick,blue!30] ([xshift=0em,yshift=0em]c1.north) -- ([xshift=0em,yshift=0em]r2.south); \draw [->,very thick,orange!30] ([xshift=0em,yshift=0em]c1.north) -- ([xshift=0em,yshift=0em]r2.south);
\draw [->,very thick,ugreen!30] ([xshift=4.73em,yshift=0em]c2.north) -- ([xshift=0em,yshift=0em]r3.south); \draw [->,very thick,ugreen!30] ([xshift=3em,yshift=0em]c2.north)..controls +(north:1.5em) and +(south:1.5em)..([xshift=0em,yshift=0em]r3.south);
\draw [->,very thick,red!30] ([xshift=-4.73em,yshift=0em]c3.north) -- ([xshift=0em,yshift=0em]r1.south); \draw [->,very thick,red!30] ([xshift=-3em,yshift=0em]c3.north)..controls +(north:1.5em) and +(south:1.5em)..([xshift=0em,yshift=0em]r1.south);
\draw [->] ([xshift=0em,yshift=0em]r1.north) -- ([xshift=0em,yshift=0em]n6.south west); \draw [->] ([xshift=0em,yshift=0em]r1.north) -- ([xshift=0em,yshift=0em]n6.south west);
\draw [->] ([xshift=0em,yshift=0em]r2.north) -- ([xshift=0em,yshift=0em]n6.south); \draw [->] ([xshift=0em,yshift=0em]r2.north) -- ([xshift=0em,yshift=0em]n6.south);
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
\begin{tikzpicture} \begin{tikzpicture}
\begin{scope} \begin{scope}
\tikzstyle{enode}=[rectangle,inner sep=0mm,minimum height=5em,minimum width=5em,rounded corners=7pt,fill=ugreen!30] \tikzstyle{enode}=[rectangle,inner sep=0mm,minimum height=5em,minimum width=5em,rounded corners=7pt,fill=green!30]
\tikzstyle{dnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=6.5em,rounded corners=5pt,fill=red!30] \tikzstyle{dnode}=[rectangle,inner sep=0mm,minimum height=2em,minimum width=6.5em,rounded corners=5pt,fill=red!30]
\tikzstyle{wnode}=[inner sep=0mm,minimum height=2em,minimum width=4em] \tikzstyle{wnode}=[inner sep=0mm,minimum height=2em,minimum width=4em]
......
...@@ -6,7 +6,7 @@ ...@@ -6,7 +6,7 @@
\node[node,fill=red!20] (n1) at (0,0){\scriptsize\bfnew{Supernet}\\ [1ex] Architecture parameters \\[0.4ex] Network parameters}; \node[node,fill=red!20] (n1) at (0,0){\scriptsize\bfnew{Supernet}\\ [1ex] Architecture parameters \\[0.4ex] Network parameters};
\node[anchor=west,node,fill=yellow!20] (n2) at ([xshift=4em]n1.east){\scriptsize\bfnew{Optimized supernet}\\ [1ex]Model {\color{red}architecture parameters} (optimized) \\ [0.4ex]Network parameters (optimized)}; \node[anchor=west,node,fill=yellow!20] (n2) at ([xshift=4em]n1.east){\scriptsize\bfnew{Optimized supernet}\\ [1ex]Model {\color{red}architecture parameters} (optimized) \\ [0.4ex]Network parameters (optimized)};
\node[anchor=west,node,fill=blue!20] (n3) at ([xshift=6em]n2.east){\scriptsize\bfnew{Discovered model architecture}}; \node[anchor=west,node,fill=green!20] (n3) at ([xshift=6em]n2.east){\scriptsize\bfnew{Discovered model architecture}};
\draw[-latex,thick] (n1.0) -- node[above,align=center,font=\scriptsize]{Optimized\\supernet}(n2.180); \draw[-latex,thick] (n1.0) -- node[above,align=center,font=\scriptsize]{Optimized\\supernet}(n2.180);
\draw[-latex,thick] (n2.0) -- node[above,align=center,font=\scriptsize]{Discretize architecture\\from architecture parameters}(n3.180); \draw[-latex,thick] (n2.0) -- node[above,align=center,font=\scriptsize]{Discretize architecture\\from architecture parameters}(n3.180);
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
\tikzstyle{node}=[minimum height=2em,minimum width=5em,draw,rounded corners=2pt,thick,drop shadow] \tikzstyle{node}=[minimum height=2em,minimum width=5em,draw,rounded corners=2pt,thick,drop shadow]
\node[node,fill=red!20] (n1) at (0,0){\small\bfnew{Environment}}; \node[node,fill=red!20] (n1) at (0,0){\small\bfnew{Environment}};
\node[anchor=south,node,fill=blue!20] (n2) at ([yshift=5em]n1.north){\small\bfnew{Agent}}; \node[anchor=south,node,fill=green!20] (n2) at ([yshift=5em]n1.north){\small\bfnew{Agent}};
\node[anchor=north,font=\footnotesize] at ([yshift=-0.2em]n1.south){(task the architecture is applied to)}; \node[anchor=north,font=\footnotesize] at ([yshift=-0.2em]n1.south){(task the architecture is applied to)};
\node[anchor=south,font=\footnotesize] at ([yshift=0.2em]n2.north){(architecture generator)}; \node[anchor=south,font=\footnotesize] at ([yshift=0.2em]n2.north){(architecture generator)};
......
...@@ -3,13 +3,13 @@ ...@@ -3,13 +3,13 @@
\begin{center} \begin{center}
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{wrnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=blue!30] \tikzstyle{wrnode}=[rectangle,inner sep=0mm,minimum height=1.6em,minimum width=3em,rounded corners=5pt,fill=blue!30]
\tikzstyle{srnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=yellow!30] \tikzstyle{srnode}=[rectangle,inner sep=0mm,minimum height=1.6em,minimum width=3em,rounded corners=5pt,fill=orange!30]
\tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em] \tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em]
\tikzstyle{wnode}=[inner sep=0mm,minimum height=1.8em] \tikzstyle{wnode}=[inner sep=0mm,minimum height=1.6em]
{\small {\small
\begin{scope}[] \begin{scope}[scale=1]
\tikzstyle{every node}=[scale=1]
\node [anchor=west,wrnode] (wr1) at (0,0) {$\mathbi{h}_{w_1}$}; \node [anchor=west,wrnode] (wr1) at (0,0) {$\mathbi{h}_{w_1}$};
\node [anchor=west,wrnode] (wr2) at ([xshift=1em,yshift=0em]wr1.east) {$\mathbi{h}_{w_2}$}; \node [anchor=west,wrnode] (wr2) at ([xshift=1em,yshift=0em]wr1.east) {$\mathbi{h}_{w_2}$};
\node [anchor=west,wrnode] (wr3) at ([xshift=1em,yshift=0em]wr2.east) {$\mathbi{h}_{w_3}$}; \node [anchor=west,wrnode] (wr3) at ([xshift=1em,yshift=0em]wr2.east) {$\mathbi{h}_{w_3}$};
...@@ -22,27 +22,27 @@ ...@@ -22,27 +22,27 @@
\node [anchor=west,dotnode] (dot3) at ([xshift=0.8em,yshift=0em]sr3.east) {$\cdots$}; \node [anchor=west,dotnode] (dot3) at ([xshift=0.8em,yshift=0em]sr3.east) {$\cdots$};
\node [anchor=west,srnode] (sr4) at ([xshift=0.8em,yshift=0em]dot3.east) {$\mathbi{h}_{l_7}$}; \node [anchor=west,srnode] (sr4) at ([xshift=0.8em,yshift=0em]dot3.east) {$\mathbi{h}_{l_7}$};
\node [anchor=north,wnode,font=\footnotesize] (w1) at ([xshift=0em,yshift=-1em]wr1.south) {$w_1$\ :\ I}; \node [anchor=north,wnode,font=\footnotesize] (w1) at ([xshift=0em,yshift=-0.7em]wr1.south) {$w_1$\ :\ I};
\node [anchor=north,wnode,font=\footnotesize] (w2) at ([xshift=0em,yshift=-1em]wr2.south) {$w_2$\ :\ love}; \node [anchor=north,wnode,font=\footnotesize] (w2) at ([xshift=0em,yshift=-0.7em]wr2.south) {$w_2$\ :\ love};
\node [anchor=north,wnode,font=\footnotesize] (w3) at ([xshift=0em,yshift=-1em]wr3.south) {$w_3$\ :\ dogs}; \node [anchor=north,wnode,font=\footnotesize] (w3) at ([xshift=0em,yshift=-0.7em]wr3.south) {$w_3$\ :\ dogs};
\node [anchor=north,wnode,font=\footnotesize] (w4) at ([xshift=0em,yshift=-1em]sr1.south) {$l_1$\ :\ S}; \node [anchor=north,wnode,font=\footnotesize] (w4) at ([xshift=0em,yshift=-0.7em]sr1.south) {$l_1$\ :\ S};
\node [anchor=north,dotnode] (dot4) at ([xshift=0em,yshift=-2.4em]dot1.south) {$\cdots$}; \node [anchor=north,dotnode] (dot4) at ([xshift=0em,yshift=-2em]dot1.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w5) at ([xshift=0em,yshift=-1em]sr2.south) {$l_3$\ :\ PRN}; \node [anchor=north,wnode,font=\footnotesize] (w5) at ([xshift=0em,yshift=-0.7em]sr2.south) {$l_3$\ :\ PRN};
\node [anchor=north,dotnode] (dot5) at ([xshift=0em,yshift=-2.2em]dot2.south) {$\cdots$}; \node [anchor=north,dotnode] (dot5) at ([xshift=0em,yshift=-2em]dot2.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w6) at ([xshift=0em,yshift=-1em]sr3.south) {$l_5$\ :\ VBP}; \node [anchor=north,wnode,font=\footnotesize] (w6) at ([xshift=0em,yshift=-0.7em]sr3.south) {$l_5$\ :\ VBP};
\node [anchor=north,dotnode] (dot6) at ([xshift=0em,yshift=-2.3em]dot3.south) {$\cdots$}; \node [anchor=north,dotnode] (dot6) at ([xshift=0em,yshift=-2em]dot3.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w7) at ([xshift=0em,yshift=-1em]sr4.south) {$l_7$\ :\ NNS}; \node [anchor=north,wnode,font=\footnotesize] (w7) at ([xshift=0em,yshift=-0.7em]sr4.south) {$l_7$\ :\ NNS};
\node [anchor=south,circle,draw,minimum size=1.2em] (c1) at ([xshift=2.5em,yshift=2em]wr2.north){}; \node [anchor=south,circle,draw,minimum size=1.2em] (c1) at ([xshift=2.5em,yshift=1.5em]wr2.north){};
\node [anchor=west,circle,draw,minimum size=1.2em] (c2) at ([xshift=8em,yshift=0em]c1.east){}; \node [anchor=west,circle,draw,minimum size=1.2em] (c2) at ([xshift=8em,yshift=0em]c1.east){};
\node [anchor=west,circle,draw,minimum size=1.2em] (c3) at ([xshift=8em,yshift=0em]c2.east){}; \node [anchor=west,circle,draw,minimum size=1.2em] (c3) at ([xshift=8em,yshift=0em]c2.east){};
\node [anchor=south,srnode] (m1) at ([xshift=0em,yshift=2em]c1.north) {$\mathbi{h}_{l_1}$}; \node [anchor=south,srnode] (m1) at ([xshift=0em,yshift=1em]c1.north) {$\mathbi{h}_{l_1}$};
\node [anchor=south,wrnode] (m2) at ([xshift=0em,yshift=0em]m1.north) {$\mathbi{h}_{w_1}$}; \node [anchor=south,wrnode] (m2) at ([xshift=0em,yshift=0em]m1.north) {$\mathbi{h}_{w_1}$};
\node [anchor=south,srnode] (m3) at ([xshift=0em,yshift=2em]c2.north) {$\mathbi{h}_{l_5}$}; \node [anchor=south,srnode] (m3) at ([xshift=0em,yshift=1em]c2.north) {$\mathbi{h}_{l_5}$};
\node [anchor=south,wrnode] (m4) at ([xshift=0em,yshift=0em]m3.north) {$\mathbi{h}_{w_2}$}; \node [anchor=south,wrnode] (m4) at ([xshift=0em,yshift=0em]m3.north) {$\mathbi{h}_{w_2}$};
\node [anchor=south,srnode] (m5) at ([xshift=0em,yshift=2em]c3.north) {$\mathbi{h}_{l_7}$}; \node [anchor=south,srnode] (m5) at ([xshift=0em,yshift=1em]c3.north) {$\mathbi{h}_{l_7}$};
\node [anchor=south,wrnode] (m6) at ([xshift=0em,yshift=0em]m5.north) {$\mathbi{h}_{w_3}$}; \node [anchor=south,wrnode] (m6) at ([xshift=0em,yshift=0em]m5.north) {$\mathbi{h}_{w_3}$};
...@@ -55,10 +55,10 @@ ...@@ -55,10 +55,10 @@
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [rectangle,inner sep=0.5em,draw=blue!80,dashed,very thick,rounded corners=10pt] [fit = (wr1) (wr3) (w1) (w3)] (box1) {}; \node [rectangle,inner sep=0.5em,draw=blue!80,dashed,very thick,rounded corners=10pt] [fit = (wr1) (wr3) (w1) (w3)] (box1) {};
\node [rectangle,inner sep=0.5em,draw=yellow!80,dashed,very thick,rounded corners=10pt] [fit = (sr1) (sr4) (w4) (w7)] (box2) {}; \node [rectangle,inner sep=0.5em,draw=orange!80,dashed,very thick,rounded corners=10pt] [fit = (sr1) (sr4) (w4) (w7)] (box2) {};
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m1) (m2)] (box3) {}; \node [rectangle,inner sep=0.5em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m1) (m2)] (box3) {};
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m3) (m4)] (box4) {}; \node [rectangle,inner sep=0.5em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m3) (m4)] (box4) {};
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m5) (m6)] (box5) {}; \node [rectangle,inner sep=0.5em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (m5) (m6)] (box5) {};
\end{pgfonlayer} \end{pgfonlayer}
\node [anchor=south,wnode] (h1) at ([xshift=0em,yshift=0.1em]box3.north) {${\mathbi{h}'}_1$\ :\ }; \node [anchor=south,wnode] (h1) at ([xshift=0em,yshift=0.1em]box3.north) {${\mathbi{h}'}_1$\ :\ };
...@@ -73,9 +73,9 @@ ...@@ -73,9 +73,9 @@
\draw [->,thick] ([xshift=0em,yshift=0em]w6.north) -- ([xshift=0em,yshift=0em]sr3.south); \draw [->,thick] ([xshift=0em,yshift=0em]w6.north) -- ([xshift=0em,yshift=0em]sr3.south);
\draw [->,thick] ([xshift=0em,yshift=0em]w7.north) -- ([xshift=0em,yshift=0em]sr4.south); \draw [->,thick] ([xshift=0em,yshift=0em]w7.north) -- ([xshift=0em,yshift=0em]sr4.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot4.north) -- ([xshift=0em,yshift=-0.7em]dot1.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot4.north) -- ([xshift=0em,yshift=-0.7em]dot1.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot5.north) -- ([xshift=0em,yshift=-0.7em]dot2.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot5.north) -- ([xshift=0em,yshift=-0.7em]dot2.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot6.north) -- ([xshift=0em,yshift=-0.7em]dot3.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot6.north) -- ([xshift=0em,yshift=-0.7em]dot3.south);
\draw [<->,thick] ([xshift=0em,yshift=0em]wr1.east) -- ([xshift=0em,yshift=0em]wr2.west); \draw [<->,thick] ([xshift=0em,yshift=0em]wr1.east) -- ([xshift=0em,yshift=0em]wr2.west);
\draw [<->,thick] ([xshift=0em,yshift=0em]wr2.east) -- ([xshift=0em,yshift=0em]wr3.west); \draw [<->,thick] ([xshift=0em,yshift=0em]wr2.east) -- ([xshift=0em,yshift=0em]wr3.west);
...@@ -96,13 +96,13 @@ ...@@ -96,13 +96,13 @@
\draw[->,thick] ([xshift=0em,yshift=-0em]wr3.north)..controls +(north:2em) and +(south:1em)..([xshift=-0em,yshift=-0em]c3.south west) ; \draw[->,thick] ([xshift=0em,yshift=-0em]wr3.north)..controls +(north:2em) and +(south:1em)..([xshift=-0em,yshift=-0em]c3.south west) ;
\draw[->,thick] ([xshift=0em,yshift=-0em]sr4.north)..controls +(north:2em) and +(east:0em)..([xshift=-0em,yshift=-0em]c3.east) ; \draw[->,thick] ([xshift=0em,yshift=-0em]sr4.north)..controls +(north:2em) and +(east:0em)..([xshift=-0em,yshift=-0em]c3.east) ;
\draw [->,thick] ([xshift=0em,yshift=0em]c1.north) -- ([xshift=0em,yshift=0em]box3.south); \draw [->,thick] ([xshift=0em,yshift=0em]c1.north) -- ([xshift=0em,yshift=0em]m1.south);
\draw [->,thick] ([xshift=0em,yshift=0em]c2.north) -- ([xshift=0em,yshift=0em]box4.south); \draw [->,thick] ([xshift=0em,yshift=0em]c2.north) -- ([xshift=0em,yshift=0em]m3.south);
\draw [->,thick] ([xshift=0em,yshift=0em]c3.north) -- ([xshift=0em,yshift=0em]box5.south); \draw [->,thick] ([xshift=0em,yshift=0em]c3.north) -- ([xshift=0em,yshift=0em]m5.south);
\node [anchor=north] (r1) at ([xshift=0em,yshift=-1em]w2.south) {Word RNN}; \node [anchor=north,font=\small] (r1) at ([xshift=0em,yshift=-1em]w2.south) {Word RNN};
\node [anchor=north] (r2) at ([xshift=3em,yshift=-1em]w5.south) {Syntax RNN}; \node [anchor=north,font=\small] (r2) at ([xshift=3em,yshift=-1em]w5.south) {Syntax RNN};
\node [anchor=north] (label1) at ([xshift=0em,yshift=-4em]dot4.south) {(a) Parallel structure}; \node [anchor=north,font=\small] (label1) at ([xshift=0em,yshift=-3em]dot4.south) {(a) Parallel structure};
\end{scope} \end{scope}
} }
......
...@@ -4,7 +4,7 @@ ...@@ -4,7 +4,7 @@
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{wrnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=blue!30] \tikzstyle{wrnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=blue!30]
\tikzstyle{srnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=yellow!30] \tikzstyle{srnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=orange!30]
\tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em] \tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em]
\tikzstyle{wnode}=[inner sep=0mm,minimum height=1.8em] \tikzstyle{wnode}=[inner sep=0mm,minimum height=1.8em]
...@@ -48,9 +48,9 @@ ...@@ -48,9 +48,9 @@
\node [anchor=south,wnode] (w10) at ([xshift=0em,yshift=0.5em]c3.north) {$\mathbi{e}_{w_2}$}; \node [anchor=south,wnode] (w10) at ([xshift=0em,yshift=0.5em]c3.north) {$\mathbi{e}_{w_2}$};
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=ugreen!20,rounded corners=8pt] [fit = (c1) (w8)] (box6) {}; \node [rectangle,minimum height=5em,inner sep=0.6em,fill=green!20,rounded corners=8pt] [fit = (c1) (w8)] (box6) {};
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=ugreen!20,rounded corners=8pt] [fit = (c2) (w9)] (box7) {}; \node [rectangle,minimum height=5em,inner sep=0.6em,fill=green!20,rounded corners=8pt] [fit = (c2) (w9)] (box7) {};
\node [rectangle,minimum height=5em,inner sep=0.6em,fill=ugreen!20,rounded corners=8pt] [fit = (c3) (w10)] (box8) {}; \node [rectangle,minimum height=5em,inner sep=0.6em,fill=green!20,rounded corners=8pt] [fit = (c3) (w10)] (box8) {};
\end{pgfonlayer} \end{pgfonlayer}
\node [anchor=south,wrnode] (wr1) at ([xshift=0em,yshift=1em]box6.north) {$\mathbi{h}_{w_1}$}; \node [anchor=south,wrnode] (wr1) at ([xshift=0em,yshift=1em]box6.north) {$\mathbi{h}_{w_1}$};
...@@ -63,7 +63,7 @@ ...@@ -63,7 +63,7 @@
\begin{pgfonlayer}{background} \begin{pgfonlayer}{background}
\node [rectangle,minimum width=20em,minimum height=13em,inner sep=0.5em,draw=blue!80,dashed,very thick,rounded corners=10pt] [fit = (h1) (w1) (h3) (c3)] (box1) {}; \node [rectangle,minimum width=20em,minimum height=13em,inner sep=0.5em,draw=blue!80,dashed,very thick,rounded corners=10pt] [fit = (h1) (w1) (h3) (c3)] (box1) {};
\node [rectangle,inner sep=0.5em,draw=yellow!80,dashed,very thick,rounded corners=10pt] [fit = (sr1) (sr4) (w4) (w7)] (box2) {}; \node [rectangle,inner sep=0.5em,draw=orange!80,dashed,very thick,rounded corners=10pt] [fit = (sr1) (sr4) (w4) (w7)] (box2) {};
\node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr1)] (box3) {}; \node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr1)] (box3) {};
\node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr2)] (box4) {}; \node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr2)] (box4) {};
\node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr3)] (box5) {}; \node [rectangle,inner sep=0.4em,fill=gray!20,draw=black,dashed,very thick,rounded corners=8pt] [fit = (wr3)] (box5) {};
......
...@@ -3,12 +3,12 @@ ...@@ -3,12 +3,12 @@
\begin{center} \begin{center}
\begin{tikzpicture} \begin{tikzpicture}
\tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=1.8em,minimum width=3em,rounded corners=5pt,fill=red!30] \tikzstyle{hnode}=[rectangle,inner sep=0mm,minimum height=1.6em,minimum width=3em,rounded corners=5pt,fill=red!30]
\tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em] \tikzstyle{dotnode}=[inner sep=0mm,minimum height=0.5em,minimum width=1.5em]
\tikzstyle{wnode}=[inner sep=0mm,minimum height=1.8em] \tikzstyle{wnode}=[inner sep=0mm,minimum height=1.6em]
{\small {\small
\begin{scope}[] \begin{scope}[scale=1]
\tikzstyle{every node}=[scale=1]
\node [anchor=west,hnode] (n1) at (0,0) {$\mathbi{h}_{1}$}; \node [anchor=west,hnode] (n1) at (0,0) {$\mathbi{h}_{1}$};
\node [anchor=west,hnode] (n2) at ([xshift=1em,yshift=0em]n1.east) {$\mathbi{h}_{2}$}; \node [anchor=west,hnode] (n2) at ([xshift=1em,yshift=0em]n1.east) {$\mathbi{h}_{2}$};
\node [anchor=west,dotnode] (dot1) at ([xshift=1em,yshift=0em]n2.east) {$\cdots$}; \node [anchor=west,dotnode] (dot1) at ([xshift=1em,yshift=0em]n2.east) {$\cdots$};
...@@ -18,14 +18,14 @@ ...@@ -18,14 +18,14 @@
\node [anchor=west,dotnode] (dot3) at ([xshift=1em,yshift=0em]n4.east) {$\cdots$}; \node [anchor=west,dotnode] (dot3) at ([xshift=1em,yshift=0em]n4.east) {$\cdots$};
\node [anchor=west,hnode] (n5) at ([xshift=1em,yshift=0em]dot3.east) {$\mathbi{h}_{10}$}; \node [anchor=west,hnode] (n5) at ([xshift=1em,yshift=0em]dot3.east) {$\mathbi{h}_{10}$};
\node [anchor=north,wnode,font=\footnotesize] (w1) at ([xshift=0em,yshift=-1em]n1.south) {$l_1$\ :\ S}; \node [anchor=north,wnode,font=\footnotesize] (w1) at ([xshift=0em,yshift=-0.7em]n1.south) {$l_1$\ :\ S};
\node [anchor=north,wnode,font=\footnotesize] (w2) at ([xshift=0em,yshift=-1em]n2.south) {$l_3$\ :\ NP}; \node [anchor=north,wnode,font=\footnotesize] (w2) at ([xshift=0em,yshift=-0.7em]n2.south) {$l_3$\ :\ NP};
\node [anchor=north,dotnode] (dot4) at ([xshift=0em,yshift=-2.4em]dot1.south) {$\cdots$}; \node [anchor=north,dotnode] (dot4) at ([xshift=0em,yshift=-2em]dot1.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w3) at ([xshift=0em,yshift=-1em]n3.south) {$w_1$\ :\ I}; \node [anchor=north,wnode,font=\footnotesize] (w3) at ([xshift=0em,yshift=-0.7em]n3.south) {$w_1$\ :\ I};
\node [anchor=north,dotnode] (dot5) at ([xshift=0em,yshift=-2.2em]dot2.south) {$\cdots$}; \node [anchor=north,dotnode] (dot5) at ([xshift=0em,yshift=-2em]dot2.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w4) at ([xshift=0em,yshift=-1em]n4.south) {$w_2$\ :\ love}; \node [anchor=north,wnode,font=\footnotesize] (w4) at ([xshift=0em,yshift=-0.7em]n4.south) {$w_2$\ :\ love};
\node [anchor=north,dotnode] (dot6) at ([xshift=0em,yshift=-2.3em]dot3.south) {$\cdots$}; \node [anchor=north,dotnode] (dot6) at ([xshift=0em,yshift=-2em]dot3.south) {$\cdots$};
\node [anchor=north,wnode,font=\footnotesize] (w5) at ([xshift=0em,yshift=-1em]n5.south) {$w_3$\ :\ dogs}; \node [anchor=north,wnode,font=\footnotesize] (w5) at ([xshift=0em,yshift=-0.7em]n5.south) {$w_3$\ :\ dogs};
\node [anchor=south,wnode] (h1) at ([xshift=0em,yshift=0.3em]n3.north) {${\mathbi{h}'}_1$\ :\ }; \node [anchor=south,wnode] (h1) at ([xshift=0em,yshift=0.3em]n3.north) {${\mathbi{h}'}_1$\ :\ };
...@@ -41,10 +41,10 @@ ...@@ -41,10 +41,10 @@
\end{pgfonlayer} \end{pgfonlayer}
\node [anchor=east] (r1) at ([xshift=-2em,yshift=0em]box1.west) {Word RNN}; \node [anchor=east,font=\small] (r1) at ([xshift=-2em,yshift=0em]box1.west) {Hybrid RNN};
\node [anchor=south west,wnode] (l1) at ([xshift=1em,yshift=6em]r1.north west) {Pre-order traversal of the syntax tree gives the sequence:}; {\small
\node [anchor=south west,wnode] (l1) at ([xshift=1em,yshift=5em]r1.north west) {Pre-order traversal of the syntax tree gives the sequence:};
\node [anchor=north west,wnode,align=center] (l2) at ([xshift=0.5em,yshift=-0.6em]l1.north east) {S\\[0.5em]$l_1$}; \node [anchor=north west,wnode,align=center] (l2) at ([xshift=0.5em,yshift=-0.6em]l1.north east) {S\\[0.5em]$l_1$};
\node [anchor=north west,wnode,align=center] (l3) at ([xshift=0.5em,yshift=0em]l2.north east) {NP\\[0.5em]$l_2$}; \node [anchor=north west,wnode,align=center] (l3) at ([xshift=0.5em,yshift=0em]l2.north east) {NP\\[0.5em]$l_2$};
\node [anchor=north west,wnode,align=center] (l4) at ([xshift=0.5em,yshift=0em]l3.north east) {PRN\\[0.5em]$l_3$}; \node [anchor=north west,wnode,align=center] (l4) at ([xshift=0.5em,yshift=0em]l3.north east) {PRN\\[0.5em]$l_3$};
...@@ -55,7 +55,7 @@ ...@@ -55,7 +55,7 @@
\node [anchor=north west,wnode,align=center] (l9) at ([xshift=0.5em,yshift=0em]l8.north east) {NP\\[0.5em]$l_6$}; \node [anchor=north west,wnode,align=center] (l9) at ([xshift=0.5em,yshift=0em]l8.north east) {NP\\[0.5em]$l_6$};
\node [anchor=north west,wnode,align=center] (l10) at ([xshift=0.5em,yshift=0em]l9.north east) {NNS\\[0.5em]$l_7$}; \node [anchor=north west,wnode,align=center] (l10) at ([xshift=0.5em,yshift=0em]l9.north east) {NNS\\[0.5em]$l_7$};
\node [anchor=north west,wnode,align=center] (l11) at ([xshift=0.5em,yshift=0em]l10.north east) {dogs\\[0.5em]$w_3$}; \node [anchor=north west,wnode,align=center] (l11) at ([xshift=0.5em,yshift=0em]l10.north east) {dogs\\[0.5em]$w_3$};
}
\draw [->,thick] ([xshift=0em,yshift=0em]w1.north) -- ([xshift=0em,yshift=0em]n1.south); \draw [->,thick] ([xshift=0em,yshift=0em]w1.north) -- ([xshift=0em,yshift=0em]n1.south);
...@@ -65,9 +65,9 @@ ...@@ -65,9 +65,9 @@
\draw [->,thick] ([xshift=0em,yshift=0em]w5.north) -- ([xshift=0em,yshift=0em]n5.south); \draw [->,thick] ([xshift=0em,yshift=0em]w5.north) -- ([xshift=0em,yshift=0em]n5.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot4.north) -- ([xshift=0em,yshift=-0.7em]dot1.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot4.north) -- ([xshift=0em,yshift=-0.7em]dot1.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot5.north) -- ([xshift=0em,yshift=-0.7em]dot2.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot5.north) -- ([xshift=0em,yshift=-0.7em]dot2.south);
\draw [->,thick] ([xshift=0em,yshift=0.7em]dot6.north) -- ([xshift=0em,yshift=-0.7em]dot3.south); \draw [->,thick] ([xshift=0em,yshift=0.6em]dot6.north) -- ([xshift=0em,yshift=-0.7em]dot3.south);
\draw [<->,thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west); \draw [<->,thick] ([xshift=0em,yshift=0em]n1.east) -- ([xshift=0em,yshift=0em]n2.west);
...@@ -79,7 +79,7 @@ ...@@ -79,7 +79,7 @@
\draw [<->,thick] ([xshift=0em,yshift=0em]dot3.east) -- ([xshift=0em,yshift=0em]n5.west); \draw [<->,thick] ([xshift=0em,yshift=0em]dot3.east) -- ([xshift=0em,yshift=0em]n5.west);
\node [anchor=north] (label2) at ([xshift=-2em,yshift=-2em]w3.south) {(c) Hybrid structure}; \node [anchor=north,font=\small] (label2) at ([xshift=-2em,yshift=-1em]w3.south) {(c) Hybrid structure};
\end{scope} \end{scope}
} }
......
...@@ -88,11 +88,11 @@ ...@@ -88,11 +88,11 @@
%---------------------------------------------- %----------------------------------------------
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
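\item Word dropout: each word in the sentence is dropped with probability $\funp{P}_{\rm{Drop}}$. \item {\small\bfnew{Word dropout}}: each word in the sentence is dropped with probability $\funp{P}_{\rm{Drop}}$.
\vspace{0.5em} \vspace{0.5em}
\item Word masking: each word in the sentence is replaced with probability $\funp{P}_{\rm{Mask}}$ by an extra <Mask> token. <Mask> acts as a placeholder: some words of the sentence are hidden, so the exact meaning of the word at that position cannot be known. \item {\small\bfnew{Word masking}}: each word in the sentence is replaced with probability $\funp{P}_{\rm{Mask}}$ by an extra <Mask> token. <Mask> acts as a placeholder: some words of the sentence are hidden, so the exact meaning of the word at that position cannot be known.
\vspace{0.5em} \vspace{0.5em}
\item Word shuffling: the positions of some words that are close to each other in the sentence are randomly swapped. \item {\small\bfnew{Word shuffling}}: the positions of some words that are close to each other in the sentence are randomly swapped.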
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
%---------------------------------------------- %----------------------------------------------
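The three noising operations in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `max_shift` parameter, and the way the local shuffle is realised (sorting by index plus a bounded random offset) are choices made here, not something the text prescribes.

```python
import random

MASK = "<Mask>"  # placeholder token named in the text


def drop_words(words, p_drop, rng):
    """Drop each word independently with probability p_drop."""
    return [w for w in words if rng.random() >= p_drop]


def mask_words(words, p_mask, rng):
    """Replace each word by <Mask> independently with probability p_mask."""
    return [MASK if rng.random() < p_mask else w for w in words]


def shuffle_words(words, max_shift, rng):
    """Local shuffle (assumed scheme): sort words by index + U(0, max_shift).

    A word can only trade places with neighbours at most max_shift
    positions away, so only nearby words are swapped.
    """
    keys = [i + rng.uniform(0, max_shift) for i in range(len(words))]
    return [w for _, w in sorted(zip(keys, words), key=lambda kw: kw[0])]
```

Passing an explicit `random.Random(seed)` as `rng` keeps the noise reproducible, which is convenient when the same noised corpus has to be regenerated.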
...@@ -112,11 +112,11 @@ ...@@ -112,11 +112,11 @@
%---------------------------------------------- %----------------------------------------------
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
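\item Adding noise to monolingual data. An end-to-end model predicts the reordering of the source-language sentence; this model shares parameters with the encoder of the neural machine translation model, which strengthens the encoder's feature extraction ability\upcite{DBLP:conf/emnlp/ZhangZ16}; \item {\small\bfnew{Adding noise to monolingual data}}. An end-to-end model predicts the reordering of the source-language sentence; this model shares parameters with the encoder of the neural machine translation model, which strengthens the encoder's feature extraction ability\upcite{DBLP:conf/emnlp/ZhangZ16};
\vspace{0.5em} \vspace{0.5em}
\item Training a denoising autoencoder. The noised sentence is the input and the original sentence is the output; this idea is widely used in unsupervised machine translation, and the details can be found in Section~\ref{unsupervised-NMT}; \item {\small\bfnew{Training a denoising autoencoder}}. The noised sentence is the input and the original sentence is the output; this idea is widely used in unsupervised machine translation, and the details can be found in Section~\ref{unsupervised-NMT};
\vspace{0.5em} \vspace{0.5em}
\item Adding noise to pseudo data. For example, the methods mentioned above that add noise to pseudo data usually also use these three noising operations to increase the diversity of the pseudo data. \item {\small\bfnew{Adding noise to pseudo data}}. For example, the methods mentioned above that add noise to pseudo data usually also use these three noising operations to increase the diversity of the pseudo data.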
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
%---------------------------------------------- %----------------------------------------------
...@@ -512,9 +512,9 @@ ...@@ -512,9 +512,9 @@
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
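\item Unsupervised distribution matching. This step uses unsupervised methods to obtain a noisy initial dictionary $D$. \item {\small\bfnew{Unsupervised distribution matching}}. This step uses unsupervised methods to obtain a noisy initial dictionary $D$.
\vspace{0.5em} \vspace{0.5em}
\item Supervised fine-tuning. Using the two monolingual word embeddings and the seed dictionary learned in the first step, an alignment algorithm is applied to refine the mapping iteratively, for example {\small\bfnew{Procrustes analysis}}\index{Procrustes Analysis}\upcite{1966ASchnemann}. \item {\small\bfnew{Supervised fine-tuning}}. Using the two monolingual word embeddings and the seed dictionary learned in the first step, an alignment algorithm is applied to refine the mapping iteratively, for example {\small\bfnew{Procrustes analysis}}\index{Procrustes Analysis}\upcite{1966ASchnemann}.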
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
...@@ -542,9 +542,9 @@ ...@@ -542,9 +542,9 @@
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
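\item Methods based on generative adversarial networks\upcite{DBLP:conf/iclr/LampleCRDJ18,DBLP:conf/acl/ZhangLLS17,DBLP:conf/emnlp/XuYOW18,DBLP:conf/naacl/MohiuddinJ19}. A generator produces the mapping $\mathbi{W}$ and a discriminator tries to distinguish randomly sampled elements of $\mathbi{W} \mathbi{X}$ from those of $\mathbi{Y}$; once the two have been jointly optimized to convergence, the mapping $\mathbi{W}$ is obtained. \item {\small\bfnew{Methods based on generative adversarial networks}}\upcite{DBLP:conf/iclr/LampleCRDJ18,DBLP:conf/acl/ZhangLLS17,DBLP:conf/emnlp/XuYOW18,DBLP:conf/naacl/MohiuddinJ19}. A generator produces the mapping $\mathbi{W}$ and a discriminator tries to distinguish randomly sampled elements of $\mathbi{W} \mathbi{X}$ from those of $\mathbi{Y}$; once the two have been jointly optimized to convergence, the mapping $\mathbi{W}$ is obtained.
\vspace{0.5em} \vspace{0.5em}
\item Methods based on the Gromov-Wasserstein distance\upcite{DBLP:conf/emnlp/Alvarez-MelisJ18,DBLP:conf/lrec/GarneauGBDL20,DBLP:journals/corr/abs-1811-01124,DBLP:conf/emnlp/XuYOW18}. The Wasserstein distance is a function that defines the distance between two probability distributions on a metric space. In this task it measures the similarity between word pairs in different languages; the approximate isomorphism of the two spaces can be used to define an objective function, and optimizing that objective also yields the mapping $\mathbi{W}$. \item {\small\bfnew{Methods based on the Gromov-Wasserstein distance}}\upcite{DBLP:conf/emnlp/Alvarez-MelisJ18,DBLP:conf/lrec/GarneauGBDL20,DBLP:journals/corr/abs-1811-01124,DBLP:conf/emnlp/XuYOW18}. The Wasserstein distance is a function that defines the distance between two probability distributions on a metric space. In this task it measures the similarity between word pairs in different languages; the approximate isomorphism of the two spaces can be used to define an objective function, and optimizing that objective also yields the mapping $\mathbi{W}$.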
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
...@@ -675,10 +675,10 @@ ...@@ -675,10 +675,10 @@
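\parinterval Unsupervised neural machine translation relies on two further key techniques: \parinterval Unsupervised neural machine translation relies on two further key techniques:
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
\item Vocabulary sharing: words that are identical in the source and target languages, such as Arabic numerals or some entity names, use a single word embedding instead of one embedding per language. This tells the model that the word means the same thing in both languages and implicitly introduces a supervision signal for word translation. In unsupervised neural machine translation, vocabulary sharing works better in combination with subword segmentation, because subwords have wider coverage: for example, several different words can contain the same subword. \item {\small\bfnew{Vocabulary sharing}}: words that are identical in the source and target languages, such as Arabic numerals or some entity names, use a single word embedding instead of one embedding per language. This tells the model that the word means the same thing in both languages and implicitly introduces a supervision signal for word translation. In unsupervised neural machine translation, vocabulary sharing works better in combination with subword segmentation, because subwords have wider coverage: for example, several different words can contain the same subword.
\vspace{0.5em} \vspace{0.5em}
\item Model sharing: as in multilingual translation systems, a single translation model performs both forward translation (source language $\to$ target language) and backward translation (target language $\to$ source language). This reduces the number of model parameters. In addition, the two translation directions regularize each other, which lowers the risk of overfitting. \item {\small\bfnew{Model sharing}}: as in multilingual translation systems, a single translation model performs both forward translation (source language $\to$ target language) and backward translation (target language $\to$ source language). This reduces the number of model parameters. In addition, the two translation directions regularize each other, which lowers the risk of overfitting.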
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
...@@ -752,9 +752,9 @@ ...@@ -752,9 +752,9 @@
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
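\item Data-based methods. Bilingual data from the source domain or monolingual data from the target domain is used for data selection or data augmentation, increasing the amount of data available for training. \item {\small\bfnew{Data-based methods}}. Bilingual data from the source domain or monolingual data from the target domain is used for data selection or data augmentation, increasing the amount of data available for training.
\vspace{0.5em} \vspace{0.5em}
\item Model-based methods. Model architectures, training strategies, and inference methods are designed specifically for domain adaptation. \item {\small\bfnew{Model-based methods}}. Model architectures, training strategies, and inference methods are designed specifically for domain adaptation.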
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
......
...@@ -17,10 +17,10 @@ ...@@ -17,10 +17,10 @@
\node[anchor=south,font=\footnotesize,inner sep=0pt] at ([yshift=0.2em]value.north){value}; \node[anchor=south,font=\footnotesize,inner sep=0pt] at ([yshift=0.2em]value.north){value};
\node[anchor=south,font=\footnotesize,inner sep=0pt] (cache)at ([yshift=2em,xshift=1.5em]key.north){\small\bfnew{Cache}}; \node[anchor=south,font=\footnotesize,inner sep=0pt] (cache)at ([yshift=2em,xshift=1.5em]key.north){\small\bfnew{Cache}};
\node[draw,anchor=east,minimum size=1.8em,fill=orange!15] (dt) at ([yshift=2.1em,xshift=-4em]key.west){${\mathbi{d}}_{t}$}; \node[draw,anchor=east,thick,minimum size=1.8em,fill=orange!15] (dt) at ([yshift=2.1em,xshift=-4em]key.west){${\mathbi{d}}_{t}$};
\node[anchor=north,font=\footnotesize] (readlab) at ([xshift=2.8em,yshift=0.3em]dt.north){\red{Read}}; \node[anchor=north,font=\footnotesize] (readlab) at ([xshift=2.8em,yshift=0.3em]dt.north){\red{Read}};
\node[draw,anchor=east,minimum size=1.8em,fill=ugreen!15] (st) at ([xshift=-3.7em]dt.west){${\mathbi{s}}_{t}$}; \node[draw,anchor=east,thick,minimum size=1.8em,fill=ugreen!15] (st) at ([xshift=-3.7em]dt.west){${\mathbi{s}}_{t}$};
\node[draw,anchor=east,minimum size=1.8em,fill=red!15] (st2) at ([xshift=-0.85em,yshift=3.5em]dt.west){$ \widetilde{\mathbi{s}}_{t}$}; \node[draw,anchor=east,thick,minimum size=1.8em,fill=red!15] (st2) at ([xshift=-0.85em,yshift=3.5em]dt.west){$ \widetilde{\mathbi{s}}_{t}$};
%\node[draw,anchor=north,circle,inner sep=0pt, minimum size=1.2em,fill=yellow] (add) at ([yshift=-1em]st2.south){+}; %\node[draw,anchor=north,circle,inner sep=0pt, minimum size=1.2em,fill=yellow] (add) at ([yshift=-1em]st2.south){+};
\node[draw,thick,inner sep=0pt, minimum size=1.1em, circle] (add) at ([yshift=-1.5em]st2.south){}; \node[draw,thick,inner sep=0pt, minimum size=1.1em, circle] (add) at ([yshift=-1.5em]st2.south){};
...@@ -29,7 +29,7 @@ ...@@ -29,7 +29,7 @@
\node[anchor=north,inner sep=0pt,font=\footnotesize,text=red] at ([xshift=-0em,yshift=-0.5em]add.south){Fusion}; \node[anchor=north,inner sep=0pt,font=\footnotesize,text=red] at ([xshift=-0em,yshift=-0.5em]add.south){Fusion};
\node[draw,anchor=east,minimum size=1.8em,fill=yellow!15] (ct) at ([xshift=-2em,yshift=-3.5em]st.west){$ {\mathbi{C}}_{t}$}; \node[draw,anchor=east,thick,minimum size=1.8em,fill=yellow!15] (ct) at ([xshift=-2em,yshift=-3.5em]st.west){$ {\mathbi{C}}_{t}$};
\node[anchor=north,font=\footnotesize] (matchlab) at ([xshift=6.7em,yshift=-0.1em]ct.north){\red{Match}}; \node[anchor=north,font=\footnotesize] (matchlab) at ([xshift=6.7em,yshift=-0.1em]ct.north){\red{Match}};
\node[anchor=east] (y) at ([xshift=-6em,yshift=1em]st.west){$\mathbi{y}_{t-1}$}; \node[anchor=east] (y) at ([xshift=-6em,yshift=1em]st.west){$\mathbi{y}_{t-1}$};
...@@ -53,12 +53,12 @@ ...@@ -53,12 +53,12 @@
%node[above,font=\footnotesize,text=red,rotate=25]{reading} %node[above,font=\footnotesize,text=red,rotate=25]{reading}
\draw[-latex,dashed,very thick,out=-5,in=-170] (ct.0) to ([yshift=-2.5em]box.180); \draw[-latex,dashed,very thick,out=-5,in=-170] (ct.0) to ([yshift=-2.5em]box.180);
%node[above,font=\footnotesize,text=red,pos=0.7,rotate=8]{matching} %node[above,font=\footnotesize,text=red,pos=0.7,rotate=8]{matching}
\draw[-,very thick,out=0,in=-135](st.0) to (add.-135); \draw[-,thick,out=0,in=-135](st.0) to (add.-135);
\draw[-,very thick,out=180,in=-45](dt.180) to (add.-45); \draw[-,thick,out=180,in=-45](dt.180) to (add.-45);
\draw[-latex,very thick] (add.90) -- (st2.-90); \draw[-latex,thick] (add.90) -- (st2.-90);
\draw[-latex,very thick,out=100,in=-100] (ct.90) to (output.-90); \draw[-latex,thick,out=100,in=-100] (ct.90) to (output.-90);
\draw[-latex,very thick,out=180,in=-100] (st2.180) to (output.-90); \draw[-latex,thick,out=180,in=-100] (st2.180) to (output.-90);
\draw[-latex,very thick,out=80,in=-100] (y.90) to (output.-90); \draw[-latex,thick,out=80,in=-100] (y.90) to (output.-90);
\draw[-latex,very thick] (output.90) -- ([yshift=1em]output.90); \draw[-latex,thick] (output.90) -- ([yshift=1em]output.90);
\draw[-latex,very thick] ([yshift=-1.2em]yt.-90) -- (yt.-90); \draw[-latex,thick] ([yshift=-1.2em]yt.-90) -- (yt.-90);
\end{tikzpicture} \end{tikzpicture}
\ No newline at end of file
...@@ -160,11 +160,11 @@ ...@@ -160,11 +160,11 @@
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
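\item Error propagation. A serious problem with cascaded models is that errors in the text produced by the speech recognition model are likely to be amplified during translation, so that the final translation deviates substantially. For example, if recognition omits the sentence-final particle “吗”, the translation model renders a question as a statement. \item {\small\bfnew{Error propagation}}. A serious problem with cascaded models is that errors in the text produced by the speech recognition model are likely to be amplified during translation, so that the final translation deviates substantially. For example, if recognition omits the sentence-final particle “吗”, the translation model renders a question as a statement.
\vspace{0.5em} \vspace{0.5em}
\item Translation efficiency. Because the speech recognition model and the text labeling model can only run serially, translation is relatively slow, whereas many real-world scenarios require low-latency translation. \item {\small\bfnew{Translation efficiency}}. Because the speech recognition model and the text labeling model can only run serially, translation is relatively slow, whereas many real-world scenarios require low-latency translation.
\vspace{0.5em} \vspace{0.5em}
\item Loss of paralinguistic information. When speech is recognized as text, information carried by the speech such as tone, emotion, and pitch is lost, even though the same sentence may mean different things when spoken in different tones. In practice, moreover, speech recognition output usually contains no punctuation, so an additional post-processing model is needed to restore it, which adds further computational cost. \item {\small\bfnew{Loss of paralinguistic information}}. When speech is recognized as text, information carried by the speech such as tone, emotion, and pitch is lost, even though the same sentence may mean different things when spoken in different tones. In practice, moreover, speech recognition output usually contains no punctuation, so an additional post-processing model is needed to restore it, which adds further computational cost.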
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
...@@ -199,9 +199,9 @@ ...@@ -199,9 +199,9 @@
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
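\item Scarce training data. Although plenty of training data exists for speech recognition and for text translation, parallel data that maps source-language speech directly to target-language text is very limited, so end-to-end speech translation is inherently a low-resource translation task. \item {\small\bfnew{Scarce training data}}. Although plenty of training data exists for speech recognition and for text translation, parallel data that maps source-language speech directly to target-language text is very limited, so end-to-end speech translation is inherently a low-resource translation task.
\vspace{0.5em} \vspace{0.5em}
\item Higher modeling complexity. In speech recognition the model learns to generate the text sequence corresponding to the speech; the alignment between input and output is fairly simple and involves no reordering. In text translation the model learns to generate the target-language sequence corresponding to a source-language sequence; it only has to learn the mapping between languages and involves no change of modality. A speech translation model must learn to generate target-language text from speech, which is a more complex task. \item {\small\bfnew{Higher modeling complexity}}. In speech recognition the model learns to generate the text sequence corresponding to the speech; the alignment between input and output is fairly simple and involves no reordering. In text translation the model learns to generate the target-language sequence corresponding to a source-language sequence; it only has to learn the mapping between languages and involves no change of modality. A speech translation model must learn to generate target-language text from speech, which is a more complex task.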
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
...@@ -231,9 +231,9 @@ ...@@ -231,9 +231,9 @@
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
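\item The alignment between input and output is monotonic. That is, a later input position only predicts output that is the same as, or comes after, what earlier positions predicted. In the example of Figure~\ref{fig:17-8}, once input position $t$ has predicted the character l, the positions after $t$ will not predict the earlier characters h and e. \item {\small\bfnew{The alignment between input and output is monotonic}}. That is, a later input position only predicts output that is the same as, or comes after, what earlier positions predicted. In the example of Figure~\ref{fig:17-8}, once input position $t$ has predicted the character l, the positions after $t$ will not predict the earlier characters h and e.
\vspace{0.5em} \vspace{0.5em}
\item The relation between input and output is many-to-one. That is, several inputs correspond to the same output. This is natural for speech sequences: each input position covers only a very short span of speech features, so it takes several inputs to make up one output character. \item {\small\bfnew{The relation between input and output is many-to-one}}. That is, several inputs correspond to the same output. This is natural for speech sequences: each input position covers only a very short span of speech features, so it takes several inputs to make up one output character.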
\vspace{0.5em} \vspace{0.5em}
\end{itemize} \end{itemize}
%---------------------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------------------
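The monotonic, many-to-one alignment described above can be made concrete with a small collapse function of the kind used in CTC-style models. This is a sketch under assumptions: the excerpt does not name a blank symbol, so `BLANK = "_"` is a hypothetical choice, and the frame-level label strings are made-up inputs.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol; the text does not name one


def ctc_collapse(frame_labels):
    """Map frame-level labels to an output string.

    Runs of identical labels are merged first (many frames -> one
    character), then blanks are removed. Order is never changed, so
    the alignment stays monotonic.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)
```

For example, `ctc_collapse("hhe_ll_lo")` merges the repeated frames and uses the blank to keep the two l characters apart, while `ctc_collapse("hheell")` yields a single l because nothing separates the repeated frames.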
...@@ -604,7 +604,7 @@ ...@@ -604,7 +604,7 @@
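\noindent The word-level and sentence-level attention models are then computed separately. Note that the sentence-level attention adds a feed-forward network sub-layer, FFN. The computation is as follows: \noindent The word-level and sentence-level attention models are then computed separately. Note that the sentence-level attention adds a feed-forward network sub-layer, FFN. The computation is as follows: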
\begin{eqnarray} \begin{eqnarray}
\mathbi{s}^j&=&\textrm{WordAttention}(\mathbi{q}_{w},\mathbi{h}^{j},\mathbi{h}^{j}) \mathbi{s}^k&=&\textrm{WordAttention}(\mathbi{q}_{w},\mathbi{h}^{k},\mathbi{h}^{k})
\label{eq:17-3-7}\\ \label{eq:17-3-7}\\
\mathbi{d}_t&=&\textrm{FFN}(\textrm{SentAttention}(\mathbi{q}_{s},\mathbi{s},\mathbi{s})) \mathbi{d}_t&=&\textrm{FFN}(\textrm{SentAttention}(\mathbi{q}_{s},\mathbi{s},\mathbi{s}))
\label{eq:17-3-9} \label{eq:17-3-9}
......
...@@ -11,7 +11,7 @@ ...@@ -11,7 +11,7 @@
\node [anchor=west] (eq2) at (eq1.east) {$=$\ }; \node [anchor=west] (eq2) at (eq1.east) {$=$\ };
\draw [-] ([xshift=0.3em]eq2.east) -- ([xshift=11.6em]eq2.east); \draw [-] ([xshift=0.3em]eq2.east) -- ([xshift=11.6em]eq2.east);
\node [anchor=south west] (eq3) at ([xshift=1em]eq2.east) {$\sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;s^{[k]},t^{[k]})$}; \node [anchor=south west] (eq3) at ([xshift=1em]eq2.east) {$\sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;s^{[k]},t^{[k]})$};
\node [anchor=north west] (eq4) at (eq2.east) {$\sum_{s_u} \sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;s^{[k]},t^{[k]})$}; \node [anchor=north west] (eq4) at (eq2.east) {$\sum_{s'_u} \sum_{k=1}^{K} c_{\mathbb{E}}(s'_u|t_v;s^{[k]},t^{[k]})$};
{ {
\node [anchor=south] (label1) at ([yshift=-6em,xshift=3em]eq1.north west) {Compute using this formula}; \node [anchor=south] (label1) at ([yshift=-6em,xshift=3em]eq1.north west) {Compute using this formula};
......
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
\node [anchor=north west] (line7) at ([yshift=-0.1em]line6.south west) {4: \quad \quad \textbf{foreach} $k = 1$ to $K$ \textbf{do}}; \node [anchor=north west] (line7) at ([yshift=-0.1em]line6.south west) {4: \quad \quad \textbf{foreach} $k = 1$ to $K$ \textbf{do}};
\node [anchor=north west] (line8) at ([yshift=-0.1em]line7.south west) {5: \quad \quad \quad \footnotesize{$c_{\mathbb{E}}(\seq{s}_u|\seq{t}_v;\seq{s}^{[k]},\seq{t}^{[k]}) = \sum\limits_{j=1}^{|\seq{s}^{[k]}|} \delta(s_j,s_u) \sum\limits_{i=0}^{|\seq{t}^{[k]}|} \delta(t_i,t_v) \cdot \frac{f(s_u|t_v)}{\sum_{i=0}^{l}f(s_u|t_i)}$}\normalsize{}}; \node [anchor=north west] (line8) at ([yshift=-0.1em]line7.south west) {5: \quad \quad \quad \footnotesize{$c_{\mathbb{E}}(\seq{s}_u|\seq{t}_v;\seq{s}^{[k]},\seq{t}^{[k]}) = \sum\limits_{j=1}^{|\seq{s}^{[k]}|} \delta(s_j,s_u) \sum\limits_{i=0}^{|\seq{t}^{[k]}|} \delta(t_i,t_v) \cdot \frac{f(s_u|t_v)}{\sum_{i=0}^{l}f(s_u|t_i)}$}\normalsize{}};
\node [anchor=north west] (line9) at ([yshift=-0.1em]line8.south west) {6: \quad \quad \textbf{foreach} $t_v$ that appears in at least one of $\{\seq{t}^{[1]},...,\seq{t}^{[K]}\}$ \textbf{do}}; \node [anchor=north west] (line9) at ([yshift=-0.1em]line8.south west) {6: \quad \quad \textbf{foreach} $t_v$ that appears in at least one of $\{\seq{t}^{[1]},...,\seq{t}^{[K]}\}$ \textbf{do}};
\node [anchor=north west] (line10) at ([yshift=-0.1em]line9.south west) {7: \quad \quad \quad $\lambda_{t_v}^{'} = \sum_{s_u} \sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;\seq{s}^{[k]},\seq{t}^{[k]})$}; \node [anchor=north west] (line10) at ([yshift=-0.1em]line9.south west) {7: \quad \quad \quad $\lambda_{t_v}^{'} = \sum_{s'_u} \sum_{k=1}^{K} c_{\mathbb{E}}(s'_u|t_v;\seq{s}^{[k]},\seq{t}^{[k]})$};
\node [anchor=north west] (line11) at ([yshift=-0.1em]line10.south west) {8: \quad \quad \quad \textbf{foreach} $s_u$ that appears in at least one of $\{\seq{s}^{[1]},...,\seq{s}^{[K]}\}$ \textbf{do}}; \node [anchor=north west] (line11) at ([yshift=-0.1em]line10.south west) {8: \quad \quad \quad \textbf{foreach} $s_u$ that appears in at least one of $\{\seq{s}^{[1]},...,\seq{s}^{[K]}\}$ \textbf{do}};
\node [anchor=north west] (line12) at ([yshift=-0.1em]line11.south west) {9: \quad \quad \quad \quad $f(s_u|t_v) = \sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;\seq{s}^{[k]},\seq{t}^{[k]}) \cdot (\lambda_{t_v}^{'})^{-1}$}; \node [anchor=north west] (line12) at ([yshift=-0.1em]line11.south west) {9: \quad \quad \quad \quad $f(s_u|t_v) = \sum_{k=1}^{K} c_{\mathbb{E}}(s_u|t_v;\seq{s}^{[k]},\seq{t}^{[k]}) \cdot (\lambda_{t_v}^{'})^{-1}$};
\node [anchor=north west] (line13) at ([yshift=-0.1em]line12.south west) {10: \textbf{return} $f(\cdot|\cdot)$}; \node [anchor=north west] (line13) at ([yshift=-0.1em]line12.south west) {10: \textbf{return} $f(\cdot|\cdot)$};
......
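The training loop in the algorithm listing above (expected counts $c_{\mathbb{E}}$, the normalizer $\lambda_{t_v}^{'}$, and the update of $f(s_u|t_v)$) can be sketched in Python as follows. It is a minimal illustration under assumptions: the function name `train_ibm1`, the `<null>` token standing in for $t_0$, the uniform initialization, and the fixed iteration count are choices made here, not taken from the listing.

```python
from collections import defaultdict


def train_ibm1(corpus, iterations=5):
    """EM estimate of f(s_u | t_v) from (source_words, target_words) pairs."""
    null = "<null>"  # hypothetical name for the empty word t_0
    pairs = [(s, [null] + t) for s, t in corpus]
    src_vocab = {w for s, _ in pairs for w in s}
    # Assumed initialization: f(s_u | t_v) is uniform over the source vocabulary.
    f = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c_E(s_u | t_v), summed over k
        total = defaultdict(float)  # lambda'_{t_v}: counts summed over all s'_u
        for s, t in pairs:
            for s_j in s:
                z = sum(f[(s_j, t_i)] for t_i in t)  # sum_i f(s_j | t_i)
                for t_i in t:
                    c = f[(s_j, t_i)] / z
                    count[(s_j, t_i)] += c
                    total[t_i] += c
        # f(s_u | t_v) = count / lambda'_{t_v}
        f = defaultdict(float, {(s_u, t_v): c / total[t_v]
                                for (s_u, t_v), c in count.items()})
    return f
```

The returned table is keyed by `(source_word, target_word)`. Because every target word is normalized by its own `total`, the values for a fixed target word sum to one over the source words it co-occurs with.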
...@@ -330,7 +330,7 @@ $\seq{t}^{[2]}$ = So\; ,\; what\; is\; human\; \underline{translation}\; ? ...@@ -330,7 +330,7 @@ $\seq{t}^{[2]}$ = So\; ,\; what\; is\; human\; \underline{translation}\; ?
\label{eq:5-7} \label{eq:5-7}
\end{eqnarray} \end{eqnarray}
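\parinterval Equation~\eqref{eq:5-7} amounts to normalizing the function $g(\cdot)$, so that the right-hand side has some properties of a probability, for example $0 \le \frac{g(\seq{s},\seq{t})}{\sum_{\seq{t'}}g(\seq{s},\seq{t'})} \le 1$. Specifically, for a source-language sentence $\seq{s}$, all of its translations are enumerated and the corresponding values of $g(\cdot)$ are summed to form the denominator, while the numerator is the value of $g(\cdot)$ for one particular translation $\seq{t}$. \parinterval Equation~\eqref{eq:5-7} amounts to normalizing the function $g(\cdot)$, so that the right-hand side has some properties of a probability, for example $0 \le \frac{g(\seq{s},\seq{t})}{\sum_{\seq{t'}}g(\seq{s},\seq{t'})} \le 1$. Specifically, for a source-language sentence $\seq{s}$, all of its translations are enumerated and the corresponding values of $g(\cdot)$ are summed to form the denominator, while the numerator is the value of $g(\cdot)$ for one particular translation $\seq{t}$.
\parinterval The procedure above gives a first sentence-level translation model. Rather than computing $\funp{P}(\seq{t}|\seq{s})$ directly, it turns the problem into the design and computation of $g(\cdot)$. This, however, raises two new problems: \parinterval The procedure above gives a first sentence-level translation model. Rather than computing $\funp{P}(\seq{t}|\seq{s})$ directly, it turns the problem into the design and computation of $g(\cdot)$. This, however, raises two new problems: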
...@@ -1024,13 +1024,13 @@ f(s_u|t_v) &= &\lambda_{t_v}^{-1} \cdot \funp{P}(\seq{s}| \seq{t}) \cdot c_{\mat ...@@ -1024,13 +1024,13 @@ f(s_u|t_v) &= &\lambda_{t_v}^{-1} \cdot \funp{P}(\seq{s}| \seq{t}) \cdot c_{\mat
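\parinterval To satisfy the probability normalization constraint on $f(\cdot|\cdot)$, it is easy to see that $\lambda_{t_v}^{'}$ is: \parinterval To satisfy the probability normalization constraint on $f(\cdot|\cdot)$, it is easy to see that $\lambda_{t_v}^{'}$ is: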
\begin{eqnarray} \begin{eqnarray}
\lambda_{t_v}^{'}&=&\sum\limits_{s_u} c_{\mathbb{E}}(s_u|t_v;\seq{s},\seq{t}) \lambda_{t_v}^{'}&=&\sum\limits_{s'_u} c_{\mathbb{E}}(s'_u|t_v;\seq{s},\seq{t})
\label{eq:5-43} \label{eq:5-43}
\end{eqnarray} \end{eqnarray}
\parinterval The expression for $f(s_u|t_v)$ can therefore be transformed further into: \parinterval The expression for $f(s_u|t_v)$ can therefore be transformed further into:
\begin{eqnarray} \begin{eqnarray}
f(s_u|t_v)&=&\frac{c_{\mathbb{E}}(s_u|t_v;\seq{s},\seq{t})} { \sum\limits_{s_u} c_{\mathbb{E}}(s_u|t_v;\seq{s},\seq{t}) } f(s_u|t_v)&=&\frac{c_{\mathbb{E}}(s_u|t_v;\seq{s},\seq{t})} { \sum\limits_{s'_u} c_{\mathbb{E}}(s'_u|t_v;\seq{s},\seq{t}) }
\label{eq:5-44} \label{eq:5-44}
\end{eqnarray} \end{eqnarray}
......
...@@ -335,13 +335,13 @@ p_0+p_1 & = & 1 \label{eq:6-21} ...@@ -335,13 +335,13 @@ p_0+p_1 & = & 1 \label{eq:6-21}
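\parinterval In addition, $\odot_{i}$ denotes the average of the positions of the source-language words aligned to the target-language word at position $[i]$; if the average is not an integer, it is rounded up. In this example, the fourth cept. (“.”) of the target sentence corresponds to the fifth word of the source sentence, which is written ${\odot}_{4}=5$. \parinterval In addition, $\odot_{i}$ denotes the average of the positions of the source-language words aligned to the target-language word at position $[i]$; if the average is not an integer, it is rounded up. In this example, the fourth cept. (“.”) of the target sentence corresponds to the fifth word of the source sentence, which is written ${\odot}_{4}=5$.
\parinterval With these new concepts, Model 4 revises the distortion of Model 3, mainly by decomposing it into two classes of parameters. For the first word ($\tau_{[i]1}$) of the source-language word list ($\tau_{[i]}$) corresponding to $[i]$, the distortion is computed as: \parinterval With these new concepts, Model 4 revises the distortion of Model 3, mainly by decomposing it into two classes of parameters. For the first word ($\tau_{[i]1}$) of the source-language word list ($\tau_{[i]}$) corresponding to $[i]$, with $[i]>0$, the distortion is computed as: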
\begin{eqnarray} \begin{eqnarray}
\funp{P}(\pi_{[i]1}=j|{\pi}_1^{[i]-1},{\tau}_0^l,{\varphi}_0^l,\seq{t}) & = & d_{1}(j-{\odot}_{i-1}|A(t_{[i-1]}),B(s_j)) \funp{P}(\pi_{[i]1}=j|{\pi}_1^{[i]-1},{\tau}_0^l,{\varphi}_0^l,\seq{t}) & = & d_{1}(j-{\odot}_{i-1}|A(t_{[i-1]}),B(s_j))
\label{eq:6-22} \label{eq:6-22}
\end{eqnarray} \end{eqnarray}
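\noindent Here the variable $\pi_{ik}$ denotes the position of the $k$-th source-language word generated by the $i$-th target-language word. For the other words in the list ($\tau_{[i]}$), namely $\tau_{[i]k}$ with $1 < k \le \varphi_{[i]}$, the distortion is computed as: \noindent Here the variable $\pi_{ik}$ denotes the position of the $k$-th source-language word generated by the $i$-th target-language word. For the other words in the list ($\tau_{[i]}$), namely $\tau_{[i]k}$ with $1 < k \le \varphi_{[i]}$ and $[i]>0$, the distortion is computed as: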
\begin{eqnarray} \begin{eqnarray}
\funp{P}(\pi_{[i]k}=j|{\pi}_{[i]1}^{k-1},\pi_1^{[i]-1},\tau_0^l,\varphi_0^l,\seq{t}) & = & d_{>1}(j-\pi_{[i]k-1}|B(s_j)) \funp{P}(\pi_{[i]k}=j|{\pi}_{[i]1}^{k-1},\pi_1^{[i]-1},\tau_0^l,\varphi_0^l,\seq{t}) & = & d_{>1}(j-\pi_{[i]k-1}|B(s_j))
......
...@@ -652,14 +652,14 @@ dr & = & {\rm{start}}_i-{\rm{end}}_{i-1}-1 ...@@ -652,14 +652,14 @@ dr & = & {\rm{start}}_i-{\rm{end}}_{i-1}-1
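\parinterval The simplest way to find the optimal feature weights is to enumerate every possible assignment of weights, evaluate the translation performance of each, and keep the best one as the tuning result. Feature weights are real-valued, however, so one can quantize them, that is, restrict each weight to values on a fixed grid, for example in steps of 0.01. Even so, enumerating the weights of several features at once is very time-consuming, and the approach remains inefficient as the number of features grows. \parinterval The simplest way to find the optimal feature weights is to enumerate every possible assignment of weights, evaluate the translation performance of each, and keep the best one as the tuning result. Feature weights are real-valued, however, so one can quantize them, that is, restrict each weight to values on a fixed grid, for example in steps of 0.01. Even so, enumerating the weights of several features at once is very time-consuming, and the approach remains inefficient as the number of features grows.
\parinterval This section introduces a more efficient method for tuning feature weights: {\small\bfnew{Minimum Error Rate Training}}\index{Minimum Error Rate Training} (MERT). MERT is a representative piece of work in the development of statistical machine translation and one of the important techniques that originated in the machine translation field\upcite{DBLP:conf/acl/Och03}. MERT assumes that the error of a translation relative to the reference is measurable, so the optimal feature weights can be found by reducing the number of errors. Suppose there is a sample set $S = \{(s_1,\seq{r}_1),...,(s_N,\seq{r}_N)\}$, where $s_i$ is the $i$-th source-language sentence in the sample and $\seq{r}_i$ is the corresponding reference translation. Note that $\seq{r}_i$ may contain several reference translations. $S$ is usually called the {\small\bfnew{tuning set}}\index{Tuning Set}. For each source sentence $s_i$ in $S$, the machine translation model decodes the $n$-best derivations $\hat{\seq{d}}_{i} = \{\hat{d}_{ij}\}$, where $\hat{d}_{ij}$ is the $j$-th best derivation obtained for the source-language sentence $s_i$. $\{\hat{d}_{ij}\}$ is defined as follows: \parinterval This section introduces a more efficient method for tuning feature weights: {\small\bfnew{Minimum Error Rate Training}}\index{Minimum Error Rate Training} (MERT). MERT is a representative piece of work in the development of statistical machine translation and one of the important techniques that originated in the machine translation field\upcite{DBLP:conf/acl/Och03}. MERT assumes that the error of a translation relative to the reference is measurable, so the optimal feature weights can be found by reducing the number of errors. Suppose there is a sample set $S = \{(s^{[1]},\seq{r}^{[1]}),...,(s^{[N]},\seq{r}^{[N]})\}$, where $s^{[i]}$ is the $i$-th source-language sentence in the sample and $\seq{r}^{[i]}$ is the corresponding reference translation. Note that $\seq{r}^{[i]}$ may contain several reference translations. $S$ is usually called the {\small\bfnew{tuning set}}\index{Tuning Set}. For each source sentence $s^{[i]}$ in $S$, the machine translation model decodes the $n$-best derivations $\hat{\seq{d}}^{[i]} = \{\hat{d}_{j}^{[i]}\}$, where $\hat{d}_{j}^{[i]}$ is the $j$-th best derivation obtained for the source-language sentence $s^{[i]}$. $\{\hat{d}_{j}^{[i]}\}$ is defined as follows: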
\begin{eqnarray} \begin{eqnarray}
\{\hat{d}_{ij}\} & = & \arg\max_{\{d_{ij}\}} \sum_{i=1}^{M} \lambda_i \cdot h_i (d,\seq{t},\seq{s}) \{\hat{d}_{j}^{[i]}\} & = & \arg\max_{\{d_{j}^{[i]}\}} \sum_{i=1}^{M} \lambda_i \cdot h_i (d,\seq{t}^{[i]},\seq{s}^{[i]})
\label{eq:7-17} \label{eq:7-17}
\end{eqnarray} \end{eqnarray}
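\parinterval An $n$-best derivation set is obtained for every sample, and the derivation set over the whole data set is written $\hat{\seq{D}} = \{\hat{\seq{d}}_{1},...,\hat{\seq{d}}_{s}\}$. Further, let the set of reference translations of all samples be $\seq{R} = \{\seq{r}_1,...,\seq{r}_N\}$. The goal of MERT is to reduce the error of $\hat{\seq{D}}$ relative to $\seq{R}$, that is, to minimize the error rate by adjusting the weights of the features, $\lambda = \{ \lambda_i \}$. Formally: \parinterval An $n$-best derivation set is obtained for every sample, and the derivation set over the whole data set is written $\hat{\seq{D}} = \{\hat{\seq{d}}^{[1]},...,\hat{\seq{d}}^{[N]}\}$. Further, let the set of reference translations of all samples be $\seq{R} = \{\seq{r}^{[1]},...,\seq{r}^{[N]}\}$. The goal of MERT is to reduce the error of $\hat{\seq{D}}$ relative to $\seq{R}$, that is, to minimize the error rate by adjusting the weights of the features, $\lambda = \{ \lambda_i \}$. Formally: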
\begin{eqnarray} \begin{eqnarray}
\hat{\lambda} & = & \arg\min_{\lambda} \textrm{Error}(\hat{\seq{D}},\seq{R}) \hat{\lambda} & = & \arg\min_{\lambda} \textrm{Error}(\hat{\seq{D}},\seq{R})
\label{eq:7-18} \label{eq:7-18}
......
...@@ -23,8 +23,8 @@ ...@@ -23,8 +23,8 @@
\node [anchor=west] (t4) at ([xshift=0.5em,]t3.east) {ball}; \node [anchor=west] (t4) at ([xshift=0.5em,]t3.east) {ball};
\draw [->] ([xshift=0em]t3.north) .. controls +(north:1em) and +(north:1em) .. ([xshift=-0.2em]t4.north); \draw [->] ([xshift=0em]t3.north) .. controls +(north:1em) and +(north:1em) .. ([xshift=-0.2em]t4.north);
\draw [->] ([xshift=0.2em]t4.north) .. controls +(north:2.5em) and +(north:2.5em) .. ([xshift=0.2em]t2.north); \draw [<-] ([xshift=0.2em]t4.north) .. controls +(north:2.5em) and +(north:2.5em) .. ([xshift=0.2em]t2.north);
\draw [->] ([xshift=0.0em]t1.north) .. controls +(north:2.5em) and +(north:2.5em) .. ([xshift=-0.2em]t2.north); \draw [<-] ([xshift=0.0em]t1.north) .. controls +(north:2.5em) and +(north:2.5em) .. ([xshift=-0.2em]t2.north);
\node [anchor=north west] (cap2) at ([yshift=-0.2em,xshift=-0.5em]t2.south west) {\small{(b) Dependency tree}}; \node [anchor=north west] (cap2) at ([yshift=-0.2em,xshift=-0.5em]t2.south west) {\small{(b) Dependency tree}};
\end{scope} \end{scope}
......
...@@ -532,9 +532,9 @@ span\textrm{[0,4]}&=&\textrm{“猫} \quad \textrm{喜欢} \quad \textrm{吃} \q ...@@ -532,9 +532,9 @@ span\textrm{[0,4]}&=&\textrm{“猫} \quad \textrm{喜欢} \quad \textrm{吃} \q
\begin{itemize} \begin{itemize}
\vspace{0.5em} \vspace{0.5em}
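\item Pruning: in CKY, every span can generate a very large number of derivations (partial translation hypotheses). In theory, the number of derivations grows exponentially with the span size, so storing all of them is clearly infeasible. A common remedy is to keep only the top-$k$ derivations, that is, only the best $k$ partial results for each span. This is beam pruning. In the extreme case $k=1$, the method becomes greedy; \item {\small\bfnew{Pruning}}: in CKY, every span can generate a very large number of derivations (partial translation hypotheses). In theory, the number of derivations grows exponentially with the span size, so storing all of them is clearly infeasible. A common remedy is to keep only the top-$k$ derivations, that is, only the best $k$ partial results for each span. This is beam pruning. In the extreme case $k=1$, the method becomes greedy;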
\vspace{0.5em} \vspace{0.5em}
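\item Generating $n$-best results: generating the $n$-best derivations (translations) is an essential function of statistical machine translation. For example, minimum error rate training needs the best $n$ results to tune the feature weights. In CKY-based methods, the translations of the whole sentence are stored in the structure for the largest span, so a simple way to generate the $n$-best list is to take the $n$ top-ranked results from that structure. Alternatively, one can traverse the derivation space produced by CKY top-down to obtain better $n$-best results\upcite{huang2005better}. \item {\small\bfnew{Generating $n$-best results}}: generating the $n$-best derivations (translations) is an essential function of statistical machine translation. For example, minimum error rate training needs the best $n$ results to tune the feature weights. In CKY-based methods, the translations of the whole sentence are stored in the structure for the largest span, so a simple way to generate the $n$-best list is to take the $n$ top-ranked results from that structure. Alternatively, one can traverse the derivation space produced by CKY top-down to obtain better $n$-best results\upcite{huang2005better}.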
\end{itemize} \end{itemize}
%---------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------
% NEW SUB-SECTION % NEW SUB-SECTION
......
...@@ -8,12 +8,12 @@ ...@@ -8,12 +8,12 @@
\draw[->,thick] (-6,0) -- (5,0); \draw[->,thick] (-6,0) -- (5,0);
\draw[->,thick] (-5,-4) -- (-5,5); \draw[->,thick] (-5,-4) -- (-5,5);
\draw [<-] (-2.5,4) -- (-2,5) node [pos=1,right,inner sep=2pt] {\footnotesize{Answer $\tilde{\mathbi{y}}_i$}}; \draw [<-] (-2.5,4) -- (-2,5) node [pos=1,right,inner sep=2pt] {\footnotesize{Answer ${\mathbi{y}}^{[i]}$}};
{ {
\draw [<-] (-3,-3) -- (-2.5,-2) node [pos=0,left,inner sep=2pt] {\footnotesize{Prediction ${\mathbi{y}}_i$}};} \draw [<-] (-3,-3) -- (-2.5,-2) node [pos=0,left,inner sep=2pt] {\footnotesize{Prediction ${\hat{\mathbi{y}}}^{[i]}$}};}
{ {
\draw [<-] (2.3,1) -- (3.3,2) node [pos=1,right,inner sep=2pt] {\footnotesize{Deviation $|\tilde{\mathbi{y}}_i - {\mathbi{y}}_i|$}}; \draw [<-] (2.3,1) -- (3.3,2) node [pos=1,right,inner sep=2pt] {\footnotesize{Deviation $|{\mathbi{y}}^{[i]} - {\hat{\mathbi{y}}}^{[i]}|$}};
\foreach \x in {-3.8,-3.7,...,3.0}{ \foreach \x in {-3.8,-3.7,...,3.0}{
\pgfmathsetmacro{\p}{- 1/14 * (\x + 4) * (\x + 1) * (\x - 1) * (\x - 3)}; \pgfmathsetmacro{\p}{- 1/14 * (\x + 4) * (\x + 1) * (\x - 1) * (\x - 3)};
\pgfmathsetmacro{\q}{- 1/14 * (4*\x*\x*\x + 3*\x*\x - 26*\x - 1)}; \pgfmathsetmacro{\q}{- 1/14 * (4*\x*\x*\x + 3*\x*\x - 26*\x - 1)};
......