Commit 3077a395 by 曹润柘

Merge branch 'caorunzhe' into 'master'

Caorunzhe

See merge request !503
parents 5bd0c0c3 530a9dda
\tikzstyle{yy} = [circle,minimum height=1cm,text centered,draw=black]
\tikzstyle{yy} = [circle,minimum height=1cm,text centered,draw=black,thick,drop shadow={shadow xshift=0.3em,yshift=0.8em},fill=white]
\begin{tikzpicture}[node distance = 0,scale = 1]
\begin{scope}[xshift=0.2in]
\tikzstyle{every node}=[scale=1]
......@@ -8,22 +8,22 @@
\node (y4)[yy,right of = y3,xshift=1.5cm]{\large$y_4$};
\node (y5)[yy,right of = y4,xshift=1.5cm]{\large$y_5$};
\node (y6)[yy,right of = y5,xshift=1.5cm]{\large$y_6$};
\node [anchor=north,font=\scriptsize] (labela) at ([xshift=1.8em,yshift=-3em]y3.south) {(a) 自回归模型};
\node [anchor=north,font=\scriptsize] (labela) at ([xshift=1.8em,yshift=-2em]y3.south) {(a) 自回归模型};
\draw[->,thick] (y1.north) .. controls ([yshift=2em]y1.north) and ([yshift=2em]y3.north).. (y3.north);
\draw[->,thick] (y1.north) .. controls ([yshift=3em]y1.north) and ([yshift=3em]y4.north).. (y4.north);
\draw[->,thick] (y1.south) .. controls ([yshift=-2em]y1.south) and ([yshift=-2em]y5.south).. (y5.south);
\draw[->,thick] (y1.south) .. controls ([yshift=-3em]y1.south) and ([yshift=-3em]y6.south).. (y6.south);
\draw[->,thick] (y1.north) .. controls ([yshift=1.5em]y1.north) and ([yshift=1.5em]y3.north).. (y3.north);
\draw[->,thick] (y1.north) .. controls ([yshift=2em]y1.north) and ([yshift=2em]y4.north).. (y4.north);
\draw[->,thick] (y1.south) .. controls ([yshift=-1.5em]y1.south) and ([yshift=-1.5em]y5.south).. (y5.south);
\draw[->,thick] (y1.south) .. controls ([yshift=-2em]y1.south) and ([yshift=-2em]y6.south).. (y6.south);
\draw[->,thick] (y2.north) .. controls ([yshift=2em]y2.north) and ([yshift=2em]y4.north).. (y4.north);
\draw[->,thick] (y2.south) .. controls ([yshift=-2em]y2.south) and ([yshift=-2em]y5.south).. (y5.south);
\draw[->,thick] (y2.south) .. controls ([yshift=-3em]y2.south) and ([yshift=-3em]y6.south).. (y6.south);
\draw[->,thick] (y2.north) .. controls ([yshift=1.5em]y2.north) and ([yshift=1.5em]y4.north).. (y4.north);
\draw[->,thick] (y2.south) .. controls ([yshift=-1.5em]y2.south) and ([yshift=-1.6em]y5.south).. (y5.south);
\draw[->,thick] (y2.south) .. controls ([yshift=-2em]y2.south) and ([yshift=-2em]y6.south).. (y6.south);
\draw[->,thick] (y3.south) .. controls ([yshift=-2em]y3.south) and ([yshift=-2em]y5.south).. (y5.south);
\draw[->,thick] (y3.south) .. controls ([yshift=-3em]y3.south) and ([yshift=-3em]y6.south).. (y6.south);
\draw[->,thick] (y3.south) .. controls ([yshift=-1.5em]y3.south) and ([yshift=-1.5em]y5.south).. (y5.south);
\draw[->,thick] (y3.south) .. controls ([yshift=-2em]y3.south) and ([yshift=-2em]y6.south).. (y6.south);
\draw[->,thick] (y4.south) .. controls ([yshift=-2em]y4.south) and ([yshift=-2em]y6.south).. (y6.south);
\draw[->,thick] (y4.south) .. controls ([yshift=-1.5em]y4.south) and ([yshift=-1.5em]y6.south).. (y6.south);
\draw[->,red,very thick](y1.east)to(y2.west);
\draw[->,red,very thick](y2.east)to(y3.west);
......@@ -32,7 +32,7 @@
\draw[->,red,very thick](y5.east)to(y6.west);
\end{scope}
\begin{scope}[yshift=-1.6in]
\begin{scope}[yshift=-1.45in]
\tikzstyle{rec} = [rectangle,minimum width=2.8cm,minimum height=1.5cm,text centered,draw=black,dashed]
\tikzstyle{every node}=[scale=1]
\node (y1)[yy]{\large$y_1$};
......@@ -49,9 +49,9 @@
\draw[->,red,very thick](rec1.east)to(rec2.west);
\draw[->,red,very thick](rec2.east)to(rec3.west);
\draw[->,thick] (rec1.north) .. controls ([yshift=3em]rec1.north) and ([yshift=3em]rec3.north).. (rec3.north);
\draw[->,thick] (rec1.north) .. controls ([yshift=2.5em]rec1.north) and ([yshift=2.5em]rec3.north).. (rec3.north);
\end{scope}
\begin{scope}[xshift=0.3in,yshift=-2.5in]
\begin{scope}[xshift=0.3in,yshift=-2.35in]
\tikzstyle{every node}=[scale=1]
\node (y1)[yy]{\large$y_1$};
\node (y2)[yy,right of = y1,xshift=1.5cm]{\large$y_2$};
......
\begin{tikzpicture}
\tikzstyle{snode} = [draw,inner sep=1pt,minimum width=3em,minimum height=0.5em,rounded corners=1pt,fill=green!30!white]
\tikzstyle{snode} = [draw,inner sep=1pt,minimum width=3em,minimum height=0.5em,rounded corners=1pt,fill=green!20!white]
\tikzstyle{pnode} = [draw,inner sep=1pt,minimum width=1em,minimum height=0.5em,rounded corners=1pt]
\node [anchor=west,snode] (s1) at (0,0) {\tiny{}};
\node [anchor=north west,snode,minimum width=6.3em] (s2) at ([yshift=-0.3em]s1.south west) {\tiny{}};
......@@ -18,7 +18,7 @@
\node [anchor=west,pnode,minimum width=3em] (p6) at ([xshift=0.3em]s6.east) {\tiny{}};
\node [rectangle,inner sep=0.5em,rounded corners=2pt,very thick,dotted,draw=ugreen!80] [fit = (s1) (s6) (p1) (p6)] (box0) {};
\node[rectangle,inner sep=0.5em,rounded corners=1pt,draw,fill=blue!15] (model) at ([xshift=4em]box0.east){{模型}};
\node[rectangle,inner sep=0.5em,rounded corners=1pt,draw,fill=blue!20] (model) at ([xshift=3.5em]box0.east){{模型}};
% big batch
\node [anchor=west,snode] (sbi1) at ([xshift=3em,yshift=6em]model.east) {\tiny{}};
......@@ -49,7 +49,7 @@
\node [rectangle,inner sep=0.5em,rounded corners=2pt,very thick,dotted,draw=ugreen!80] [fit = (sma1) (sma3) (pma1) (pma2)] (box2) {};
% small batch
\node [anchor=west,snode,minimum width=2em] (sma4) at ([xshift=4em,yshift=0em]sma1.east) {\tiny{}};
\node [anchor=west,snode,minimum width=2em] (sma4) at ([xshift=3.5em,yshift=0em]sma1.east) {\tiny{}};
\node [anchor=north west,snode,minimum width=3em] (sma5) at ([yshift=-0.3em]sma4.south west) {\tiny{}};
\node [anchor=north west,snode,minimum width=3em] (sma6) at ([yshift=-0.3em]sma5.south west) {\tiny{}};
......@@ -76,7 +76,7 @@
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]box1.north east) to node [midway,name=final] {} ([xshift=3pt]box1.south east);
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]box3.north east) to node [midway,name=final] {} ([xshift=3pt]box3.south east);
\node [rectangle,inner sep=0.5em,rounded corners=2pt,draw,fill=red!5,font=\scriptsize] at ([yshift=-2em,xshift=10em]sbi1.east) {
\node [rectangle,inner sep=0.5em,rounded corners=2pt,draw,fill=red!5,font=\scriptsize] at ([yshift=-2em,xshift=9em]sbi1.east) {
\begin{tabular}{l}
$m$: 显存 \\
$t$: 时间 \\
......
\definecolor{cocoabrown}{rgb}{0.82, 0.41, 0.12}
\centering
......@@ -10,19 +10,19 @@
\tikzstyle{output} = [rectangle,very thick,rounded corners=3pt,minimum width=1cm,align=center,font=\tiny];
\begin{scope}
\node [system,draw=orange,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\node [system,draw=orange!70,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen!70,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red!70,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\node [output,draw=orange,text=orange,anchor=west] (output3) at ([xshift=0.5cm]model3.east) {输出 $3$};
\node [output,draw=ugreen,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]model2.east) {输出 $2$};
\node [output,draw=red,text=red,anchor=west] (output1) at ([xshift=0.5cm]model1.east) {输出 $1$};
\node [output,draw=orange!70,text=orange,anchor=west] (output3) at ([xshift=0.5cm]model3.east) {输出 $3$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]model2.east) {输出 $2$};
\node [output,draw=red!70,text=red,anchor=west] (output1) at ([xshift=0.5cm]model1.east) {输出 $1$};
\begin{pgfonlayer}{background}
\node [draw,thick,dashed,rounded corners=3pt,inner sep=2pt,fit=(output1) (output2) (output3)] (output) {};
\end{pgfonlayer}
\node [output,draw=ublue,text=ublue,minimum width=1cm,right=1cm of output] (final) {最终\\输出};
\node [output,draw=cocoabrown!70,text=cocoabrown,minimum width=1cm,right=1cm of output] (final) {最终\\输出};
\draw [->,very thick] (model1) to (output1);
\draw [->,very thick] (model2) to (output2);
......@@ -40,17 +40,17 @@
\tikzstyle{output} = [rectangle,very thick,rounded corners=3pt,minimum width=1cm,align=center,font=\tiny];
\begin{scope}
\node [system,draw=orange,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\node [system,draw=orange!70,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen!70,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red!70,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\begin{pgfonlayer}{background}
\node [draw,thick,dashed,inner sep=2pt,fit=(model3) (model2) (model1)] (ensemble) {};
\end{pgfonlayer}
\node [system,draw=ugreen,text=ugreen,right=1cm of ensemble] (model) {模型};
\node [system,draw=ugreen!70,text=ugreen,right=1cm of ensemble] (model) {模型};
\node [output,draw=ublue,text=ublue,minimum width=1cm,anchor=west] (final) at ([xshift=0.5cm]model.east) {最终\\输出};
\node [output,draw=cocoabrown!70,text=cocoabrown,minimum width=1cm,anchor=west] (final) at ([xshift=0.5cm]model.east) {最终\\输出};
\draw [->,very thick] (ensemble) to node [above,pos=0.5,font=\tiny] {融合} (model);
......@@ -68,13 +68,13 @@
\tikzstyle{dot} = [circle,fill=blue!40!white,minimum size=5pt,inner sep=0pt];
\begin{scope}
\node [system,draw=orange,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\node [system,draw=orange!70,text=orange] (model3) at (0,0) {模型 $3$};
\node [system,draw=ugreen!70,text=ugreen,anchor=south] (model2) at ([yshift=0.3cm]model3.north) {模型 $2$};
\node [system,draw=red!70,text=red,anchor=south] (model1) at ([yshift=0.3cm]model2.north) {模型 $1$};
\node [output,draw=orange,text=orange,anchor=west] (output3) at ([xshift=0.5cm]model3.east) {输出 $3$};
\node [output,draw=ugreen,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]model2.east) {输出 $2$};
\node [output,draw=red,text=red,anchor=west] (output1) at ([xshift=0.5cm]model1.east) {输出 $1$};
\node [output,draw=orange!70,text=orange,anchor=west] (output3) at ([xshift=0.5cm]model3.east) {输出 $3$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]model2.east) {输出 $2$};
\node [output,draw=red!70,text=red,anchor=west] (output1) at ([xshift=0.5cm]model1.east) {输出 $1$};
\draw [->,very thick] (model1) to (output1);
\draw [->,very thick] (model2) to (output2);
......@@ -105,7 +105,7 @@
\node [system,draw=purple,text=purple,anchor=west] (model) at ([xshift=5.3cm]output1.east) {模型};
\node [output,draw=ublue,text=ublue,minimum width=1cm,right=1.3cm of lattice] (final) {最终输出};
\node [output,draw=cocoabrown!70,text=cocoabrown,minimum width=1cm,right=1.3cm of lattice] (final) {最终输出};
\draw [->,very thick] (model) |- (final);
\draw [->,very thick] (lattice) -- (final);
......
\definecolor{cocoabrown}{rgb}{0.82, 0.41, 0.12}
\begin{tikzpicture}
\tikzstyle{system} = [rectangle,very thick,minimum width=1.5cm,font=\scriptsize];
\tikzstyle{output} = [rectangle,very thick,rounded corners=3pt,minimum width=1.5cm,align=center,font=\scriptsize];
\begin{scope}[local bounding box=MULTIPLE]
\node [system,draw=orange,text=orange] (engine3) at (0,0) {系统 $n$};
\node [system,draw=ugreen,text=ugreen,anchor=south] (engine2) at ([yshift=0.6cm]engine3.north) {系统 $2$};
\node [system,draw=red,text=red,anchor=south] (engine1) at ([yshift=0.3cm]engine2.north) {系统 $1$};
\node [system,draw=orange!70,text=orange] (engine3) at (0,0) {系统 $n$};
\node [system,draw=ugreen!70,text=ugreen,anchor=south] (engine2) at ([yshift=0.6cm]engine3.north) {系统 $2$};
\node [system,draw=red!70,text=red,anchor=south] (engine1) at ([yshift=0.3cm]engine2.north) {系统 $1$};
\node [output,draw=orange,text=orange,anchor=west] (output3) at ([xshift=0.5cm]engine3.east) {输出 $n$};
\node [output,draw=ugreen,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]engine2.east) {输出 $2$};
\node [output,draw=red,text=red,anchor=west] (output1) at ([xshift=0.5cm]engine1.east) {输出 $1$};
\node [output,draw=orange!70,text=orange,anchor=west] (output3) at ([xshift=0.5cm]engine3.east) {输出 $n$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output2) at ([xshift=0.5cm]engine2.east) {输出 $2$};
\node [output,draw=red!70,text=red,anchor=west] (output1) at ([xshift=0.5cm]engine1.east) {输出 $1$};
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east);
\node [output,draw=ublue,text=ublue,minimum width=1cm,right=0pt of final,minimum height=2.5em] () {最终\\输出};
\node [output,draw=cocoabrown!70,text=cocoabrown,minimum width=1cm,right=0pt of final,minimum height=2.5em] () {最终\\输出};
\draw [->,very thick] (engine1) to (output1);
\draw [->,very thick] (engine2) to (output2);
......@@ -25,15 +25,15 @@
\end{scope}
\begin{scope}[local bounding box=SINGLE]
\node [output,draw=ugreen,text=ugreen,anchor=west] (output3) at ([xshift=4cm]output3.east) {输出 $n$};
\node [output,draw=ugreen,text=ugreen,anchor=west] (output2) at ([xshift=4cm]output2.east) {输出 $2$};
\node [output,draw=ugreen,text=ugreen,anchor=west] (output1) at ([xshift=4cm]output1.east) {输出 $1$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output3) at ([xshift=4cm]output3.east) {输出 $n$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output2) at ([xshift=4cm]output2.east) {输出 $2$};
\node [output,draw=ugreen!70,text=ugreen,anchor=west] (output1) at ([xshift=4cm]output1.east) {输出 $1$};
\node [system,draw=ugreen,text=ugreen,anchor=east,align=center,inner sep=1.9pt] (engine) at ([xshift=-0.5cm]output2.west) {单系统};
\node [system,draw=ugreen!70,text=ugreen,anchor=east,align=center,inner sep=1.9pt] (engine) at ([xshift=-0.5cm]output2.west) {单系统};
\draw [very thick,decorate,decoration={brace}] ([xshift=3pt]output1.north east) to node [midway,name=final] {} ([xshift=3pt]output3.south east);
\node [output,draw=ublue,text=ublue,minimum width=1cm,right=0pt of final,minimum height=2.5em] () {最终\\输出};
\node [output,draw=cocoabrown!70,text=cocoabrown,minimum width=1cm,right=0pt of final,minimum height=2.5em] () {最终\\输出};
\draw [->,very thick] (engine.east) to (output1.west);
\draw [->,very thick] (engine.east) to (output2.west);
......
\tikzstyle{er} = [rectangle,minimum width=2.5cm,minimum height=1.5cm,text centered,draw=black]
\definecolor{taupegray}{rgb}{0.55, 0.52, 0.54}
\definecolor{babyblueeyes}{rgb}{0.63, 0.79, 0.95}
\tikzstyle{er} = [rectangle,minimum width=2.5cm,minimum height=1.5cm,rounded corners,text centered,draw=taupegray,drop shadow]
\begin{tikzpicture}[node distance = 0,scale = 0.75]
\tikzstyle{every node}=[scale=0.75]
\node (encoder)[er,very thick,draw=black!70,fill=ugreen!20]{\Large{编码器}};
\node (decoder_1)[er,very thick,draw=black!70,right of=encoder,xshift=4cm,fill=red!20]{\Large{解码器1}};
\node (decoder_2)[er,very thick,draw=black!70,right of=decoder_1,xshift=4cm,fill=red!20]{\Large{解码器2}};
\node (encoder)[er,very thick,draw=taupegray,fill=ugreen!20]{\Large{编码器}};
\node (decoder_1)[er,very thick,draw=taupegray,right of=encoder,xshift=4cm,fill=red!20]{\Large{解码器1}};
\node (decoder_2)[er,very thick,draw=taupegray,right of=decoder_1,xshift=4cm,fill=red!20]{\Large{解码器2}};
\node (point)[right of=decoder_2,xshift=2.5cm,]{\LARGE{...}};
\node (decoder_3)[er,very thick,draw=black!70,right of=point,xshift=2.5cm,fill=red!20]{\Large{解码器3}};
\node (decoder_3)[er,very thick,draw=taupegray,right of=point,xshift=2.5cm,fill=red!20]{\Large{解码器3}};
\draw [->,very thick,draw=black!70]([xshift=0.2cm]encoder.east) -- ([xshift=-0.2cm]decoder_1.west);
\draw [->,very thick,draw=black!70]([xshift=0.2cm]decoder_1.east) -- ([xshift=-0.2cm]decoder_2.west);
\draw [->,very thick,draw=black!70]([xshift=0.2cm]decoder_2.east) -- ([xshift=-0.1cm]point.west);
......
\begin{tikzpicture}
%左
\node [anchor=west,draw=black,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part1) at (0,0) {\scriptsize{预测模块} \\ \tiny{(RNN/Transsformer)}};
\node [anchor=west,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part1) at (0,0) {\scriptsize{预测模块} \\ \tiny{(RNN/Transsformer)}};
\node [anchor=south] (text) at ([xshift=0.5em,yshift=-3.5em]part1.south) {\scriptsize{源语言句子(编码)}};
\node [anchor=east,draw=black,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part2) at ([xshift=10em]part1.east) {\scriptsize{搜索模块}};
\node [anchor=east,draw=black!70,rounded corners,drop shadow,very thick,minimum width=6em,minimum height=3.5em,fill=blue!15,align=center,text=black] (part2) at ([xshift=10em]part1.east) {\scriptsize{搜索模块}};
\node [anchor=south] (text1) at ([xshift=0.5em,yshift=2.2em]part1.north) {\scriptsize{已经生成的目标语单词}};
\node [anchor=south] (text2) at ([xshift=0.5em,yshift=2.2em]part2.north) {\scriptsize{预测当前位置的单词分布}};
......
\tikzstyle{er} = [rectangle,minimum width=7cm,minimum height=2.5cm,text centered,draw=black]
\definecolor{taupegray}{rgb}{0.55, 0.52, 0.54}
\tikzstyle{er} = [rectangle,minimum width=7cm,minimum height=2.5cm,text centered,draw=taupegray,drop shadow,rounded corners]
\begin{tikzpicture}[node distance = 0,scale = 0.55]
\tikzstyle{every node}=[scale=0.55]
\node (encoder)[er,very thick,minimum width=5cm,draw=black!70,fill=ugreen!20]{\huge{编码器}};
\node (decoder)[er,very thick,right of=encoder,xshift=7.75cm,draw=black!70,fill=red!20]{\huge{解码器}};
\node (decoder_1)[er,very thick,right of=decoder,xshift=8.75cm,draw=black!70,fill=red!20]{\huge{解码器}};
\node (encoder)[er,very thick,minimum width=5.5cm,fill=ugreen!20]{\huge{编码器}};
\node (decoder)[er,very thick,right of=encoder,xshift=7.75cm,fill=red!20]{\huge{解码器}};
\node (decoder_1)[er,very thick,right of=decoder,xshift=8.75cm,fill=red!20]{\huge{解码器}};
\draw [->,very thick,draw=black!70]([xshift=0.2cm]encoder.east) -- ([xshift=-0.2cm]decoder.west);
\draw [->,very thick,draw=black!70]([xshift=0.2cm]decoder.east) -- ([xshift=-0.2cm]decoder_1.west);
\foreach \x in {-1.8cm,-0.9cm,...,1.9cm}
\foreach \x in {-2.2cm,-1.1cm,...,2.2cm}
\draw [->,very thick,draw=black!70]([xshift=\x,yshift=-1cm]encoder.south) -- ([xshift=\x,yshift=-0.2cm]encoder.south);
\node [below of = encoder,xshift=-1.8cm,yshift=-2.95cm,scale=1.2]{[cls]};
\node [below of = encoder,xshift=-0.7cm,yshift=-2.9cm,scale=1.2]{hello};
\node [below of = encoder,xshift=-2.3cm,yshift=-2.95cm,scale=1.2]{\large{[LEN]}};
\node [below of = encoder,xshift=-1.05cm,yshift=-2.9cm,scale=1.2]{\large{hello}};
\node [below of = encoder,xshift=0cm,yshift=-3.05cm,scale=1.2]{,};
\node [below of = encoder,xshift=0.9cm,yshift=-2.9cm,scale=1.2]{world};
\node [below of = encoder,xshift=1.8cm,yshift=-2.9cm,scale=1.2]{!};
\node [below of = encoder,xshift=1.1cm,yshift=-2.9cm,scale=1.2]{\large{world}};
\node [below of = encoder,xshift=2.2cm,yshift=-2.9cm,scale=1.2]{!};
\draw [->,very thick,draw=black!70]([xshift=-1.8cm,yshift=0.2cm]encoder.north) -- ([xshift=-1.8cm,yshift=1cm]encoder.north);
\node [below of = encoder,xshift=-1.8cm,yshift=2.9cm,scale=1.5]{length:6};
\draw [->,very thick,draw=black!70]([xshift=-2.2cm,yshift=0.2cm]encoder.north) -- ([xshift=-2.2cm,yshift=1cm]encoder.north);
\node [below of = encoder,xshift=-2.2cm,yshift=2.9cm,scale=1.5]{4};
\foreach \x in {-2.8cm,-1.7cm,...,2.9cm}
\foreach \x in {-2.7cm,-0.9cm,...,2.8cm}
{\draw [->,very thick,draw=black!70]([xshift=\x,yshift=-1cm]decoder.south) -- ([xshift=\x,yshift=-0.2cm]decoder.south);
\draw [->,very thick,draw=black!70]([xshift=\x,yshift=0.2cm]decoder.north) -- ([xshift=\x,yshift=1cm]decoder.north);}
\node [below of = decoder,xshift=0cm,yshift=-2.9cm,scale=1.2]{{Mask\ Mask\ Mask\ Mask\ Mask\ Mask}};
\node [below of = decoder,xshift=-2.8cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder,xshift=-1.7cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder,xshift=-0.6cm,yshift=2.7cm,scale=1.6]{};
\node [below of = decoder,xshift=0.5cm,yshift=2.7cm,scale=1.6]{};
\node [below of = decoder,xshift=1.6cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder,xshift=2.7cm,yshift=2.75cm,scale=1.6]{};
\foreach \x in {-2.8cm,-1.7cm,...,2.9cm}
\node [below of = decoder,xshift=-2.7cm,yshift=-2.9cm,scale=1.6]{\small{[Mask]}};
\node [below of = decoder,xshift=-0.9cm,yshift=-2.9cm,scale=1.6]{\small{[Mask]}};
\node [below of = decoder,xshift=0.9cm,yshift=-2.9cm,scale=1.6]{\small{[Mask]}};
\node [below of = decoder,xshift=2.7cm,yshift=-2.9cm,scale=1.6]{\small{[Mask]}};
\node [below of = decoder,xshift=-2.7cm,yshift=2.9cm,scale=1.6]{你好};
\node [below of = decoder,xshift=-0.9cm,yshift=2.7cm,scale=1.6]{};
\node [below of = decoder,xshift=0.9cm,yshift=2.9cm,scale=1.6]{你好};
\node [below of = decoder,xshift=2.6cm,yshift=2.9cm,scale=1.6]{};
\foreach \x in {-2.7cm,-0.9cm,...,2.8cm}
{\draw [->,very thick,draw=black!70]([xshift=\x,yshift=-1cm]decoder_1.south) -- ([xshift=\x,yshift=-0.2cm]decoder_1.south);
\draw [->,very thick,draw=black!70]([xshift=\x,yshift=0.2cm]decoder_1.north) -- ([xshift=\x,yshift=1cm]decoder_1.north);}
\node [below of = decoder_1,xshift=-2.8cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder_1,xshift=-1.7cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder_1,xshift=-0.6cm,yshift=2.75cm,scale=1.6]{};
\node [below of = decoder_1,xshift=0.5cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder_1,xshift=1.6cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder_1,xshift=2.7cm,yshift=2.75cm,scale=1.6]{};
\node [below of = decoder_1,xshift=-2.8cm,yshift=-2.9cm,scale=1.2]{Mask};
\node [below of = decoder_1,xshift=-1.7cm,yshift=-2.9cm,scale=1.3]{};
\node [below of = decoder_1,xshift=-0.6cm,yshift=-3cm,scale=1.3]{};
\node [below of = decoder_1,xshift=0.5cm,yshift=-2.9cm,scale=1.2]{Mask};
\node [below of = decoder_1,xshift=1.6cm,yshift=-2.9cm,scale=1.2]{};
\node [below of = decoder_1,xshift=2.7cm,yshift=-3cm,scale=1.3]{};
\node [below of = decoder_1,xshift=-2.7cm,yshift=2.9cm,scale=1.6]{你好};
\node [below of = decoder_1,xshift=-0.9cm,yshift=2.7cm,scale=1.6]{};
\node [below of = decoder_1,xshift=0.9cm,yshift=2.9cm,scale=1.6]{世界};
\node [below of = decoder_1,xshift=2.7cm,yshift=2.9cm,scale=1.6]{};
\node [below of = decoder_1,xshift=-2.7cm,yshift=-2.9cm,scale=1.6]{你好};
\node [below of = decoder_1,xshift=-0.9cm,yshift=-3cm,scale=1.6]{};
\node [below of = decoder_1,xshift=0.9cm,yshift=-2.9cm,scale=1.6]{\small{[Mask]}};
\node [below of = decoder_1,xshift=2.7cm,yshift=-2.9cm,scale=1.6]{};
\end{tikzpicture}
\ No newline at end of file
......@@ -8,7 +8,7 @@
\tikzstyle{tgt} = [minimum height=1.6em,minimum width=5.2em,fill=black!10!yellow!30,font=\footnotesize,drop shadow={shadow xshift=0.15em,shadow yshift=-0.15em,}]
\tikzstyle{p} = [fill=blue!15,minimum width=0.4em,inner sep=0pt]
\node[ rounded corners=3pt, fill=red!20, drop shadow, minimum width=10em,minimum height=4em] (encoder) at (0,0) {Transformer 编码器 };
\node[anchor=west, rounded corners=3pt, fill=red!20, drop shadow, minimum width=14em,minimum height=4em] (decoder) at ([xshift=1cm]encoder.east) {Transformer 解码器};
\node[anchor=west, rounded corners=3pt, fill=red!20, drop shadow, minimum width=14em,minimum height=4em] (decoder) at ([xshift=0.8cm]encoder.east) {Transformer 解码器};
\node[anchor=north,word] (en1) at ([yshift=-1.3em,xshift=-3em]encoder.south) {};
\node[anchor=north,word] (en2) at ([yshift=-1.3em]encoder.south) {};
......
......@@ -4,48 +4,48 @@
%%% outline
%-------------------------------------------------------------------------
\begin{tikzpicture}
\tikzstyle{word} = [draw=ugreen!20,minimum size=1.8em, fill=ugreen!40, font=\scriptsize, rounded corners=1pt]
\node[rounded corners=3pt, fill=red!20, drop shadow, minimum width=12em,minimum height=4em] (encoder) at (0,0) {Transformer 编码器 };
\tikzstyle{word} = [draw=ugreen!20,minimum size=1.5em, fill=ugreen!40, font=\scriptsize, rounded corners=1pt]
\node[rounded corners=3pt, fill=red!20, drop shadow, minimum width=11em,minimum height=4em] (encoder) at (0,0) {Transformer 编码器 };
\node[draw=blue!10,anchor=west, rounded corners=2pt, fill=blue!20,minimum width=2.5cm,minimum height=2em] (attention) at ([xshift=0.8cm]encoder.east) {注意力模块};
\node[anchor=west, rounded corners=3pt, fill=red!20, drop shadow, minimum width=14em,minimum height=4em] (decoder) at ([xshift=0.8cm]attention.east) {Transformer 解码器};
\node[anchor=north,word] (en1) at ([yshift=-1.6em,xshift=-4.8em]encoder.south) {hello};
\node[anchor=north,word] (en2) at ([yshift=-1.6em,xshift=-1.6em]encoder.south) {,};
\node[anchor=north,word] (en3) at ([yshift=-1.6em,xshift=1.6em]encoder.south) {world};
\node[anchor=north,word] (en4) at ([yshift=-1.6em,xshift=4.8em]encoder.south) {!};
\node[anchor=north,word] (de1) at ([yshift=-1.6em,xshift=-5.75em]decoder.south) {1};
\node[anchor=north,word] (de2) at ([yshift=-1.6em,xshift=-3.45em]decoder.south) {2};
\node[anchor=north,word] (de3) at ([yshift=-1.6em,xshift=-1.15em]decoder.south) {3};
\node[anchor=north,word] (de4) at ([yshift=-1.6em,xshift=1.15em]decoder.south) {4};
\node[anchor=north,word] (de5) at ([yshift=-1.6em,xshift=3.45em]decoder.south) {5};
\node[anchor=north,word] (de6) at ([yshift=-1.6em,xshift=5.75em]decoder.south) {6};
\node[anchor=south,word] (out1) at ([yshift=1.6em,xshift=-5.75em]decoder.north) {};
\node[anchor=south,word] (out2) at ([yshift=1.6em,xshift=-3.45em]decoder.north) {};
\node[anchor=south,word] (out3) at ([yshift=1.6em,xshift=-1.15em]decoder.north) {};
\node[anchor=south,word] (out4) at ([yshift=1.6em,xshift=1.15em]decoder.north) {};
\node[anchor=south,word] (out5) at ([yshift=1.6em,xshift=3.45em]decoder.north) {};
\node[anchor=south,word] (out6) at ([yshift=1.6em,xshift=5.75em]decoder.north) {};
\draw[-latex, very thick,ublue] ([yshift=0.1em]en1.north) -- ([xshift=-4.8em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en2.north) -- ([xshift=-1.6em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en3.north) -- ([xshift=1.6em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en4.north) -- ([xshift=4.8em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de1.north) -- ([xshift=-5.75em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de2.north) -- ([xshift=-3.45em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de3.north) -- ([xshift=-1.15em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de4.north) -- ([xshift=1.15em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de5.north) -- ([xshift=3.45em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de6.north) -- ([xshift=5.75em]decoder.south);
\draw[-latex, very thick,ublue] ([xshift=-5.75em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out1.south);
\draw[-latex, very thick,ublue] ([xshift=-3.45em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out2.south);
\draw[-latex, very thick,ublue] ([xshift=-1.15em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out3.south);
\draw[-latex, very thick,ublue] ([xshift=1.15em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out4.south);
\draw[-latex, very thick,ublue] ([xshift=3.45em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out5.south);
\draw[-latex, very thick,ublue] ([xshift=5.75em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out6.south);
\node[anchor=west, rounded corners=3pt, fill=red!20, drop shadow, minimum width=12em,minimum height=4em] (decoder) at ([xshift=0.8cm]attention.east) {Transformer 解码器};
\node[anchor=north,word] (en1) at ([yshift=-1.4em,xshift=-3.6em]encoder.south) {hello};
\node[anchor=north,word] (en2) at ([yshift=-1.4em,xshift=-1.2em]encoder.south) {,};
\node[anchor=north,word] (en3) at ([yshift=-1.4em,xshift=1.2em]encoder.south) {world};
\node[anchor=north,word] (en4) at ([yshift=-1.4em,xshift=3.6em]encoder.south) {!};
\node[anchor=north,word] (de1) at ([yshift=-1.4em,xshift=-5em]decoder.south) {1};
\node[anchor=north,word] (de2) at ([yshift=-1.4em,xshift=-3em]decoder.south) {2};
\node[anchor=north,word] (de3) at ([yshift=-1.4em,xshift=-1 em]decoder.south) {3};
\node[anchor=north,word] (de4) at ([yshift=-1.4em,xshift=1em]decoder.south) {4};
\node[anchor=north,word] (de5) at ([yshift=-1.4em,xshift=3em]decoder.south) {5};
\node[anchor=north,word] (de6) at ([yshift=-1.4em,xshift=5em]decoder.south) {6};
\node[anchor=south,word] (out1) at ([yshift=1.4em,xshift=-5em]decoder.north) {};
\node[anchor=south,word] (out2) at ([yshift=1.4em,xshift=-3em]decoder.north) {};
\node[anchor=south,word] (out3) at ([yshift=1.4em,xshift=-1em]decoder.north) {};
\node[anchor=south,word] (out4) at ([yshift=1.4em,xshift=1em]decoder.north) {};
\node[anchor=south,word] (out5) at ([yshift=1.4em,xshift=3em]decoder.north) {};
\node[anchor=south,word] (out6) at ([yshift=1.4em,xshift=5em]decoder.north) {};
\draw[-latex, very thick,ublue] ([yshift=0.1em]en1.north) -- ([xshift=-3.6em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en2.north) -- ([xshift=-1.2em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en3.north) -- ([xshift=1.2em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]en4.north) -- ([xshift=3.6em,yshift=-0.1em]encoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de1.north) -- ([xshift=-5em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de2.north) -- ([xshift=-3em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de3.north) -- ([xshift=-1em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de4.north) -- ([xshift=1em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de5.north) -- ([xshift=3em]decoder.south);
\draw[-latex, very thick,ublue] ([yshift=0.1em]de6.north) -- ([xshift=5em]decoder.south);
\draw[-latex, very thick,ublue] ([xshift=-5em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out1.south);
\draw[-latex, very thick,ublue] ([xshift=-3em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out2.south);
\draw[-latex, very thick,ublue] ([xshift=-1em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out3.south);
\draw[-latex, very thick,ublue] ([xshift=1em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out4.south);
\draw[-latex, very thick,ublue] ([xshift=3em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out5.south);
\draw[-latex, very thick,ublue] ([xshift=5em,yshift=0.1em]decoder.north) -- ([yshift=-0.1em]out6.south);
\draw[-latex, very thick, ublue] (encoder.east) -- (attention.west);
\draw[-latex, very thick, ublue] (attention.east) -- (decoder.west);
......
\definecolor{beige}{rgb}{0.96, 0.96, 0.86}
\definecolor{aliceblue}{rgb}{0.94, 0.97, 1.0}
\definecolor{brown(traditional)}{rgb}{0.59, 0.29, 0.0}
\definecolor{taupegray}{rgb}{0.55, 0.52, 0.54}
\definecolor{bananamania}{rgb}{0.98, 0.91, 0.71}
\definecolor{beaublue}{rgb}{0.74, 0.83, 0.9}
%%% outline
%-------------------------------------------------------------------------
\begin{tikzpicture}
\tikzstyle{module} = [draw=taupegray,very thick,rounded corners=2pt,inner ysep=8pt,font=\footnotesize,align=center,fill=yellow!15]
\tikzstyle{box} = [draw=taupegray,very thick,rounded corners=4pt,inner ysep=4pt,inner xsep=8pt,fill=ugreen!10,drop shadow];
\tikzstyle{line} = [very thick,-latex];
\node[module, minimum width=8em] (encoder) at (0,0) {编码器组件};
\node[module,anchor=west, minimum width=8em] (decoder) at ([xshift=4em]encoder.east){解码器组件};
\node[module,anchor=west, minimum width=8em] (decoder2) at ([xshift=4em]decoder.east){解码器组件};
\node[module,anchor=north, minimum width=6em,font=\scriptsize,inner ysep=4pt] (deinput) at ([yshift=-2em]decoder2.south){解码端输入};
\node[anchor=south,font=\footnotesize] (mod1) at ([yshift=0.4em]encoder.north){\small\bfnew{编码器模块}};
\node[anchor=south,font=\footnotesize] (mod2) at ([yshift=0.4em]decoder.north){\small\bfnew{重排序模块}};
\node[anchor=south,font=\footnotesize] (mod3) at ([yshift=0.4em]decoder2.north){\small\bfnew{解码端}};
\begin{pgfonlayer}{background}
{
\node[box][fit=(encoder)(mod1)] (box1) {};
\node[box][fit=(decoder)(mod2)] (box2) {};
\node[box][fit=(decoder2)(mod3)] (box3) {};
}
\end{pgfonlayer}
\node[anchor=north,font=\scriptsize,align=center] (w1) at ([yshift=-2em]encoder.south){\scriptsize\bfnew{There exist different} \\ \scriptsize\bfnew{opinions on this question}};
\node[anchor=north,font=\scriptsize,align=center] (w2) at ([yshift=-2em]decoder.south){\scriptsize\bfnew{There exist different} \\ \scriptsize\bfnew{opinions on this question}};
\node[anchor=north,font=\scriptsize,text=gray] (w3) at ([yshift=0.6em]w2.south){\scriptsize\bfnew{(copy source sentence)}};
\node[anchor=south,font=\scriptsize,align=center] (w4) at ([yshift=1.6em]box2.north){\scriptsize\bfnew{on this question} \\ \scriptsize\bfnew{There exist different opinions}};
\node[anchor=south,font=\scriptsize,align=center] (w5) at ([yshift=1.6em]box3.north){\tiny\bfnew{\ 这个 \ 问题 \ 存在 \ 不同的 \ 看法}};
\node[font=\tiny] at ([xshift=-0.8em,yshift=-0.6em]encoder.east) {$N\times$};
\node[font=\tiny] at ([xshift=-0.8em,yshift=-0.6em]decoder.east) {$1\times$};
\node[font=\tiny] at ([xshift=-1em,yshift=-0.6em]decoder2.east) {$(N-1)\times$};
\draw[line] (w1.north) -- (box1.south);
\draw[line] (w2.north) -- (box2.south);
\draw[line] (box2.north) -- (w4.south);
\draw[line] (box3.north) -- (w5.south);
\draw[line] (deinput.north) -- (box3.south);
\draw[line] (box1.east) -- (box2.west);
\draw[line] (box2.east) -- (box3.west);
\draw[line,rounded corners=2pt,dotted,brown(traditional)] (w1.south) -- ([yshift=-1.6em]w1.south) -- ([yshift=-2.3em]deinput.south) -- (deinput.south);
\draw[line,rounded corners=2pt,dotted,brown(traditional)] (w4.east) -- ([xshift=0.9em]w4.east) -- ([xshift=-3em]deinput.west) -- (deinput.west);
\end{tikzpicture}
......@@ -2,13 +2,14 @@
\definecolor{Goldenrod}{rgb}{0.85, 0.65, 0.13}
\definecolor{Cerulean}{rgb}{0, 0.48, 0.65}
\definecolor{Gray}{rgb}{0.5, 0.5, 0.5}
\tikzstyle{emb} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.85cm,text centered,draw=black!70,fill=Melon!25]
\tikzstyle{sa} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1cm,text centered,draw=black!70,fill=Goldenrod!25]
\tikzstyle{edsa} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1.5cm,text centered,align=center,draw=black!70,fill=Goldenrod!25]
\tikzstyle{an} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=ugreen!20]
\tikzstyle{ff} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1cm,text centered,align=center,draw=black!70,fill=Cerulean!10]
\tikzstyle{linear} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=Gray!20]
\tikzstyle{softmax} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=Melon!40]
\definecolor{aliceblue}{rgb}{0.94, 0.97, 1.0}
\tikzstyle{emb} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.85cm,text centered,draw=black!70,fill=red!15]
\tikzstyle{sa} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1cm,text centered,draw=black!70,fill=yellow!20]
\tikzstyle{edsa} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1.5cm,text centered,align=center,draw=black!70,fill=yellow!20]
\tikzstyle{an} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=aliceblue]
\tikzstyle{ff} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=1cm,text centered,align=center,draw=black!70,fill=orange!20]
\tikzstyle{linear} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=green!20]
\tikzstyle{softmax} = [rectangle,very thick,rounded corners,minimum width=3cm,minimum height=0.7cm,text centered,draw=black!70,fill=blue!20]
\begin{tikzpicture}[node distance = 0,scale = 0.7]
\tikzstyle{every node}=[scale=0.7]
%left
......@@ -23,7 +24,7 @@
\node(left_Add_bottom)[an,above of = left_Self,yshift=1.1cm]{\textbf{Add$\&\&$Norm}};
\node(left_Feed)[ff,above of = left_Add_bottom,yshift=1.2cm]{\textbf{Feed}\\\textbf{Forward}};
\node(left_Add_top)[an,above of = left_Feed,yshift=1.1cm]{\textbf{Add$\&\&$Norm}};
\node(left_text_bottom)[below of = left_Emb,xshift=0cm,yshift=-1.2cm,scale=1]{\small\sffamily\bfseries{爱我的}};
\node(left_text_bottom)[below of = left_Emb,xshift=0cm,yshift=-1.2cm,scale=1]{\small\sffamily\bfseries{\quad\quad 我的\quad }};
\draw [->,very thick,draw=black!70]([yshift=-0.5cm]left_Emb.south)--(left_Emb.south);
\draw [->,very thick,draw=black!70](left_Emb.north)--(left_cir.south);
\draw [->,very thick,draw=black!70](left_cir.north)--(left_Self.south);
......
......@@ -17,7 +17,7 @@
\begin{tabular}{C{.20\textwidth}C{.20\textwidth}C{.20\textwidth}C{.20\textwidth}}
\setlength{\tabcolsep}{0pt}
\subfigure [\footnotesize{Self-attention}] {
\begin{tabular}{cc}
\begin{tabular}{ccC{1em}}
\setlength{\tabcolsep}{0pt}
~
&
......@@ -57,24 +57,27 @@
0.5531 & 0.0332 & 0.0296 & 0.0552 & 0.0389 & 0.0000 \\
\end{tabular}
&
\end{tabular}
}
&
\subfigure [\footnotesize{Encoder-decoder attention}] {
\setlength{\tabcolsep}{0pt}
\begin{tabular}{cc}
\begin{tabular}{ccC{1em}}
\setlength{\tabcolsep}{0pt}
~
&
\begin{tikzpicture}
\begin{scope}
\node [inner sep=1.5pt] (w1) at (0,0) {\small{$1$} };
\foreach \x/\y/\z in {2/1/$2$, 3/2/$3$, 4/3/$4$, 5/4/$5$, 6/5/$6$}
{
\node [inner sep=1.5pt,anchor=south west] (w\x) at ([xshift=1.15em]w\y.south west) {\small{\z} };
}
\end{scope}
\end{tikzpicture}
\\
\renewcommand\arraystretch{1}
......@@ -102,6 +105,7 @@
0.3603 & 0.3324 & 0.4163 & 0.2022 & 0.0658 & 0.0000 \\
\end{tabular}
&
\end{tabular}
}
\end{tabular}
......
......@@ -521,7 +521,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4}
\begin{itemize}
\vspace{0.5em}
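\item Hint-based layer-wise knowledge distillation\upcite{Li2019HintBasedTF}. Because autoregressive and non-autoregressive models have similar architectures, the autoregressive model, which translates better, can serve as the ``teacher'' and provide supervision signals that let the non-autoregressive model learn its distribution block by block. The researchers observed two interesting phenomena: 1) in the non-autoregressive model, the hidden states at positions that output repeated words are very similar; 2) the attention distribution of the non-autoregressive model is more diffuse than that of the autoregressive model. These observations motivated using the hidden states of the autoregressive model to guide the non-autoregressive model. The distance between the hidden states of the two models and the {\red{KL divergence}} between their attention matrices are added as extra loss terms to help train the non-autoregressive model.
\item Hint-based layer-wise knowledge distillation\upcite{Li2019HintBasedTF}. Because autoregressive and non-autoregressive models have similar architectures, the autoregressive model, which translates better, can serve as the ``teacher'' and provide supervision signals that let the non-autoregressive model learn its distribution block by block. The researchers observed two interesting phenomena: 1) in the non-autoregressive model, the hidden states at positions that output repeated words are very similar; 2) the attention distribution of the non-autoregressive model is more diffuse than that of the autoregressive model. These observations motivated using the hidden states of the autoregressive model to guide the non-autoregressive model. The distance between the hidden states of the two models and the KL divergence\footnote{KL divergence is also known as relative entropy.} between their attention matrices are added as extra loss terms to help train the non-autoregressive model.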
\vspace{0.5em}
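\item Imitation-learning-based methods\upcite{Wei2019ImitationLF}. This line of work holds that a non-autoregressive model can acquire knowledge from a stronger autoregressive model. Imitation learning is a concept from reinforcement learning: the learner acquires correct behavior from an expert, much as in supervised learning\upcite{Ho2016ModelFreeIL,Ho2016GenerativeAI,Duan2017OneShotIL}. The difference is that imitation learning does not copy the expert's behavior verbatim but learns why the expert acts that way. In other words, what is learned is not a mirror image of the expert but the expert's behavior distribution. Here the autoregressive model serves as the expert, and the non-autoregressive model learns its decoding states at different time steps and in different layers. The imitation learning loss and the cross-entropy loss are then combined in a weighted sum as the final training objective.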
\vspace{0.5em}
......@@ -564,7 +564,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4}
%----------------------------------------------------------------------
\begin{figure}[htp]
\centering
% \input{}
\input{./Chapter14/Figures/figure-reranking}
\caption{A non-autoregressive model with a reordering module}
\label{fig:14-22}
\end{figure}
......@@ -578,7 +578,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4}
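\parinterval Generating the whole sequence in parallel in a single pass makes the dependencies between words hard to capture, which limits what this class of methods can do. Moreover, once a wrong target word has been generated, it cannot be revised. To address these problems, iterative generation can be used\upcite{Lee2018DeterministicNN,Ghazvininejad2019MaskPredictPD,Kasai2020NonAutoregressiveMT}. Instead of producing the final target sentence in one pass, this approach feeds the decoded text back into the decoder and refines the previously generated words in each iteration, so it can be viewed as an autoregressive model at the sentence level. The benefit is that each iteration can use the partial translation already produced to guide the generation of the remaining parts.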
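\parinterval Figure~\ref{fig:14-18} shows a simple example of this approach. It has one encoder and $N$ decoders. The model first predicts the length of the target sentence. The input $\seq{x}$ is then copied according to this length to form $\seq{x'}$, which is fed to the first decoder to produce $\seq{y'}$ as the output of the first iteration. Next, $\seq{y'}$ is fed to the second decoder, which outputs $\seq{y''}$, and so on. When should the iteration stop? A simple option is to fix the number of iterations in advance, which makes it possible to trade off translation quality against efficiency directly. Another option, called the ``adaptive'' method, stops automatically based on how much the current output differs from the previous one, for example using the {\red Jaccard similarity coefficient} as the change function.
\parinterval Figure~\ref{fig:14-18} shows a simple example of this approach. It has one encoder and $N$ decoders. The model first predicts the length of the target sentence. The input $\seq{x}$ is then copied according to this length to form $\seq{x'}$, which is fed to the first decoder to produce $\seq{y'}$ as the output of the first iteration. Next, $\seq{y'}$ is fed to the second decoder, which outputs $\seq{y''}$, and so on. When should the iteration stop? A simple option is to fix the number of iterations in advance, which makes it possible to trade off translation quality against efficiency directly. Another option, called the ``adaptive'' method, stops automatically based on how much the current output differs from the previous one, for example using the Jaccard similarity coefficient\footnote{The Jaccard similarity coefficient measures the similarity and difference between finite sample sets; the larger the coefficient, the more similar the sets.} as the change function.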
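\parinterval As a hedged illustration of the adaptive criterion (the symbols $A$, $B$ and $\epsilon$ are introduced here for illustration and are not taken from the original text), let $A$ and $B$ be the sets of words in the current and the previous output. The Jaccard similarity coefficient is
\begin{eqnarray}
J(A,B) &=& \frac{|A \cap B|}{|A \cup B|} \nonumber
\end{eqnarray}
\noindent and iteration can stop once the change $1-J(A,B)$ falls below a chosen threshold $\epsilon$. For example, if the previous output contains the words $\{a,b,c,d\}$ and the current output contains $\{a,b,c,e\}$, then $|A \cap B|=3$, $|A \cup B|=5$ and $J(A,B)=0.6$, so the change is $0.4$.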
%----------------------------------------------
\begin{figure}[htp]
......@@ -593,7 +593,7 @@ b &=& \omega_{\textrm{high}}\cdot |\seq{x}| \label{eq:14-4}
\parinterval Another method borrows the idea of BERT\upcite{devlin2019bert} and proposes a new decoding method, Mask-Predict\upcite{Ghazvininejad2019MaskPredictPD}.
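\parinterval Analogous to BERT's [CLS], this method prepends a special symbol [LENGTH] to the source sentence as input, which is used to predict the target length $n$. Then the special symbol [MASK] (similar in meaning to [MASK] in BERT) is copied $n$ times as the decoder input, and all target words are generated non-autoregressively. The translation produced this way may be poor, so the least certain words from the first pass (those with low generation probability) are ``erased'' and predicted again, conditioned on the remaining target words and the source sentence. This is repeated until a stopping condition is met. Figure~\ref{fig:14-19} gives an example.
\parinterval Analogous to BERT's [CLS], this method prepends a special symbol [LEN] to the source sentence as input, which is used to predict the target length $n$. Then the special symbol [Mask] (similar in meaning to [Mask] in BERT) is copied $n$ times as the decoder input, and all target words are generated non-autoregressively. The translation produced this way may be poor, so the least certain words from the first pass (those with low generation probability) are ``erased'' and predicted again, conditioned on the remaining target words and the source sentence. This is repeated until a stopping condition is met. Figure~\ref{fig:14-19} gives an example.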
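\parinterval How many words to erase in each round is a design choice that the description above leaves open. One simple option, sketched here as an assumption (the symbols $T$, $t$ and $m_t$ are introduced for illustration), is a linear decay: with $T$ iterations in total, the number of words erased before iteration $t$ is
\begin{eqnarray}
m_t &=& \left\lfloor n \cdot \frac{T-t}{T} \right\rfloor \nonumber
\end{eqnarray}
\noindent where $t=0$ is the initial pass in which all $n$ positions are masked. With $n=10$ and $T=5$, this gives $m_1=8$, $m_2=6$, $m_3=4$ and $m_4=2$, so later iterations revise fewer and fewer words.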
%----------------------------------------------
\begin{figure}[htp]
......