Project: 单韦乔 / Toy-MT-Introduction
Commit 714d515e, authored Dec 19, 2019 by xiaotong
Message: update
Parent: e46c4b29
Showing 1 changed file with 31 additions and 31 deletions
Section06-Neural-Machine-Translation/section06.tex (+31 -31), viewed at 714d515e
@@ -1216,14 +1216,14 @@ Implicit-structure assumptions for NLP problems & No implicit-structure assumptions, end-to-end learning \\
 \node [anchor=north] (w) at ([yshift=3pt]one.south) {\scriptsize{\color{ugreen} you}};
 \node [anchor=north west] (words) at ([xshift=10pt]one.north east) {\scriptsize{$\begin{matrix} \langle\textrm{eos}\rangle \\ \langle\textrm{sos}\rangle \\ \textrm{Do} \\ \vdots \\ \textrm{know} \\ \textrm{you} \\ \textrm{?} \\ \textrm{have} \end{matrix}$}};
 \node [anchor=north west] (mat) at ([xshift=-6pt]words.north east) {\scriptsize{$
 \begin{bmatrix}
 .1 & -4 & \cdots & 2 \\
 5 & 2 & \cdots & .2 \\
 2 & .1 & \cdots & .3 \\
 \vdots & \vdots & \ddots & \vdots \\
 0 & .8 & \cdots & 4 \\
 -1 & -2 & \cdots & -3 \\
 .7 & .5 & \cdots & 3 \\
 -2 & .3 & \cdots & .1
 \end{bmatrix}$
 }};
@@ -1329,13 +1329,13 @@ Implicit-structure assumptions for NLP problems & No implicit-structure assumptions, end-to-end learning \\
 \begin{scope}
 \coordinate (start) at (8.5\base,0.1\base);
 \node [anchor=center,minimum width=5.7em,minimum height=1.3em,draw,rounded corners=0.3em] (hidden) at (start) {};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!20] (cell01) at ([xshift=0.2em]hidden.west) {\scriptsize{.2}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell02) at (cell01.east) {\scriptsize{-1}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=white] (cell03) at (cell02.east) {\scriptsize{$\cdots$}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!50] (cell04) at (cell03.east) {\scriptsize{5}};
 \node [anchor=south,minimum width=10.9em,minimum height=1.3em,draw,rounded corners=0.3em] (target) at ([yshift=1.5em]hidden.north) {};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell11) at ([xshift=0.2em]target.west) {\scriptsize{-2}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell12) at (cell11.east) {\scriptsize{-1}};
@@ -1365,7 +1365,7 @@ Implicit-structure assumptions for NLP problems & No implicit-structure assumptions, end-to-end learning \\
 \filldraw [fill=red!20,draw=white] (target.south west) -- (target.south east) -- ([xshift=-0.2em,yshift=0.1em]hidden.north east) -- ([xshift=0.2em,yshift=0.1em]hidden.north west);
 \draw [->,thick] ([xshift=0.2em,yshift=0.1em]hidden.north west) -- (target.south west);
 \draw [->,thick] ([xshift=-0.2em,yshift=0.1em]hidden.north east) -- (target.south east);
 \node [rounded corners=0.3em] (softmax) at ([yshift=1.25em]target.north) {\scriptsize{$p(\hat{s}_i)=\frac{e^{\hat{s}_i}}{\sum_j e^{\hat{s}_j}}$}};
 \begin{pgfonlayer}{background}
 \filldraw [fill=blue!20,draw=white] ([yshift=0.1em]cell11.north west) {[rounded corners=0.3em] -- (softmax.west)} -- (label1.south west) -- (label8.south east) {[rounded corners=0.3em] -- (softmax.east)} -- ([yshift=0.1em]cell18.north east) -- ([yshift=0.1em]cell11.north west);
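Note on the hunk above: the `softmax` node labels the figure with p(ŝ_i) = e^{ŝ_i} / Σ_j e^{ŝ_j}. The following is a minimal Python sketch of that formula for readers who want to check the arithmetic. The function name `softmax` and the example scores are illustrative assumptions and do not come from the slides; subtracting the maximum score is a common numerical-stability step that leaves the result unchanged.

```python
import math


def softmax(scores):
    """Return p_i = exp(s_i) / sum_j exp(s_j) for a list of scores."""
    # Shifting every score by the maximum keeps the ratios identical
    # while preventing overflow in exp() for large scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores, chosen only for illustration.
probs = softmax([-2.0, -1.0, 5.0])
print(probs)
```

The largest score receives almost all of the probability mass, which is why the output word is normally read off as the argmax of this distribution.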
@@ -3002,9 +3002,9 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \item Multiplying the $\textrm{P}(y_j|\textbf{y}_{<j},\textbf{x})$ terms gives long sentences a very low probability
 \item The model itself does not account for how much each source-language word has been used; for example, a word may be translated many ``times''
 \end{itemize}
-\item<2-> Therefore, decoding combines other features with $\textrm{P}(\textbf{y}|\textbf{x})$ to form the model score $score(\textbf{y},\textbf{x})$; $score(\textbf{y},\textbf{x})$ is also the ranking criterion for beam search
+\item<2-> Therefore, decoding combines other features with $\textrm{P}(\textbf{y}|\textbf{x})$ to form the model score $\textrm{score}(\textbf{y},\textbf{x})$; $\textrm{score}(\textbf{y},\textbf{x})$ is also the ranking criterion for beam search
 \begin{eqnarray}
-score(\textbf{y},\textbf{x}) & = & \textrm{P}(\textbf{y}|\textbf{x})/\textrm{lp}(\textbf{y}) + \textrm{cp}(\textbf{y},\textbf{x}) \nonumber \\
+\textrm{score}(\textbf{y},\textbf{x}) & = & \textrm{P}(\textbf{y}|\textbf{x})/\textrm{lp}(\textbf{y}) + \textrm{cp}(\textbf{y},\textbf{x}) \nonumber \\
 \textrm{lp}(\textbf{y}) & = & \frac{(5 + |\textbf{y}|)^\alpha}{(5 + 1)^\alpha} \nonumber \\
 \textrm{cp}(\textbf{y},\textbf{x}) & = & \beta \cdot \sum\nolimits_{i=1}^{|\textbf{x}|} \log(\min(\sum\nolimits_{j}^{|\textbf{y}|} a_{ij}, 1))) \nonumber
 \end{eqnarray}
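Note on the hunk above: the three formulas define a length penalty lp, a coverage penalty cp, and the combined score used to rank beam search hypotheses. The following is a minimal Python sketch of them under stated assumptions. The function names, the argument layout (`attention[i][j]` for a_ij), and the example numbers are all hypothetical. The slide writes P(y|x) as the first term; the sketch takes a log-probability there, as in the formulation of Wu et al. (2016) that these formulas appear to follow, and that reading is an assumption.

```python
import math


def length_penalty(target_len, alpha):
    """lp(y) = (5 + |y|)^alpha / (5 + 1)^alpha."""
    return (5 + target_len) ** alpha / (5 + 1) ** alpha


def coverage_penalty(attention, beta):
    """cp(y, x) = beta * sum_i log(min(sum_j a_ij, 1)).

    attention[i][j] is a_ij: the weight target position j places on
    source word i. Assumes every source word receives some attention
    (row sum > 0), which holds for softmax attention; a row summing to
    zero would make log() fail.
    """
    return beta * sum(math.log(min(sum(row), 1.0)) for row in attention)


def score(log_prob, target_len, attention, alpha, beta):
    """Combined score for ranking hypotheses.

    Assumption: log_prob is log P(y|x); the slide writes P(y|x) here.
    """
    return log_prob / length_penalty(target_len, alpha) + coverage_penalty(attention, beta)


# Hypothetical two-word source and two-word hypothesis.
# Each column sums to 1 over the source words.
attention = [[0.9, 0.3],
             [0.1, 0.7]]
print(score(-1.0, 2, attention, alpha=0.6, beta=0.2))
```

With alpha = 0 the length penalty is 1 for every length, so the score falls back to the unnormalized log-probability. A source word whose total attention reaches 1 contributes nothing to cp, so only under-covered source words are penalized.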
@@ -3077,7 +3077,7 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \draw [-latex'] (enc11) to (enc12);
 \draw [-latex'] (enc12) to (enc13);
 \draw [-latex'] (enc13) to (enc14);
 \draw [-latex'] (enc24) to (enc23);
 \draw [-latex'] (enc23) to (enc22);
 \draw [-latex'] (enc22) to (enc21);
@@ -3105,7 +3105,7 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \draw [-latex'] ([xshift=-2pt]enc11.north) to [out=150,in=-150] ([xshift=-2pt]enc31.south);
 \draw [-latex'] ([xshift=-2pt]enc12.north) to [out=150,in=-150] ([xshift=-2pt]enc32.south);
 \draw [-latex'] ([xshift=-2pt]enc14.north) to [out=150,in=-150] ([xshift=-2pt]enc34.south);
 \draw [-latex'] (enc22) to (enc32);
 \draw [-latex'] (enc21) to (enc31);
 \draw [-latex'] (enc24) to (enc34);
@@ -3113,19 +3113,19 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \draw [-latex'] ([xshift=-2pt]enc31.north) to [out=150,in=-150] ([xshift=-2pt]enc51.south);
 \draw [-latex'] ([xshift=-2pt]enc32.north) to [out=150,in=-150] ([xshift=-2pt]enc52.south);
 \draw [-latex'] ([xshift=-2pt]enc34.north) to [out=150,in=-150] ([xshift=-2pt]enc54.south);
 \draw [-latex'] (enc31) to (enc41);
 \draw [-latex'] (enc32) to (enc42);
 \draw [-latex'] (enc34) to (enc44);
 \draw [-latex'] (enc41) to (enc51);
 \draw [-latex'] (enc42) to (enc52);
 \draw [-latex'] (enc44) to (enc54);
 \draw [-latex'] (enc51) to (enc61);
 \draw [-latex'] (enc52) to (enc62);
 \draw [-latex'] (enc54) to (enc64);
 \draw [-latex'] (enc61) to ([yshift=\base]enc61.north);
 \draw [-latex'] (enc62) to ([yshift=\base]enc62.north);
 \draw [-latex'] (enc64) to ([yshift=\base]enc64.north);
@@ -3138,32 +3138,32 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \node [rnnnode,fill=green!20,right=\base of decemb1] (decemb2) {};
 \node [rnnnode,draw=white,fill=white,right=\base of decemb2] (decemb3) {$\cdots$};
 \node [rnnnode,fill=green!20,right=\base of decemb3] (decemb4) {};
 \node [rnnnode,above=\base of decemb1] (dec11) {};
 \node [rnnnode,above=\base of decemb2] (dec12) {};
 \node [rnnnode,draw=white,fill=white,above=\base of decemb3] (dec13) {$\cdots$};
 \node [rnnnode,above=\base of decemb4] (dec14) {};
 \node [rnnnode,above=\base of dec11] (dec21) {};
 \node [rnnnode,above=\base of dec12] (dec22) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec13] (dec23) {$\cdots$};
 \node [rnnnode,above=\base of dec14] (dec24) {};
 \node [rnnnode,above=\base of dec21] (dec31) {};
 \node [rnnnode,above=\base of dec22] (dec32) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec23] (dec33) {$\cdots$};
 \node [rnnnode,above=\base of dec24] (dec34) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec31] (dec41) {$\cdots$};
 \node [rnnnode,draw=white,fill=white,above=\base of dec32] (dec42) {$\cdots$};
 \node [rnnnode,draw=white,fill=white,above=\base of dec33] (dec43) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec34] (dec44) {$\cdots$};
 \node [rnnnode,above=\base of dec41] (dec51) {};
 \node [rnnnode,above=\base of dec42] (dec52) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec43] (dec53) {$\cdots$};
 \node [rnnnode,above=\base of dec44] (dec54) {};
 \node [rnnnode,fill=blue!20,above=\base of dec51] (softmax1) {};
 \node [rnnnode,fill=blue!20,above=\base of dec52] (softmax2) {};
 \node [rnnnode,draw=white,fill=white,above=\base of dec53] (softmax3) {$\cdots$};
@@ -3173,7 +3173,7 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \node [wnode,below=0pt of decemb1] (decinword1) {SOS};
 \node [wnode,below=0pt of decemb2] (decinword2) {Have};
 \node [wnode,below=0pt of decemb4] (decinword4) {?};
 \node [wnode,above=0pt of softmax1] (decoutword1) {Have};
 \ExtractX{$(softmax2.north)$}
 \ExtractY{$(decoutword1.base)$}
@@ -3186,15 +3186,15 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \draw [-latex'] (dec11) to (dec12);
 \draw [-latex'] (dec12) to (dec13);
 \draw [-latex'] (dec13) to (dec14);
 \draw [-latex'] (dec21) to (dec22);
 \draw [-latex'] (dec22) to (dec23);
 \draw [-latex'] (dec23) to (dec24);
 \draw [-latex'] (dec31) to (dec32);
 \draw [-latex'] (dec32) to (dec33);
 \draw [-latex'] (dec33) to (dec34);
 \draw [-latex'] (dec51) to (dec52);
 \draw [-latex'] (dec52) to (dec53);
 \draw [-latex'] (dec53) to (dec54);
@@ -3202,7 +3202,7 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
 \draw [-latex'] (decemb1) to (dec11);
 \draw [-latex'] (decemb2) to (dec12);
 \draw [-latex'] (decemb4) to (dec14);
 \foreach \cur [count=\prev from 1] in {2,...,5}
 {
 \draw [-latex'] (dec\prev1) to (dec\cur1);
@@ -4696,7 +4696,7 @@ PE_{(pos,2i+1)} = cos(pos/10000^{2i/d_{model}})
 \item Because it is autoregressive, the Transformer cannot be parallelized at inference time, which makes inference very slow!
-\item<2-> Speed-up techniques: Cache (caching variables that would otherwise be recomputed), Average Attention Network, Share Attention Network
+\item<2-> Speed-up techniques: low precision, Cache (caching variables that would otherwise be recomputed), Average Attention Network, Share Attention Network
 \end{itemize}
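Note on the hunk above: the slide lists "Cache" as a way to speed up autoregressive inference but does not say which variables are cached. One common reading is that the keys and values of already-decoded positions are stored, so each step projects only the newest position. The following Python sketch illustrates that idea under this assumption. The names `attend` and `DecoderCache`, the identity projections, and the toy states are hypothetical and are not taken from the slides or from any library.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention of one query over a list of keys/values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # shift for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    return [sum(w / z * v[i] for w, v in zip(weights, values)) for i in range(dim)]


class DecoderCache:
    """Stores keys and values of decoded positions so each is projected once."""

    def __init__(self, project_key, project_value):
        self.project_key = project_key
        self.project_value = project_value
        self.keys = []
        self.values = []
        self.projections = 0  # number of positions projected so far

    def step(self, new_state, query):
        # Project only the newest position; earlier ones are read from the cache.
        self.keys.append(self.project_key(new_state))
        self.values.append(self.project_value(new_state))
        self.projections += 1
        return attend(query, self.keys, self.values)


# Toy run with identity projections (hypothetical decoder states).
identity = lambda v: list(v)
cache = DecoderCache(identity, identity)
for state in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:
    output = cache.step(state, query=state)
print(cache.projections)
```

Without the cache, step t would re-project all t positions, so n steps cost 1 + 2 + ... + n projections; with the cache the count is n. The attention itself still looks at every earlier position, so the cache removes repeated work but does not make decoding parallel.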