NiuTrans / Toy-MT-Introduction · Commits

Commit 1485b49e
Authored Jan 04, 2020 by Lee
Parent: cbd21156

    Update RNN input & output figures

Showing 1 changed file with 52 additions and 32 deletions:
Section06-Neural-Machine-Translation/section06.tex (+52 / -32)
...
...
@@ -1134,17 +1134,17 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 %%%------------------------------------------------------------------------------------------------------------
 %%% 词嵌入
 \begin{frame}{模块1:词嵌入层}
 \begin{itemize}
-\item 词嵌入的作用是把离散化的单词表示转换为连续空间上的分布式表示
-\begin{itemize}
-\item 把输入的词转换成唯一对应的词表大小的0-1向量
-\item 根据0-1向量,从词嵌入矩阵中取出对应的词嵌入$e()$
-\item 取出的词嵌入$e()$作为循环神经网络的输入
-\end{itemize}
-\end{itemize}
-\vspace{-1em}
-%%% 图
-\begin{center}
+\item 词嵌入的作用是把离散化的单词表示转换为连续空间上的分布式表示
+\begin{itemize}
+\item<2-> 把输入的词转换成唯一对应的词表大小的0-1向量
+\item<3-> 根据0-1向量,从词嵌入矩阵中取出对应的词嵌入$e()$
+\item<4-> 取出的词嵌入$e()$作为循环神经网络的输入
+\end{itemize}
+\end{itemize}
+\vspace{-1em}
+%%% 图
+\begin{center}
 \hspace*{-0.6cm}
 \begin{tikzpicture}
 \setlength{\base}{0.9cm}
...
...
@@ -1220,7 +1220,12 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 \begin{scope}
 \coordinate (start) at (5.8\base,0.3\base);
+\visible<2->{
 \node [anchor=south west] (one) at (start) {\scriptsize{$\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ {\color{ugreen} 1} \\ 0 \\ 0 \end{bmatrix}$}};
+}
+\visible<3->{
+\node [draw=ugreen,fill=green!20!white,rounded corners=0.3em,minimum width=3.8cm,minimum height=0.9em,anchor=south west] (emb) at ([shift={(1.25cm,0.8cm)}]start) {};
+}
 \node [anchor=north] (w) at ([yshift=3pt]one.south) {\scriptsize{\color{ugreen} you}};
 \node [anchor=north west] (words) at ([xshift=10pt]one.north east) {\scriptsize{$\begin{matrix} \langle\textrm{eos}\rangle \\ \langle\textrm{sos}\rangle \\ \textrm{Do} \\ \vdots \\ \textrm{know} \\ \textrm{you} \\ \textrm{?} \\ \textrm{have} \end{matrix}$}};
 \node [anchor=north west] (mat) at ([xshift=-6pt]words.north east) {\scriptsize{$
...
...
@@ -1233,17 +1238,17 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 -1 & -2 & \cdots & -3 \\
 .7 & .5 & \cdots & 3 \\
 -2 & .3 & \cdots & .1
-\end{bmatrix}$}};
-\begin{pgfonlayer}{background}
-\node [draw=ugreen,fill=green!20!white,rounded corners=0.3em,minimum width=3.8cm,minimum height=0.9em,anchor=south west] (emb) at ([shift={(1.25cm,0.8cm)}]start) {};
-\end{pgfonlayer}
+\end{bmatrix}$}};
 \draw [decorate,decoration={brace,mirror}] ([shift={(6pt,2pt)}]mat.south west) to node [auto,swap,font=\scriptsize] {词嵌入矩阵} ([shift={(-6pt,2pt)}]mat.south east);
+\visible<3->{
 \draw [-latex'] ([xshift=-2pt,yshift=-0.65cm]one.east) to ([yshift=-0.65cm]words.west);
+}
+\visible<4->{
 \draw [-latex'] (emb.east) -| ([yshift=0.4cm]mat.north east) node [pos=1,above] {\scriptsize{RNN输入}};
+}
 \draw [-latex'] ([yshift=-0.4cm]w.south) to ([yshift=2pt]w.south);
 \node [anchor=north] (wlabel) at ([yshift=-0.6em]w.south) {\scriptsize{输入的单词}};
...
...
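The embedding lookup these hunks draw (a vocabulary-sized 0-1 vector selecting a row of the embedding matrix, whose result feeds the RNN) can be sketched in NumPy. The vocabulary below mirrors the figure's word list, but the embedding dimension and values are made-up stand-ins:

```python
import numpy as np

# Toy vocabulary taken from the figure; embedding size d=4 is hypothetical.
vocab = ["<eos>", "<sos>", "Do", "know", "you", "?", "have"]
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 4))  # embedding matrix, |V| x d

def embed(word: str) -> np.ndarray:
    """e(word): vocabulary-sized one-hot vector times the embedding matrix."""
    one_hot = np.zeros(len(vocab))
    one_hot[vocab.index(word)] = 1.0
    return one_hot @ E  # equivalent to selecting the word's row of E

e_you = embed("you")  # this vector is what the RNN receives as input
```

The matrix product with a one-hot vector is just a row lookup, which is why real implementations index the embedding table directly instead of materializing the 0-1 vector.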
@@ -1252,21 +1257,21 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 \draw [->,thick,densely dashed,ugreen] ([yshift=-0.2em]demb3.east) to [out=0,in=180] ([yshift=-1cm]input.west);
 \end{tikzpicture}
-\end{center}
+\end{center}
 \end{frame}
 %%%------------------------------------------------------------------------------------------------------------
 %%% 输出
 \begin{frame}{模块2:输出层}
 \begin{itemize}
-\item 输出层需要得到每个目标语单词的生成概率,进而选取概率最高的词作为输出。但RNN中的隐藏层并不会输出单词概率,而是输出$s$,其每一行对应一个单词表示
-\begin{itemize}
-\item $s$经过权重矩阵$W$变成$\hat{s}$,其隐藏层维度变换成词表的大小
-\item $\hat{s}$经过Softmax变换得到不同词作为输出的概率,即单词$i$的概率$p_i=\textrm{Softmax}(i)=\frac{e^{\hat{s}_i}}{\sum_{j}e^{\hat{s}_{j}}}$
-\end{itemize}
-\end{itemize}
-%%% 图
-\begin{center}
+\item 输出层需要得到每个目标语单词的生成概率,进而选取概率最高的词作为输出。但RNN中的隐藏层并不会输出单词概率,而是输出$s$,其每一行对应一个单词表示
+\begin{itemize}
+\item<2-> $s$经过权重矩阵$W$变成$\hat{s}$,其隐藏层维度变换成词表的大小
+\item<3-> $\hat{s}$经过Softmax变换得到不同词作为输出的概率,即单词$i$的概率$p_i=\textrm{Softmax}(i)=\frac{e^{\hat{s}_i}}{\sum_{j}e^{\hat{s}_{j}}}$
+\end{itemize}
+\end{itemize}
+%%% 图
+\begin{center}
 \hspace*{-0.6cm}
 \begin{tikzpicture}
 \setlength{\base}{0.9cm}
...
...
@@ -1349,16 +1354,19 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=white] (cell03) at (cell02.east) {\scriptsize{$\cdots$}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!50] (cell04) at (cell03.east) {\scriptsize{5}};
+\visible<2->{
 \node [anchor=south,minimum width=10.9em,minimum height=1.3em,draw,rounded corners=0.3em] (target) at ([yshift=1.5em]hidden.north) {};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell11) at ([xshift=0.2em]target.west) {\scriptsize{-2}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell12) at (cell11.east) {\scriptsize{-1}};
-\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell13) at (cell12.east) {\scriptsize{.7}};
+\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!30] (cell13) at (cell12.east) {\scriptsize{.7}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=white] (cell14) at (cell13.east) {\scriptsize{$\cdots$}};
-\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!30] (cell15) at (cell14.east) {\scriptsize{6}};
-\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!70] (cell16) at (cell15.east) {\scriptsize{-3}};
+\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!70] (cell15) at (cell14.east) {\scriptsize{6}};
+\node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell16) at (cell15.east) {\scriptsize{-3}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!10] (cell17) at (cell16.east) {\scriptsize{-1}};
 \node [anchor=west,minimum width=1em,minimum size=1em,fill=ugreen!20] (cell18) at (cell17.east) {\scriptsize{.2}};
+}
+\visible<3->{
 \node [anchor=south,minimum width=1em,minimum height=0.2em,fill=ublue!80,inner sep=0pt] (label1) at ([yshift=2.5em]cell11.north) {};
 \node [anchor=west,rotate=90,font=\tiny] (w1) at (label1.north) {$\langle$eos$\rangle$};
 \node [anchor=south,minimum width=1em,minimum height=0.3em,fill=ublue!80,inner sep=0pt] (label2) at ([yshift=2.5em]cell12.north) {};
...
...
@@ -1367,26 +1375,38 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 \node [anchor=west,rotate=90,font=\tiny] (w3) at (label3.north) {Do};
 \node [anchor=south,font=\scriptsize] (w4) at ([yshift=2.5em]cell14.north) {$\cdots$};
 \node [anchor=south,minimum width=1em,minimum height=1em,fill=ublue!80,inner sep=0pt] (label5) at ([yshift=2.5em]cell15.north) {};
+\alt<4->{
+\node [anchor=west,rotate=90,font=\tiny] (w5) at (label5.north) {{\color{red} know}};
+}{
+\node [anchor=west,rotate=90,font=\tiny] (w5) at (label5.north) {know};
+}
 \node [anchor=south,minimum width=1em,minimum height=0.1em,fill=ublue!80,inner sep=0pt] (label6) at ([yshift=2.5em]cell16.north) {};
 \node [anchor=west,rotate=90,font=\tiny] (w6) at (label6.north) {you};
 \node [anchor=south,minimum width=1em,minimum height=0.3em,fill=ublue!80,inner sep=0pt] (label7) at ([yshift=2.5em]cell17.north) {};
 \node [anchor=west,rotate=90,font=\tiny] (w7) at (label7.north) {?};
 \node [anchor=south,minimum width=1em,minimum height=0.4em,fill=ublue!80,inner sep=0pt] (label8) at ([yshift=2.5em]cell18.north) {};
 \node [anchor=west,rotate=90,font=\tiny] (w8) at (label8.north) {have};
+}
+\visible<2->{
 \filldraw [fill=red!20,draw=white] (target.south west) -- (target.south east) -- ([xshift=-0.2em,yshift=0.1em]hidden.north east) -- ([xshift=0.2em,yshift=0.1em]hidden.north west);
 \draw [->,thick] ([xshift=0.2em,yshift=0.1em]hidden.north west) -- (target.south west);
 \draw [->,thick] ([xshift=-0.2em,yshift=0.1em]hidden.north east) -- (target.south east);
+\node [anchor=south] () at ([yshift=0.3em]hidden.north) {\scriptsize{$\hat{s}=Ws$}};
+}
+\visible<3->{
+\node [rounded corners=0.3em] (softmax) at ([yshift=1.25em]target.north) {\scriptsize{$p(\hat{s}_i)=\frac{e^{\hat{s}_i}}{\sum_j e^{\hat{s}_j}}$}};
 \begin{pgfonlayer}{background}
 \filldraw [fill=blue!20,draw=white] ([yshift=0.1em]cell11.north west) {[rounded corners=0.3em] -- (softmax.west)} -- (label1.south west) -- (label8.south east) {[rounded corners=0.3em] -- (softmax.east)} -- ([yshift=0.1em]cell18.north east) -- ([yshift=0.1em]cell11.north west);
 \end{pgfonlayer}
-\node [anchor=south] () at ([yshift=0.3em]hidden.north) {\scriptsize{$\hat{s}=Ws$}};
-\node [rounded corners=0.3em] (softmax) at ([yshift=1.25em]target.north) {\scriptsize{$p(\hat{s}_i)=\frac{e^{\hat{s}_i}}{\sum_j e^{\hat{s}_j}}$}};
+}
 \draw [-latex'] ([yshift=-0.3cm]hidden.south) to (hidden.south);
+\visible<4->{
+\draw [-latex'] (w5.east) to ([yshift=0.3cm]w5.east);
+}
 \coordinate (tmp) at ([yshift=-3pt]w5.east);
 \node [draw=red,thick,densely dashed,rounded corners=3pt,inner sep=5pt,fit=(cell01) (cell11) (label1) (label8) (target) (hidden) (tmp)] (output) {};
...
...
@@ -1394,7 +1414,7 @@ NLP问题的隐含结构假设 & 无隐含结构假设,端到端学习 \\
 \draw [->,thick,densely dashed,red] ([yshift=-0.2em]softmax3.east) .. controls +(east:2\base) and +(west:\base) .. (output.west);
 \end{tikzpicture}
-\end{center}
+\end{center}
 \end{frame}
 %%%------------------------------------------------------------------------------------------------------------
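The output layer in this frame ($\hat{s}=Ws$ followed by Softmax, $p_i=\frac{e^{\hat{s}_i}}{\sum_j e^{\hat{s}_j}}$) can be sketched in NumPy. The dimensions and random values below are made up for illustration; only the computation follows the slide:

```python
import numpy as np

# Hypothetical sizes: hidden state of size d, vocabulary of 7 words.
rng = np.random.default_rng(0)
d, vocab_size = 4, 7
s = rng.normal(size=d)                 # hidden state s from the RNN
W = rng.normal(size=(vocab_size, d))   # projects d dims to vocabulary size

s_hat = W @ s                          # \hat{s} = Ws, one score per word
p = np.exp(s_hat - s_hat.max())        # subtract the max for numerical stability
p /= p.sum()                           # p_i = e^{s_hat_i} / sum_j e^{s_hat_j}
best = int(p.argmax())                 # output the highest-probability word
```

Subtracting the maximum score before exponentiating leaves the Softmax result unchanged but avoids overflow when scores are large.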
...
...