NiuTrans / mtbookv2
Commit 681229f0, authored 4 years ago by 孟霞
Merge branch 'master' into 'mengxia'
Master. See merge request !475
Parents: dda9e150, 62211247
Showing 19 changed files with 64 additions and 108 deletions
Changed files:
Chapter10/Figures/figure-encoder-decoder-process.tex  +1 -1
Chapter10/chapter10.tex  +2 -2
Chapter11/Figures/figure-fairseq-0.tex  +3 -3
Chapter11/Figures/figure-fairseq-2.tex  +3 -3
Chapter11/Figures/figure-fairseq-3.tex  +3 -3
Chapter11/Figures/figure-max-pooling.tex  +2 -2
Chapter11/Figures/figure-single-glu.tex  +5 -4
Chapter11/Figures/figure-standard.tex  +4 -3
Chapter11/Figures/figure-use-cnn-in-sentence-classification.tex  +2 -2
Chapter11/chapter11.tex  +0 -0
Chapter12/Figures/figure-position-of-difference-and-layer-regularization-in-the-model.tex  +2 -2
Chapter12/Figures/figure-position-of-feedforward-neural-network-in-the-model.tex  +2 -2
Chapter12/Figures/figure-position-of-self-attention-mechanism-in-the-model.tex  +2 -2
Chapter12/Figures/figure-transformer-input-and-position-encoding.tex  +2 -2
Chapter12/Figures/figure-transformer.tex  +2 -2
Chapter12/chapter12.tex  +0 -0
Chapter16/Figures/figure-application-process-of-back-translation.tex  +29 -75
Chapter16/chapter16.tex  +0 -0
bibliography.bib  +0 -0
Chapter10/Figures/figure-encoder-decoder-process.tex
View file @ 681229f0
@@ -2,7 +2,7 @@
 \begin{scope}
 \small{
-\node [anchor=south west,minimum width=15em] (source) at (0,0) {\textbf{源语言}: 我\ \ \ \ 对\ \ \ \ 你\ \ \ \ 感到\ \ \ \ 满意};
+\node [anchor=south west,minimum width=15em] (source) at (0,0) {\textbf{源语言}: \ \ 我\ \ \ \ 对\ \ \ \ 你\ \ \ \ 感到\ \ \ \ 满意\ \ };
 {
 \node [anchor=south west,minimum width=15em] (target) at ([yshift=12em]source.north west) {\textbf{目标语言}: I\ \ am\ \ \ satisfied\ \ \ with\ \ \ you};
 }
Chapter10/chapter10.tex
View file @ 681229f0
@@ -84,7 +84,7 @@
 \vspace{0.3em}
 \item 2016年谷歌公司发布了基于多层循环神经网络方法的GNMT系统。该系统集成了当时的神经机器翻译技术,并进行了诸多的改进。它的性能显著优于基于短语的机器翻译系统\upcite{Wu2016GooglesNM},引起了研究者的广泛关注。在之后不到一年的时间里,脸书公司采用卷积神经网络(CNN)研发了新的神经机器翻译系统\upcite{DBLP:journals/corr/GehringAGYD17},实现了比基于循环神经网络(RNN)系统更高的翻译水平,并大幅提升翻译速度。
 \vspace{0.3em}
-\item 2017年,Ashish Vaswani等人提出了新的翻译模型Transformer。其完全抛弃了CNN、RNN等结构,仅仅通过自注意力机制和前馈神经网络,不需要使用序列对齐的循环框架就展示出强大的性能,并且巧妙地解决了翻译中长距离依赖问题\upcite{NIPS2017_7181}。Transformer是第一个完全基于注意力机制搭建的模型,不仅训练速度更快,在翻译任务上也获得了更好的结果,一跃成为目前最主流的神经机器翻译框架。
+\item 2017年,Ashish Vaswani等人提出了新的翻译模型Transformer。其完全抛弃了CNN、RNN等结构,仅仅通过自注意力机制和前馈神经网络,不需要使用序列对齐的循环框架就展示出强大的性能,并且巧妙地解决了翻译中长距离依赖问题\upcite{vaswani2017attention}。Transformer是第一个完全基于注意力机制搭建的模型,不仅训练速度更快,在翻译任务上也获得了更好的结果,一跃成为目前最主流的神经机器翻译框架。
 \vspace{0.3em}
 \end{itemize}
@@ -1010,7 +1010,7 @@ L(\mathbi{Y},\widehat{\mathbi{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbi{y}_j,\
 \end{eqnarray}
 %\vspace{0.5em}
-\noindent 其中,$\gamma$是手工设定的梯度大小阈值,$\| \cdot \|_2$是L2范数,$\mathbi{w}'$表示梯度裁剪后的参数。这个公式的含义在于只要梯度大小超过阈值,就按照阈值与当前梯度大小的比例进行放缩。
+\noindent 其中,$\gamma$是手工设定的梯度大小阈值,$\| \cdot \|_2$是$l_2$范数,$\mathbi{w}'$表示梯度裁剪后的参数。这个公式的含义在于只要梯度大小超过阈值,就按照阈值与当前梯度大小的比例进行放缩。
 %----------------------------------------------------------------------------------------
 % NEW SUBSUB-SECTION
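Reviewer note on the second chapter10.tex hunk: the changed sentence describes gradient clipping, where γ is a manually chosen threshold on the gradient magnitude, ‖·‖₂ is the l2 norm, and the gradient is rescaled by the ratio of the threshold to its current norm whenever the norm exceeds the threshold. The equation itself lies outside the visible hunk, so the sketch below assumes the common form w' = w · γ / max(γ, ‖w‖₂). The function name `clip_by_l2_norm` is hypothetical and is not taken from the book or from any library.

```python
import math


def clip_by_l2_norm(grad, gamma):
    """Rescale grad so that its l2 norm does not exceed gamma.

    Assumed rule (not shown in the diff): w' = w * gamma / max(gamma, ||w||_2).
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # If norm <= gamma the scale is exactly 1.0 and the gradient is unchanged.
    scale = gamma / max(gamma, norm)
    return [g * scale for g in grad]


# A gradient of norm 5 is scaled down to norm 1 (approximately [0.6, 0.8]).
clipped = clip_by_l2_norm([3.0, 4.0], gamma=1.0)
# A gradient whose norm is already below the threshold is left as it is.
unchanged = clip_by_l2_norm([0.3, 0.4], gamma=1.0)
```

Using max(γ, ‖w‖₂) in the denominator avoids an explicit branch: the scale factor is 1 below the threshold and γ/‖w‖₂ above it.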
Chapter11/Figures/figure-fairseq-0.tex
View file @ 681229f0
@@ -34,7 +34,7 @@
 \node [anchor=north,word] (tgt_1) at ([yshift=-0.4em]i_0.south) {$<$p$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {$<$p$>$};
-\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$sos$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_4.south) {to};
 \node [anchor=north,word] (tgt_2) at ([yshift=-0.4em]i_5.south) {school};
@@ -103,7 +103,7 @@
 \node [anchor=north,word] at ([yshift=-0.4em]i_0.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {to};
 \node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {school};
-\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$/s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$eos$>$};
 \foreach \point in {0,1,2,3}{
 \node [cir,font=\fontsize{6}{6}\selectfont,inner sep=0.8pt] (c_\point) at (8.2cm+\point*2em,7.5cm-1em*\point) {\bm{$\sum$}};
@@ -140,7 +140,7 @@
 \node [anchor=south,word] (src_1) at ([xshift=-2em,yshift=0.4em]r_0.north) {$<$p$>$};
 \node [anchor=south,word] at ([yshift=0.4em]r_0.north) {去};
 \node [anchor=south,word] at ([yshift=0.4em]r_1.north) {上学};
-\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$s$>$};
+\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$sos$>$};
 \node [anchor=south,word] (src_2) at ([xshift=2em,yshift=0.4em]r_2.north) {$<$p$>$};
Chapter11/Figures/figure-fairseq-2.tex
View file @ 681229f0
@@ -34,7 +34,7 @@
 \node [anchor=north,word] at ([yshift=-0.4em]i_0.south) {$<$p$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {$<$p$>$};
-\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$sos$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_4.south) {to};
 \node [anchor=north,word] at ([yshift=-0.4em]i_5.south) {school};
@@ -98,7 +98,7 @@
 \node [anchor=north,word] at ([yshift=-0.4em]i_0.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {to};
 \node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {school};
-\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$/s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$eos$>$};
 \foreach \point in {0,1,2,3}{
 \node [cir,font=\fontsize{6}{6}\selectfont,inner sep=0.8pt] (c_\point) at (8.2cm+\point*2em,7.5cm-1em*\point) {\bm{$\sum$}};
@@ -135,7 +135,7 @@
 \node [anchor=south,word] (src_1) at ([xshift=-2em,yshift=0.4em]r_0.north) {$<$p$>$};
 \node [anchor=south,word] at ([yshift=0.4em]r_0.north) {去};
 \node [anchor=south,word] at ([yshift=0.4em]r_1.north) {上学};
-\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$s$>$};
+\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$sos$>$};
 \node [anchor=south,word] (src_2) at ([xshift=2em,yshift=0.4em]r_2.north) {$<$p$>$};
Chapter11/Figures/figure-fairseq-3.tex
View file @ 681229f0
@@ -34,7 +34,7 @@
 \node [anchor=north,word] at ([yshift=-0.4em]i_0.south) {$<$p$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {$<$p$>$};
-\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {$<$sos$>$};
 \node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_4.south) {to};
 \node [anchor=north,word] at ([yshift=-0.4em]i_5.south) {school};
@@ -99,7 +99,7 @@
 \node [anchor=north,word] at ([yshift=-0.4em]i_0.south) {go};
 \node [anchor=north,word] at ([yshift=-0.4em]i_1.south) {to};
 \node [anchor=north,word] at ([yshift=-0.4em]i_2.south) {school};
-\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$/s$>$};
+\node [anchor=north,word] at ([yshift=-0.4em]i_3.south) {$<$eos$>$};
 \foreach \point in {0,1,2,3}{
 \node [cir,font=\fontsize{6}{6}\selectfont,inner sep=0.8pt] (c_\point) at (8.2cm+\point*2em,7.5cm-1em*\point) {\bm{$\sum$}};
@@ -136,7 +136,7 @@
 \node [anchor=south,word] (src_1) at ([xshift=-2em,yshift=0.4em]r_0.north) {$<$p$>$};
 \node [anchor=south,word] at ([yshift=0.4em]r_0.north) {去};
 \node [anchor=south,word] at ([yshift=0.4em]r_1.north) {上学};
-\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$s$>$};
+\node [anchor=south,word] at ([yshift=0.4em]r_2.north) {$<$sos$>$};
 \node [anchor=south,word] (src_2) at ([xshift=2em,yshift=0.4em]r_2.north) {$<$p$>$};
Chapter11/Figures/figure-max-pooling.tex
View file @ 681229f0
@@ -22,8 +22,8 @@
 \draw [->,thick] ([xshift=0.4cm,yshift=-0.4cm]num8.east)--([xshift=1.5cm,yshift=-0.4cm]num8.east);
 \node (num17)[num,right of = num8,xshift= 2.5cm,fill=red!10]{6};
-\node (num18)[num,right of = num17,xshift= 0.6cm,fill=green!10]{3};
-\node (num19)[num,below of = num17,yshift=-0.6cm,fill=yellow!10]{8};
+\node (num18)[num,right of = num17,xshift= 0.6cm,fill=green!10]{8};
+\node (num19)[num,below of = num17,yshift=-0.6cm,fill=yellow!10]{3};
 \node (num20)[num,below of = num18,yshift= -0.6cm,fill=blue!10]{4};
 \node [right of = num20,xshift= 0.7cm]{};
Chapter11/Figures/figure-single-glu.tex
View file @ 681229f0
@@ -63,9 +63,9 @@ $\otimes$: & 按位乘运算 \\
 \draw [-latex,thick] (b.east) -- (c2.west);
 \draw [-latex,thick] (c2.east) -- ([xshift=0.4cm]c2.east);
-\node [inner sep=0pt, font=\tiny] at (0.75cm, -0.4cm) {$\mathbi{X}$};
-\node [inner sep=0pt, font=\tiny] at ([yshift=-0.8cm]a.south) {$\mathbi{B}=\mathbi{X} * \mathbi{V} + \mathbi{b}_{\mathbi{W}}$};
-\node [inner sep=0pt, font=\tiny] at ([yshift=-0.8cm]b.south) {$\mathbi{A}=\mathbi{X} * \mathbi{W} + \mathbi{b}_{\mathbi{V}}$};
-\node [inner sep=0pt, font=\tiny] at (8.2cm, -0.4cm) {$\mathbi{Y}=\mathbi{A} \otimes \sigma(\mathbi{B})$};
+\node [inner sep=0pt, font=\tiny] at (0.75cm, -0.4cm) {$\mathbi{x}$};
+\node [inner sep=0pt, font=\tiny] at ([yshift=-0.8cm]a.south) {$\mathbi{B}=\mathbi{x} * \mathbi{V} + \mathbi{b}_{\mathbi{W}}$};
+\node [inner sep=0pt, font=\tiny] at ([yshift=-0.8cm]b.south) {$\mathbi{A}=\mathbi{x} * \mathbi{W} + \mathbi{b}_{\mathbi{V}}$};
+\node [inner sep=0pt, font=\tiny] at (8.2cm, -0.4cm) {$\mathbi{y}=\mathbi{A} \otimes \sigma(\mathbi{B})$};
 \end{tikzpicture}
\ No newline at end of file
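Reviewer note on the figure-single-glu.tex hunk: the relabelled output node reads y = A ⊗ σ(B), and the hunk header glosses ⊗ as element-wise multiplication. The sketch below shows that gating step only, taking A and B as already computed and assuming σ is the logistic sigmoid (the diff does not define σ). The names `sigmoid` and `glu_gate` are hypothetical.

```python
import math


def sigmoid(v):
    """Logistic sigmoid; assumed to be the sigma used in the figure."""
    return 1.0 / (1.0 + math.exp(-v))


def glu_gate(a, b):
    """Element-wise gating y = a * sigmoid(b) over two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]


# With b = 0 every gate is half open, so each entry of a is halved.
y = glu_gate([2.0, -1.0], [0.0, 0.0])
```

Large positive entries of B let the matching entries of A through almost unchanged, and large negative entries suppress them.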
Chapter11/Figures/figure-standard.tex
View file @ 681229f0
@@ -40,7 +40,7 @@
 \node [vuale] at ([xshift=0.9em]r3_1.east) {$\mathbi{z}_1$};
 \node (t1) at (2.5em, -1em) {\large{$\cdots$}};
-\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t1.south) {传统卷积};
+\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t1.south) {(a) 标准卷积};
 \end{scope}
 \begin{scope}[xshift=4cm]
@@ -74,7 +74,7 @@
 \node [vuale] at ([xshift=0.9em]r3_1.east) {$\mathbi{z}_1$};
 \node (t2) at (2.5em, -1em) {\large{$\cdots$}};
-\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t2.south) {深度卷积};
+\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t2.south) {(b) 深度卷积};
 \end{scope}
 \begin{scope}[xshift=8cm]
@@ -110,7 +110,7 @@
 \node [vuale] at ([xshift=0.9em]r3_1.east) {$\mathbi{z}_1$};
 \node (t3) at (2.5em, -1em) {\large{$\cdots$}};
-\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t3.south) {逐点卷积};
+\node [anchor=north,font=\tiny] at ([yshift=-0.2em]t3.south) {(c) 逐点卷积};
 \end{scope}
 \end{tikzpicture}
\ No newline at end of file
Chapter11/Figures/figure-use-cnn-in-sentence-classification.tex
View file @ 681229f0
@@ -85,10 +85,10 @@
 %\draw [thick] (3.6cm, -0.3cm) -- (3.6cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{Convolutional layer with \\ multiple filter widths and \\ feature maps} (6cm,-0.5cm) -- (6cm, -0.3cm);
 %\draw [thick] (7.2cm, -0.3cm) -- (7.2cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{Max-over-time\\ pooling} (9cm,-0.5cm) -- (9cm, -0.3cm);
 %\draw [thick] (10cm, -0.3cm) -- (10cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{Fully connected layer \\ with dropout and \\ softmax output} (11.7cm,-0.5cm) -- (11.7cm, -0.3cm);
-\draw [thick] (0cm, -0.3cm) -- (0cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{维度大小为 $m \times k$ \\ 的静态与非静态通道 \\ 的句子表示} (2.4cm,-0.5cm) -- (2.4cm, -0.3cm);
+\draw [thick] (0cm, -0.3cm) -- (0cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{维度大小为 $m \times K$ \\ 的静态与非静态通道 \\ 的句子表示} (2.4cm,-0.5cm) -- (2.4cm, -0.3cm);
 \draw [thick] (3.6cm, -0.3cm) -- (3.6cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{具有多个不同大小 \\ 的卷积核和特征图 \\ 的卷积层} (6cm,-0.5cm) -- (6cm, -0.3cm);
 \draw [thick] (7.2cm, -0.3cm) -- (7.2cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{最大池化} (9cm,-0.5cm) -- (9cm, -0.3cm);
-\draw [thick] (10cm, -0.3cm) -- (10cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{带有dropout \\ 和softmax输出 \\ 的全连接层} (11.7cm,-0.5cm) -- (11.7cm, -0.3cm);
+\draw [thick] (10cm, -0.3cm) -- (10cm, -0.5cm) -- node[font=\tiny, align=center,yshift=-0.5cm]{带有Dropout \\ 和Softmax输出 \\ 的全连接层} (11.7cm,-0.5cm) -- (11.7cm, -0.3cm);
 %\node [font=\Large] at (5.2cm,-2cm){$h_i = dot(F,x_{i:i+l-1})+b$};
Chapter11/chapter11.tex
View file @ 681229f0
(Diff collapsed; not shown.)
Chapter12/Figures/figure-position-of-difference-and-layer-regularization-in-the-model.tex
View file @ 681229f0
@@ -17,7 +17,7 @@
 \node [ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north) {\tiny{$\textbf{Feed Forward Network}$}};
 \node [Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (inputs) at ([yshift=-3em]sa1.south) {\scriptsize{$\textbf{编码器输入: 我\ \ 很\ \ 好}$}};
 \node [anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west) {\scriptsize{\textbf{编码器}}};
@@ -36,7 +36,7 @@
 \node [Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [outputnode,anchor=south] (o1) at ([yshift=1em]res5.north) {\tiny{$\textbf{Output layer}$}};
 \node [inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (outputs) at ([yshift=-3em]sa2.south) {\scriptsize{$\textbf{解码器输入: $<$sos$>$ I am fine}$}};
 \node [anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west) {\scriptsize{\textbf{解码器}}};
 \node [anchor=north] (decoutputs) at ([yshift=1.5em]o1.north) {\scriptsize{$\textbf{解码器输出: I am fine $<$eos$>$}$}};
Chapter12/Figures/figure-position-of-feedforward-neural-network-in-the-model.tex
View file @ 681229f0
@@ -15,7 +15,7 @@
 \node [ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north) {\tiny{$\textbf{Feed Forward Network}$}};
 \node [Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (inputs) at ([yshift=-3em]sa1.south) {\scriptsize{$\textbf{编码器输入: 我\ \ 很\ \ 好}$}};
 \node [anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west) {\scriptsize{\textbf{编码器}}};
@@ -34,7 +34,7 @@
 \node [Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [outputnode,anchor=south] (o1) at ([yshift=1em]res5.north) {\tiny{$\textbf{Output layer}$}};
 \node [inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (outputs) at ([yshift=-3em]sa2.south) {\scriptsize{$\textbf{解码器输入: $<$sos$>$ I am fine}$}};
 \node [anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west) {\scriptsize{\textbf{解码器}}};
 \node [anchor=north] (decoutputs) at ([yshift=1.5em]o1.north) {\scriptsize{$\textbf{解码器输出: I am fine $<$eos$>$}$}};
Chapter12/Figures/figure-position-of-self-attention-mechanism-in-the-model.tex
View file @ 681229f0
@@ -16,7 +16,7 @@
 \node [ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north) {\tiny{$\textbf{Feed Forward Network}$}};
 \node [Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (inputs) at ([yshift=-3em]sa1.south) {\scriptsize{$\textbf{编码器输入: 我\ \ 很\ \ 好}$}};
 \node [anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west) {\scriptsize{\textbf{编码器}}};
@@ -35,7 +35,7 @@
 \node [Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north) {\tiny{$\textbf{Add \& LayerNorm}$}};
 \node [outputnode,anchor=south] (o1) at ([yshift=1em]res5.north) {\tiny{$\textbf{Output layer}$}};
 \node [inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west) {\tiny{$\textbf{Embedding}$}};
-\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Postion}$}};
+\node [posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east) {\tiny{$\textbf{Position}$}};
 \node [anchor=north] (outputs) at ([yshift=-3em]sa2.south) {\scriptsize{$\textbf{解码器输入: $<$sos$>$ I am fine}$}};
 \node [anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west) {\scriptsize{\textbf{解码器}}};
 \node [anchor=north] (decoutputs) at ([yshift=1.5em]o1.north) {\scriptsize{$\textbf{解码器输出: I am fine $<$eos$>$}$}};
Chapter12/Figures/figure-transformer-input-and-position-encoding.tex
查看文件 @
681229f0
...
@@ -15,7 +15,7 @@
...
@@ -15,7 +15,7 @@
\node
[ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north)
{
\tiny
{$
\textbf
{
Feed Forward Network
}$}}
;
\node
[ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north)
{
\tiny
{$
\textbf
{
Feed Forward Network
}$}}
;
\node
[Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east)
{
\tiny
{$
\textbf
{
Postion
}$}}
;
\node
[posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east)
{
\tiny
{$
\textbf
{
Pos
i
tion
}$}}
;
\node
[anchor=north] (inputs) at ([yshift=-3em]sa1.south)
{
\scriptsize
{$
\textbf
{
编码器输入: 我
\ \
很
\ \
好
}$}}
;
\node
[anchor=north] (inputs) at ([yshift=-3em]sa1.south)
{
\scriptsize
{$
\textbf
{
编码器输入: 我
\ \
很
\ \
好
}$}}
;
\node
[anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west)
{
\scriptsize
{
\textbf
{
编码器
}}}
;
\node
[anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west)
{
\scriptsize
{
\textbf
{
编码器
}}}
;
...
@@ -34,7 +34,7 @@
...
@@ -34,7 +34,7 @@
\node
[Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[outputnode,anchor=south] (o1) at ([yshift=1em]res5.north)
{
\tiny
{$
\textbf
{
Output layer
}$}}
;
\node
[outputnode,anchor=south] (o1) at ([yshift=1em]res5.north)
{
\tiny
{$
\textbf
{
Output layer
}$}}
;
\node
[inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east)
{
\tiny
{$
\textbf
{
Postion
}$}}
;
\node
[posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east)
{
\tiny
{$
\textbf
{
Pos
i
tion
}$}}
;
\node
[anchor=north] (outputs) at ([yshift=-3em]sa2.south)
{
\scriptsize
{$
\textbf
{
解码器输入:
$
<
$
sos
$
>
$
I am fine
}$}}
;
\node
[anchor=north] (outputs) at ([yshift=-3em]sa2.south)
{
\scriptsize
{$
\textbf
{
解码器输入:
$
<
$
sos
$
>
$
I am fine
}$}}
;
\node
[anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west)
{
\scriptsize
{
\textbf
{
解码器
}}}
;
\node
[anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west)
{
\scriptsize
{
\textbf
{
解码器
}}}
;
\node
[anchor=north] (decoutputs) at ([yshift=1.5em]o1.north)
{
\scriptsize
{$
\textbf
{
解码器输出: I am fine
$
<
$
eos
$
>
$
}$}}
;
\node
[anchor=north] (decoutputs) at ([yshift=1.5em]o1.north)
{
\scriptsize
{$
\textbf
{
解码器输出: I am fine
$
<
$
eos
$
>
$
}$}}
;
...
...
Chapter12/Figures/figure-transformer.tex
View file @
681229f0
...
@@ -15,7 +15,7 @@
...
@@ -15,7 +15,7 @@
\node
[ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north)
{
\tiny
{$
\textbf
{
Feed Forward Network
}$}}
;
\node
[ffnnode,anchor=south] (ffn1) at ([yshift=1em]res1.north)
{
\tiny
{$
\textbf
{
Feed Forward Network
}$}}
;
\node
[Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[Resnode,anchor=south] (res2) at ([yshift=0.3em]ffn1.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[inputnode,anchor=north west] (input1) at ([yshift=-1em]sa1.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east)
{
\tiny
{$
\textbf
{
Postion
}$}}
;
\node
[posnode,anchor=north east] (pos1) at ([yshift=-1em]sa1.south east)
{
\tiny
{$
\textbf
{
Pos
i
tion
}$}}
;
\node
[anchor=north] (inputs) at ([yshift=-3em]sa1.south)
{
\scriptsize
{$
\textbf
{
编码器输入: 我
\ \
很
\ \
好
}$}}
;
\node
[anchor=north] (inputs) at ([yshift=-3em]sa1.south)
{
\scriptsize
{$
\textbf
{
编码器输入: 我
\ \
很
\ \
好
}$}}
;
\node
[anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west)
{
\scriptsize
{
\textbf
{
编码器
}}}
;
\node
[anchor=south] (encoder) at ([xshift=0.2em,yshift=0.6em]res2.north west)
{
\scriptsize
{
\textbf
{
编码器
}}}
;
...
@@ -34,7 +34,7 @@
...
@@ -34,7 +34,7 @@
\node
[Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[Resnode,anchor=south] (res5) at ([yshift=0.3em]ffn2.north)
{
\tiny
{$
\textbf
{
Add
\&
LayerNorm
}$}}
;
\node
[outputnode,anchor=south] (o1) at ([yshift=1em]res5.north)
{
\tiny
{$
\textbf
{
Output layer
}$}}
;
\node
[outputnode,anchor=south] (o1) at ([yshift=1em]res5.north)
{
\tiny
{$
\textbf
{
Output layer
}$}}
;
\node
[inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[inputnode,anchor=north west] (input2) at ([yshift=-1em]sa2.south west)
{
\tiny
{$
\textbf
{
Embedding
}$}}
;
\node
[posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east)
{
\tiny
{$
\textbf
{
Postion
}$}}
;
\node
[posnode,anchor=north east] (pos2) at ([yshift=-1em]sa2.south east)
{
\tiny
{$
\textbf
{
Pos
i
tion
}$}}
;
\node
[anchor=north] (outputs) at ([yshift=-3em]sa2.south)
{
\scriptsize
{$
\textbf
{
解码器输入:
$
<
$
sos
$
>
$
I am fine
}$}}
;
\node
[anchor=north] (outputs) at ([yshift=-3em]sa2.south)
{
\scriptsize
{$
\textbf
{
解码器输入:
$
<
$
sos
$
>
$
I am fine
}$}}
;
\node
[anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west)
{
\scriptsize
{
\textbf
{
解码器
}}}
;
\node
[anchor=east] (decoder) at ([xshift=-1em,yshift=-1.5em]o1.west)
{
\scriptsize
{
\textbf
{
解码器
}}}
;
\node
[anchor=north] (decoutputs) at ([yshift=1.5em]o1.north)
{
\scriptsize
{$
\textbf
{
解码器输出: I am fine
$
<
$
eos
$
>
$
}$}}
;
\node
[anchor=north] (decoutputs) at ([yshift=1.5em]o1.north)
{
\scriptsize
{$
\textbf
{
解码器输出: I am fine
$
<
$
eos
$
>
$
}$}}
;
...
...
Chapter12/chapter12.tex
View file @
681229f0
Diff collapsed.
Chapter16/Figures/figure-application-process-of-back-translation.tex
View file @
681229f0
\begin{tikzpicture}
\begin{tikzpicture}
\tikzstyle
{
bignode
}
= [line width=0.6pt,draw=black,minimum width=6.3em,minimum height=2.2em,fill=
blue!20,rounded corners=2pt
]
\tikzstyle
{
bignode
}
= [line width=0.6pt,draw=black,minimum width=6.3em,minimum height=2.2em,fill=
white
]
\tikzstyle
{
middlenode
}
= [line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=
blue!20,rounded corners=2pt
]
\tikzstyle
{
middlenode
}
= [line width=0.6pt,draw=black,minimum width=5.6em,minimum height=2.2em,fill=
white
]
\node
[anchor=center] (node1-1) at (0,0)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=center] (node1-1) at (0,0)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=west] (node1-2) at ([xshift=
1.0
em]node1-1.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=west] (node1-2) at ([xshift=
0.8
em]node1-1.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north] (node1-3) at ([xshift=1.
6
5em]node1-1.south)
{
\scriptsize
{
反向翻译模型
}}
;
\node
[anchor=north] (node1-3) at ([xshift=1.
4
5em]node1-1.south)
{
\scriptsize
{
反向翻译模型
}}
;
\draw
[->,line width=0.6pt](node1-1.east)--(node1-2.west);
\draw
[->,line width=0.6pt](node1-1.east)--(node1-2.west);
\begin{pgfonlayer}
{
background
}
\begin{pgfonlayer}
{
background
}
{
{
\node
[fill=
red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em
]
[fit =(node1-1)(node1-2)(node1-3)] (remark1)
{}
;
\node
[fill=
blue!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=6.0em,drop shadow,rounded corners=2pt
]
[fit =(node1-1)(node1-2)(node1-3)] (remark1)
{}
;
}
}
\end{pgfonlayer}
\end{pgfonlayer}
\node
[anchor=north](node2-1) at ([xshift=-1.93em,yshift=-1.95em]remark1.south)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=north](node2-1-2) at (node2-1.south)
{
\scriptsize
{
真实数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node2-1)(node2-1-2)] (remark2-1)
{}
;
}
\end{pgfonlayer}
\node
[anchor=west](node2-2) at ([xshift=0.82em,yshift=0.68em]remark2-1.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north](node2-2-2) at (node2-2.south)
{
\scriptsize
{
真实数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node2-2)(node2-2-2)] (remark2-2)
{}
;
}
\end{pgfonlayer}
\node
[anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node2-1) at ([xshift=-1.5em,yshift=-1.95em]remark1.south)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=west,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node2-2) at (node2-1.east)
{
\scriptsize
{
英语
}}
;
\draw
[->,line width=0.6pt]([yshift=-2.0em]remark1.south)--(remark1.south) node [pos=0.5,right] (pos1)
{
\scriptsize
{
训练
}}
;
\draw
[->,line width=0.6pt]([yshift=-2.0em]remark1.south)--(remark1.south) node [pos=0.5,right] (pos1)
{
\scriptsize
{
训练
}}
;
\node
[anchor=west](node3-1) at ([xshift=5.0em,yshift=0.1em]node1-2.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=north](node3-1-2) at (node3-1.south)
{
\scriptsize
{
真实数据
}}
;
\begin{pgfonlayer}
{
background
}
\node
[anchor=west,fill=yellow!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node3-1) at ([xshift=5.0em,yshift=0.0em]node1-2.east)
{
\scriptsize
{
汉语
}}
;
{
\node
[anchor=north,fill=red!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node3-2) at ([yshift=-2.15em]node3-1.south)
{
\scriptsize
{
英语
}}
;
\node
[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node3-1)(node3-1-2)] (remark3-1)
{}
;
}
\end{pgfonlayer}
\node
[anchor=north](node3-2) at ([yshift=-2.15em]remark3-1.south)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north](node3-2-2) at (node3-2.south)
{
\scriptsize
{
伪数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node3-2)(node3-2-2)] (remark3-2)
{}
;
}
\end{pgfonlayer}
\draw
[->,line width=0.6pt](
remark3-1.south)--(remark
3-2.north) node [pos=0.5,right] (pos2)
{
\scriptsize
{
翻译
}}
;
\draw
[->,line width=0.6pt](
node3-1.south)--(node
3-2.north) node [pos=0.5,right] (pos2)
{
\scriptsize
{
翻译
}}
;
\begin{pgfonlayer}
{
background
}
\begin{pgfonlayer}
{
background
}
{
{
\node
[rounded corners=2pt,inner sep=0.3em,draw=black,line width=0.6pt,dotted]
[fit =(
remark3-1)(remark
3-2)] (remark2)
{}
;
\node
[rounded corners=2pt,inner sep=0.3em,draw=black,line width=0.6pt,dotted]
[fit =(
node3-1)(node
3-2)] (remark2)
{}
;
}
}
\end{pgfonlayer}
\end{pgfonlayer}
\draw
[->,line width=0.6pt](remark1.east)--([yshift=
2.40
em]remark2.west) node [pos=0.5,above] (pos2)
{
\scriptsize
{
模型翻译
}}
;
\draw
[->,line width=0.6pt](remark1.east)--([yshift=
0.85
em]remark2.west) node [pos=0.5,above] (pos2)
{
\scriptsize
{
模型翻译
}}
;
\node
[anchor=south](pos2-2) at ([yshift=-0.5em]pos2.north)
{
\scriptsize
{
使用反向
}}
;
\node
[anchor=south](pos2-2) at ([yshift=-0.5em]pos2.north)
{
\scriptsize
{
使用反向
}}
;
\draw
[decorate,thick,decoration={brace,amplitude=5pt}]
([yshift=1.3em,xshift=1.
5em]node3-1.east) -- ([yshift=-7.7em,xshift=1.5
em]node3-1.east) node [pos=0.1,right,xshift=0.0em,yshift=0.0em] (label1)
{
\scriptsize
{{
混合
}}}
;
\draw
[decorate,thick,decoration={brace,amplitude=5pt}]
([yshift=1.3em,xshift=1.
0em]node3-1.east) -- ([yshift=-5.2em,xshift=1.0
em]node3-1.east) node [pos=0.1,right,xshift=0.0em,yshift=0.0em] (label1)
{
\scriptsize
{{
混合
}}}
;
\node
[anchor=west](node4-1) at ([xshift=3.5em,yshift=3.94em]node3-2.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north](node4-1-2) at (node4-1.south)
{
\scriptsize
{
伪数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=yellow!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node4-1)(node4-1-2)] (remark4-1)
{}
;
}
\end{pgfonlayer}
\node
[anchor=north](node4-2) at ([yshift=-1.59em]node4-1.south)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north](node4-2-2) at (node4-2.south)
{
\scriptsize
{
真实数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=green!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node4-2)(node4-2-2)] (remark4-2)
{}
;
}
\end{pgfonlayer}
\node
[anchor=west](node4-3) at ([xshift=1.7em]node4-2.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=north](node4-3-2) at (node4-3.south)
{
\scriptsize
{
真实数据
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node4-3)(node4-3-2)] (remark4-3)
{}
;
}
\end{pgfonlayer}
\node
[anchor=west](node4-4) at ([xshift=1.7em]node4-1.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=west,fill=red!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-1) at ([xshift=2.0em,yshift=1.6em]node3-2.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=north](node4-4-2) at (node4-4.south)
{
\scriptsize
{
真实数据
}}
;
\node
[anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-2) at (node4-1.south)
{
\scriptsize
{
英语
}}
;
\node
[anchor=west,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-3) at (node4-1.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=north,fill=green!20,inner sep=0.1em,minimum width=3em,draw=black,line width=0.6pt,rounded corners=2pt](node4-4) at (node4-3.south)
{
\scriptsize
{
汉语
}}
;
\begin{pgfonlayer}
{
background
}
{
\node
[fill=blue!20,rounded corners=2pt,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=3.85em]
[fit =(node4-4)(node4-4-2)] (remark4-3)
{}
;
}
\end{pgfonlayer}
\node
[anchor=center] (node5-1) at ([xshift=
4.3em,yshift=-1.48em]node4-4
.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=center] (node5-1) at ([xshift=
3.4em,yshift=0.25em]node4-3
.east)
{
\scriptsize
{
英语
}}
;
\node
[anchor=west] (node5-2) at ([xshift=
1.0
em]node5-1.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=west] (node5-2) at ([xshift=
0.8
em]node5-1.east)
{
\scriptsize
{
汉语
}}
;
\node
[anchor=north] (node5-3) at ([xshift=1.65em]node5-1.south)
{
\scriptsize
{
正向翻译模型
}}
;
\node
[anchor=north] (node5-3) at ([xshift=1.65em]node5-1.south)
{
\scriptsize
{
正向翻译模型
}}
;
\draw
[->,line width=0.6pt](node5-1.east)--(node5-2.west);
\draw
[->,line width=0.6pt](node5-1.east)--(node5-2.west);
\begin{pgfonlayer}
{
background
}
\begin{pgfonlayer}
{
background
}
{
{
\node
[fill=
red!20,rounded corners=2pt,inner sep=0.2em,draw=black,line width=0.6pt,minimum width=6.0em
]
[fit =(node5-1)(node5-2)(node5-3)] (remark3)
{}
;
\node
[fill=
blue!20,inner sep=0.1em,draw=black,line width=0.6pt,minimum width=6.0em,drop shadow,rounded corners=2pt
]
[fit =(node5-1)(node5-2)(node5-3)] (remark3)
{}
;
}
}
\end{pgfonlayer}
\end{pgfonlayer}
\draw
[->,line width=0.6pt]([xshift=-2em]remark3.west)--(remark3.west) node [pos=0.5,above] (pos3)
{
\scriptsize
{
训练
}}
;
\draw
[->,line width=0.6pt]([xshift=-2em]remark3.west)--(remark3.west) node [pos=0.5,above] (pos3)
{
\scriptsize
{
训练
}}
;
\node
[anchor=south](d1) at ([xshift=0.0em,yshift=2em]remark3.north)
{
\scriptsize
{
真实数据:
}}
;
\node
[anchor=north](d2) at ([xshift=0.35em]d1.south)
{
\scriptsize
{
伪数据:
}}
;
\node
[anchor=south](d3) at ([xshift=0.0em,yshift=0em]d1.north)
{
\scriptsize
{
额外数据:
}}
;
\node
[anchor=west,fill=green!20,minimum width=1em](d1-1) at ([xshift=-0.0em]d1.east)
{}
;
\node
[anchor=west,fill=red!20,minimum width=1em](d2-1) at ([xshift=-0.0em]d2.east)
{}
;
\node
[anchor=west,fill=yellow!20,minimum width=1em](d3-1) at ([xshift=-0.0em]d3.east)
{}
;
\end{tikzpicture}
\end{tikzpicture}
\ No newline at end of file
Chapter16/chapter16.tex
View file @
681229f0
Diff collapsed.
bibliography.bib
View file @
681229f0
Diff collapsed.