NiuTrans / Toy-MT-Introduction

Commit d85642f9, authored May 08, 2020 by zengxin

figure

parent 0b980300
Showing 14 changed files with 65 additions and 65 deletions.
Book/Chapter1/chapter1.tex  +4 -4
Book/Chapter2/chapter2.tex  +6 -6
Book/Chapter3/Chapter3.tex  +4 -4
Book/Chapter4/chapter4.tex  +2 -2
Book/Chapter5/chapter5.tex  +0 -0
Book/Chapter6/Chapter6.tex  +39 -39
Book/Chapter7/Chapter7.tex  +8 -8
Book/Chapter7/Figures/figure-dynamic-linear-aggregation-network-structure.tex  +0 -0
Book/Chapter7/Figures/figure-expanded-residual-network.tex  +0 -0
Book/Chapter7/Figures/figure-learning-rate.tex  +0 -0
Book/Chapter7/Figures/figure-post-norm-vs-pre-norm.tex  +0 -0
Book/Chapter7/Figures/figure-progressive-training.tex  +0 -0
Book/Chapter7/Figures/figure-sparse-connections-between-different-groups.tex  +0 -0
Book/mt-book-xelatex.tex  +2 -2
Book/Chapter1/chapter1.tex

@@ -45,7 +45,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter1/Figures/figure-Required-parts-of-MT}
+\input{./Chapter1/Figures/figure-required-parts-of-mt}
 \caption{机器翻译系统的组成}
 \label{fig:1-2}
 \end{figure}
@@ -220,7 +220,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter1/Figures/figure-Example-RBMT}
+\input{./Chapter1/Figures/figure-example-rbmt}
 \setlength{\belowcaptionskip}{-1.5em}
 \caption{基于规则的机器翻译的示例图(左:规则库;右:规则匹配结果)}
 \label{fig:1-8}
@@ -290,7 +290,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter1/Figures/figure-Example-SMT}
+\input{./Chapter1/Figures/figure-example-smt}
 \caption{统计机器翻译的示例图(左:语料资源;中:翻译模型与语言模型;右:翻译假设与翻译引擎)}
 \label{fig:1-11}
 \end{figure}
@@ -311,7 +311,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter1/Figures/figure-Example-NMT}
+\input{./Chapter1/Figures/figure-example-nmt}
 \caption{神经机器翻译的示例图(左:编码器-解码器网络;右:编码器示例网络)}
 \label{fig:1-12}
 \end{figure}
Book/Chapter2/chapter2.tex

@@ -35,8 +35,8 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\subfigure[机器翻译系统被看作一个黑盒]{\input{./Chapter2/Figures/figure-MT-system-as-a-black-box}}
-\subfigure[机器翻系统 = 前/后处理 + 翻译引擎]{\input{./Chapter2/Figures/figure-MT=language-analysis+translation-engine}}
+\subfigure[机器翻译系统被看作一个黑盒]{\input{./Chapter2/Figures/figure-mt-system-as-a-black-box}}
+\subfigure[机器翻系统 = 前/后处理 + 翻译引擎]{\input{./Chapter2/Figures/figure-mt=language-analysis+translation-engine}}
 \caption{机器翻译系统的结构}
 \label{fig:2-1}
 \end{figure}
@@ -125,7 +125,7 @@ F(X)=\int_{-\infty}^x f(x)dx
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter2/Figures/figure-Probability-density-function&Distribution-function}
+\input{./Chapter2/Figures/figure-probability-density-function&distribution-function}
 \caption{一个概率密度函数(左)与其对应的分布函数(右)}
 \label{fig:2-3}
 \end{figure}
@@ -310,7 +310,7 @@ F(X)=\int_{-\infty}^x f(x)dx
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter2/Figures/figure-Self-information-function}
+\input{./Chapter2/Figures/figure-self-information-function}
 \caption{自信息函数$\textrm{I}(x)$关于$\textrm{P}(x)$的曲线}
 \label{fig:2-6}
 \end{figure}
@@ -429,7 +429,7 @@ F(X)=\int_{-\infty}^x f(x)dx
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter2/Figures/figure-Example-of-word-segmentation-based-on-dictionary}
+\input{./Chapter2/Figures/figure-example-of-word-segmentation-based-on-dictionary}
 \caption{基于词典进行分词的实例}
 \label{fig:2-8}
 \end{figure}
@@ -638,7 +638,7 @@ F(X)=\int_{-\infty}^x f(x)dx
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter2/Figures/figure-examples-of-Chinese-word-segmentation-based-on-1-gram-model}
+\input{./Chapter2/Figures/figure-examples-of-chinese-word-segmentation-based-on-1-gram-model}
 \caption{基于1-gram语言模型的中文分词实例}
 \label{fig:2-17}
 \end{figure}
Book/Chapter3/Chapter3.tex

@@ -170,7 +170,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter3/Figures/figure-processes-SMT}
+\input{./Chapter3/Figures/figure-processes-smt}
 \caption{简单的统计机器翻译流程}
 \label{fig:3-5}
 \end{figure}
@@ -472,7 +472,7 @@ g(\mathbf{s},\mathbf{t}) \equiv \prod_{j,i \in \widehat{A}}{\textrm{P}(s_j,t_i)}
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter3/Figures/figure-greedy-MT-decoding-pseudo-code}
+\input{./Chapter3/Figures/figure-greedy-mt-decoding-pseudo-code}
 \caption{贪婪的机器翻译解码算法的伪代码}
 \label{fig:3-10}
 \end{figure}
@@ -483,8 +483,8 @@ g(\mathbf{s},\mathbf{t}) \equiv \prod_{j,i \in \widehat{A}}{\textrm{P}(s_j,t_i)}
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\subfigure{\input{./Chapter3/Figures/greedy-MT-decoding-process-1}}
-\subfigure{\input{./Chapter3/Figures/greedy-MT-decoding-process-3}}
+\subfigure{\input{./Chapter3/Figures/greedy-mt-decoding-process-1}}
+\subfigure{\input{./Chapter3/Figures/greedy-mt-decoding-process-3}}
 \setlength{\belowcaptionskip}{14.0em}
 \caption{贪婪的机器翻译解码过程实例}
 \label{fig:3-11}
Book/Chapter4/chapter4.tex

@@ -2162,7 +2162,7 @@ d_1 = {d'} \circ {r_5}
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter4/Figures/structure-of-Chart}
+\input{./Chapter4/Figures/structure-of-chart}
 \caption{Chart结构}
 \label{fig:4-65}
 \end{figure}
@@ -2252,7 +2252,7 @@ d_1 = {d'} \circ {r_5}
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter4/Figures/content-of-Chart-in-tree-based-decoding}
+\input{./Chapter4/Figures/content-of-chart-in-tree-based-decoding}
 \caption{基于树的解码中Chart的内容}
 \label{fig:4-68}
 \end{figure}
Book/Chapter5/chapter5.tex

This source diff could not be displayed because it is too large. You can view the blob instead.
Book/Chapter6/Chapter6.tex

@@ -252,7 +252,7 @@ NMT & $ 21.7^{\ast}$ & $18.7^{\ast}$ & -1
 % 图3.6
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Presentation-space}
+\input{./Chapter6/Figures/figure-presentation-space}
 \caption{统计机器翻译和神经机器翻译的表示空间}
 \label{fig:6-6}
 \end{figure}
@@ -288,7 +288,7 @@ NMT & $ 21.7^{\ast}$ & $18.7^{\ast}$ & -1
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-A-working-example-of-neural-machine-translation}
+\input{./Chapter6/Figures/figure-a-working-example-of-neural-machine-translation}
 \caption{神经机器翻译的运行实例}
 \label{fig:6-7}
 \end{figure}
@@ -384,7 +384,7 @@ NMT & $ 21.7^{\ast}$ & $18.7^{\ast}$ & -1
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Structure-of-a-recurrent-network-model}
+\input{./Chapter6/Figures/figure-structure-of-a-recurrent-network-model}
 \caption{循环网络模型的结构}
 \label{fig:6-9}
 \end{figure}
@@ -396,7 +396,7 @@ NMT & $ 21.7^{\ast}$ & $18.7^{\ast}$ & -1
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Model-structure-based-on-recurrent-neural-network-translation}
+\input{./Chapter6/Figures/figure-model-structure-based-on-recurrent-neural-network-translation}
 \caption{基于循环神经网络翻译的模型结构}
 \label{fig:6-10}
 \end{figure}
@@ -480,7 +480,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Word-embedding-structure}
+\input{./Chapter6/Figures/figure-word-embedding-structure}
 \caption{词嵌入层结构}
 \label{fig:6-12}
 \end{figure}
@@ -494,7 +494,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Output-layer-structur}
+\input{./Chapter6/Figures/figure-output-layer-structur}
 \caption{输出层结构}
 \label{fig:6-13}
 \end{figure}
@@ -525,7 +525,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 \begin{figure}[htp]
 \centering
 % \includegraphics[scale=0.7]{./Chapter6/Figures/Softmax.png}
-\input{./Chapter6/Figures/figure-Softmax}
+\input{./Chapter6/Figures/figure-softmax}
 \caption{Softmax函数(一维)所对应的曲线}
 \label{fig:6-14}
 \end{figure}
@@ -697,7 +697,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Double-layer-RNN}\hspace{10em}
+\input{./Chapter6/Figures/figure-double-layer-RNN}\hspace{10em}
 \caption{双层循环神经网络}
 \label{fig:6-19}
 \end{figure}
@@ -744,7 +744,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Attention-of-source-and-target-words}
+\input{./Chapter6/Figures/figure-attention-of-source-and-target-words}
 \caption{源语词和目标语词的关注度}
 \label{fig:6-21}
 \end{figure}
@@ -758,7 +758,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-encoder-decoder-with-Attention}
+\input{./Chapter6/Figures/figure-encoder-decoder-with-attention}
 \caption{不使用(a)和使用(b)注意力机制的翻译模型对比}
 \label{fig:6-22}
 \end{figure}
@@ -780,7 +780,7 @@ $\textrm{P}({y_j | \mathbf{s}_{j-1} ,y_{j-1},\mathbf{C}})$由Softmax实现,Sof
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Calculation-process-of-context-vector-C}
+\input{./Chapter6/Figures/figure-calculation-process-of-context-vector-C}
 \caption{上下文向量$\mathbf{C}_j$的计算过程}
 \label{fig:6-23}
 \end{figure}
@@ -824,7 +824,7 @@ a (\mathbf{s},\mathbf{h}) = \left\{ \begin{array}{ll}
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Matrix-Representation-of-Attention-Weights-Between-Chinese-English-Sentence-Pairs}
+\input{./Chapter6/Figures/figure-matrix-representation-of-attention-weights-between-chinese-english-sentence-pairs}
 \caption{一个汉英句对之间的注意力权重{$\alpha_{i,j}$}的矩阵表示}
 \label{fig:6-24}
 \end{figure}
@@ -837,7 +837,7 @@ a (\mathbf{s},\mathbf{h}) = \left\{ \begin{array}{ll}
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Example-of-context-vector-calculation-process}
+\input{./Chapter6/Figures/figure-example-of-context-vector-calculation-process}
 \caption{上下文向量计算过程实例}
 \label{fig:6-25}
 \end{figure}
@@ -878,7 +878,7 @@ a (\mathbf{s},\mathbf{h}) = \left\{ \begin{array}{ll}
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Query-model-corresponding-to-traditional-query-model-vs-attention-mechanism}
+\input{./Chapter6/Figures/figure-query-model-corresponding-to-traditional-query-model-vs-attention-mechanism}
 \caption{传统查询模型(a)和注意力机制所对应的查询模型(b)}
 \label{fig:6-26}
 \end{figure}
@@ -898,7 +898,7 @@ a (\mathbf{s},\mathbf{h}) = \left\{ \begin{array}{ll}
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Query-model-corresponding-to-attention-mechanism}
+\input{./Chapter6/Figures/figure-query-model-corresponding-to-attention-mechanism}
 \caption{注意力机制所对应的查询模型}
 \label{fig:6-27}
 \end{figure}
@@ -1012,7 +1012,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Relationship-between-learning-rate-and-number-of-updates}
+\input{./Chapter6/Figures/figure-relationship-between-learning-rate-and-number-of-updates}
 \caption{学习率与更新次数的变化关系}
 \label{fig:6-29}
 \end{figure}
@@ -1054,7 +1054,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Data-parallel-process}
+\input{./Chapter6/Figures/figure-data-parallel-process}
 \caption{数据并行过程}
 \label{fig:6-30}
 \end{figure}
@@ -1112,7 +1112,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Decoding-process-based-on-greedy-method}
+\input{./Chapter6/Figures/figure-decoding-process-based-on-greedy-method}
 \caption{基于贪婪方法的解码过程}
 \label{fig:6-32}
 \end{figure}
@@ -1124,7 +1124,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Decode-the-word-probability-distribution-at-the-first-position}
+\input{./Chapter6/Figures/figure-decode-the-word-probability-distribution-at-the-first-position}
 \caption{解码第一个位置输出的单词概率分布(``Have''的概率最高)}
 \label{fig:6-33}
 \end{figure}
@@ -1147,7 +1147,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Beam-search-process}
+\input{./Chapter6/Figures/figure-beam-search-process}
 \caption{束搜索过程}
 \label{fig:6-34}
 \end{figure}
@@ -1285,7 +1285,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Dependencies-between-words-in-a-recurrent-neural-network}
+\input{./Chapter6/Figures/figure-dependencies-between-words-in-a-recurrent-neural-network}
 \caption{循环神经网络中单词之间的依赖关系}
 \label{fig:6-36}
 \end{figure}
@@ -1297,7 +1297,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Dependencies-between-words-of-Attention}
+\input{./Chapter6/Figures/figure-dependencies-between-words-of-attention}
 \caption{自注意力机制中单词之间的依赖关系}
 \label{fig:6-37}
 \end{figure}
@@ -1309,7 +1309,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Example-of-self-attention-mechanism-calculation}
+\input{./Chapter6/Figures/figure-example-of-self-attention-mechanism-calculation}
 \caption{自注意力计算实例}
 \label{fig:6-38}
 \end{figure}
@@ -1383,7 +1383,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Calculation-of-context-vector-C}
+\input{./Chapter6/Figures/figure-calculation-of-context-vector-C}
 \caption{上下文向量$\mathbf{C}$的计算}
 \label{fig:6-41}
 \end{figure}
@@ -1418,7 +1418,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-A-combination-of-position-encoding-and-word-encoding}
+\input{./Chapter6/Figures/figure-a-combination-of-position-encoding-and-word-encoding}
 \caption{位置编码与词编码的组合}
 \label{fig:6-43}
 \end{figure}
@@ -1448,7 +1448,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Position-of-self-attention-mechanism-in-the-model}
+\input{./Chapter6/Figures/figure-position-of-self-attention-mechanism-in-the-model}
 \caption{自注意力机制在模型中的位置}
 \label{fig:6-44}
 \end{figure}
@@ -1479,7 +1479,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Point-product-attention-model}
+\input{./Chapter6/Figures/figure-point-product-attention-model}
 \caption{点乘注意力力模型}
 \label{fig:6-45}
 \end{figure}
@@ -1511,7 +1511,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Mask-instance-for-future-positions-in-Transformer}
+\input{./Chapter6/Figures/figure-mask-instance-for-future-positions-in-transformer}
 \caption{Transformer中对于未来位置进行的屏蔽的Mask实例}
 \label{fig:6-47}
 \end{figure}
@@ -1535,7 +1535,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Multi-Head-Attention-Model}
+\input{./Chapter6/Figures/figure-multi-head-attention-model}
 \caption{多头注意力模型}
 \label{fig:6-48}
 \end{figure}
@@ -1560,7 +1560,7 @@ L(\mathbf{Y},\widehat{\mathbf{Y}}) = \sum_{j=1}^n L_{\textrm{ce}}(\mathbf{y}_j,\
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Residual-network-structure}
+\input{./Chapter6/Figures/figure-residual-network-structure}
 \caption{残差网络结构}
 \label{fig:6-49}
 \end{figure}
@@ -1579,7 +1579,7 @@ x_{l+1} = x_l + \digamma (x_l)
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Position-of-difference-and-layer-regularization-in-the-model}
+\input{./Chapter6/Figures/figure-position-of-difference-and-layer-regularization-in-the-model}
 \caption{残差和层正则化在模型中的位置}
 \label{fig:6-50}
 \end{figure}
@@ -1600,7 +1600,7 @@ x_{l+1} = x_l + \digamma (x_l)
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Different-regularization-methods}
+\input{./Chapter6/Figures/figure-different-regularization-methods}
 \caption{不同正则化方式}
 \label{fig:6-51}
 \end{figure}
@@ -1613,7 +1613,7 @@ x_{l+1} = x_l + \digamma (x_l)
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Position-of-feedforward-neural-network-in-the-model}
+\input{./Chapter6/Figures/figure-position-of-feedforward-neural-network-in-the-model}
 \caption{前馈神经网络在模型中的位置}
 \label{fig:6-52}
 \end{figure}
@@ -1636,7 +1636,7 @@ x_{l+1} = x_l + \digamma (x_l)
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Structure-of-the-network-during-Transformer-training}
+\input{./Chapter6/Figures/figure-structure-of-the-network-during-transformer-training}
 \caption{Transformer训练时网络的结构}
 \label{fig:6-53}
 \end{figure}
@@ -1676,7 +1676,7 @@ lrate = d_{model}^{-0.5} \cdot \textrm{min} (step^{-0.5} , step \cdot warmup\_st
 % 图3.10
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Comparison-of-the-number-of-padding-in-batch}
+\input{./Chapter6/Figures/figure-comparison-of-the-number-of-padding-in-batch}
 \caption{batch中padding数量对比(白色部分为padding)}
 \label{fig:6-55}
 \end{figure}
@@ -1752,7 +1752,7 @@ Transformer Deep(48层) & 30.2 & 43.1 & 194$\times 10^{6}$
 % 图3.6.1
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Generate-summary}
+\input{./Chapter6/Figures/figure-generate-summary}
 \caption{文本自动摘要实例}
 \label{fig:6-57}
 \end{figure}
@@ -1764,7 +1764,7 @@ Transformer Deep(48层) & 30.2 & 43.1 & 194$\times 10^{6}$
 % 图3.6.1
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Example-of-automatic-translation-of-classical-Chinese}
+\input{./Chapter6/Figures/figure-example-of-automatic-translation-of-classical-chinese}
 \caption{文言文自动翻译实例}
 \label{fig:6-58}
 \end{figure}
@@ -1780,7 +1780,7 @@ Transformer Deep(48层) & 30.2 & 43.1 & 194$\times 10^{6}$
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Automatically-generate-instances-of-couplets}
+\input{./Chapter6/Figures/figure-automatically-generate-instances-of-couplets}
 \caption{对联自动生成实例(人工给定上联)}
 \label{fig:6-59}
 \end{figure}
@@ -1796,7 +1796,7 @@ Transformer Deep(48层) & 30.2 & 43.1 & 194$\times 10^{6}$
 \begin{figure}[htp]
 \centering
-\input{./Chapter6/Figures/figure-Automatic-generation-of-ancient-poems-based-on-encoder-decoder-framework}
+\input{./Chapter6/Figures/figure-automatic-generation-of-ancient-poems-based-on-encoder-decoder-framework}
 \caption{基于编码器-解码器框架的古诗自动生成}
 \label{fig:6-60}
 \end{figure}
Book/Chapter7/Chapter7.tex

@@ -90,7 +90,7 @@
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/figure-construction-steps-of-MT-system}
+\input{./Chapter7/Figures/figure-construction-steps-of-mt-system}
 \caption{构建神经机器翻译系统的主要步骤}
 \label{fig:7-2}
 \end{figure}
@@ -417,7 +417,7 @@ y = f(x)
 % 图7.
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/figure-Underfitting-vs-Overfitting}
+\input{./Chapter7/Figures/figure-underfitting-vs-overfitting}
 \caption{欠拟合 vs 过拟合}
 \label{fig:7-11}
 \end{figure}
@@ -1191,7 +1191,7 @@ b &=& \omega_{\textrm{high}}\cdot |\mathbf{x}|
 % 图7.5.1
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/Post-Norm-vs-Pre-Norm}
+\input{./Chapter7/Figures/figure-post-norm-vs-pre-norm}
 \caption{Post-Norm Transformer vs Pre-Norm Transformer}
 \label{fig:7-28}
 \end{figure}
@@ -1273,7 +1273,7 @@ $g_l$会作为输入的一部分送入第$l+1$层。其网络的结构图\ref{fi
 % 图7.5.2
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/dynamic-linear-aggregation-network-structure}
+\input{./Chapter7/Figures/figure-dynamic-linear-aggregation-network-structure}
 \caption{动态线性层聚合网络结构图}
 \label{fig:7-29}
 \end{figure}
@@ -1299,7 +1299,7 @@ $g_l$会作为输入的一部分送入第$l+1$层。其网络的结构图\ref{fi
 % 图7.5.3
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/progressive-training}
+\input{./Chapter7/Figures/figure-progressive-training}
 \caption{渐进式深层网络训练过程}
 \label{fig:7-30}
 \end{figure}
@@ -1316,7 +1316,7 @@ $g_l$会作为输入的一部分送入第$l+1$层。其网络的结构图\ref{fi
 % 图7.5.4
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/sparse-connections-between-different-groups}
+\input{./Chapter7/Figures/figure-sparse-connections-between-different-groups}
 \caption{不同组之间的稀疏连接}
 \label{fig:7-31}
 \end{figure}
@@ -1335,7 +1335,7 @@ $g_l$会作为输入的一部分送入第$l+1$层。其网络的结构图\ref{fi
 % 图7.5.5
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/learning-rate}
+\input{./Chapter7/Figures/figure-learning-rate}
 \caption{学习率重置vs从头训练的学习率曲线}
 \label{fig:7-32}
 \end{figure}
@@ -1411,7 +1411,7 @@ p_l=\frac{l}{2L}\cdot \varphi
 % 图7.5.7
 \begin{figure}[htp]
 \centering
-\input{./Chapter7/Figures/expanded-residual-network}
+\input{./Chapter7/Figures/figure-expanded-residual-network}
 \caption{Layer Dropout中残差网络的展开图}
 \label{fig:7-34}
 \end{figure}
Book/Chapter7/Figures/dynamic-linear-aggregation-network-structure.tex → Book/Chapter7/Figures/figure-dynamic-linear-aggregation-network-structure.tex (file moved)

Book/Chapter7/Figures/expanded-residual-network.tex → Book/Chapter7/Figures/figure-expanded-residual-network.tex (file moved)

Book/Chapter7/Figures/learning-rate.tex → Book/Chapter7/Figures/figure-learning-rate.tex (file moved)

Book/Chapter7/Figures/Post-Norm-vs-Pre-Norm.tex → Book/Chapter7/Figures/figure-post-norm-vs-pre-norm.tex (file moved)

Book/Chapter7/Figures/progressive-training.tex → Book/Chapter7/Figures/figure-progressive-training.tex (file moved)

Book/Chapter7/Figures/sparse-connections-between-different-groups.tex → Book/Chapter7/Figures/figure-sparse-connections-between-different-groups.tex (file moved)
Book/mt-book-xelatex.tex

@@ -122,13 +122,13 @@
 % CHAPTERS
 %----------------------------------------------------------------------------------------
-\include{Chapter1/chapter1}
+%\include{Chapter1/chapter1}
 %\include{Chapter2/chapter2}
 %\include{Chapter3/chapter3}
 %\include{Chapter4/chapter4}
 %\include{Chapter5/chapter5}
 %\include{Chapter6/chapter6}
-%\include{Chapter7/chapter7}
+\include{Chapter7/chapter7}
 %\include{ChapterAppend/chapterappend}