NiuTrans / Toy-MT-Introduction / Commits
Commit f9b5b4f4, authored Apr 14, 2020 by xiaotong

updates of section 3

parent 19badf54
Showing 5 changed files with 119 additions and 781 deletions.
Book/Chapter3/Chapter3.tex                 +25 −16
Book/mt-book-xelatex.idx                   +41 −227
Book/mt-book-xelatex.ptc                   +46 −531
Book/mt-book-xelatex.tex                   +6 −6
Section03-Word-Based-Models/section03.tex  +1 −1
Book/Chapter3/Chapter3.tex
Book/Chapter3/Chapter3.tex
View file @ f9b5b4f4
...
@@ -800,11 +800,11 @@ L(f,\lambda)=\frac{\varepsilon}{(l+1)^m}\prod_{j=1}^{m}\sum_{i=0}^{l}{f(s_j|t_i)
\end{figure}
%---------------------------
-\noindent 因为$L(f,\lambda)$是可微分函数,因此可以通过计算$L(f,\lambda)$导数为零的点得到极值点。因为这个模型里仅有$f(s_x|t_y)$一种类型的参数,只需要如下导数进行计算。
+\noindent 因为$L(f,\lambda)$是可微分函数,因此可以通过计算$L(f,\lambda)$导数为零的点得到极值点。因为这个模型里仅有$f(s_x|t_y)$一种类型的参数,只需要对如下导数进行计算。
\begin{eqnarray}
\frac{\partial L(f,\lambda)}{\partial f(s_u|t_v)} &=& \frac{\partial \big[ \frac{\varepsilon}{(l+1)^{m}} \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \big]}{\partial f(s_u|t_v)} - \nonumber \\
& & \frac{\partial \big[ \sum_{t_y} \lambda_{t_y} (\sum_{s_x} f(s_x|t_y) -1) \big]}{\partial f(s_u|t_v)} \nonumber \\
-&=& \frac{\varepsilon}{(l+1)^{m}} \cdot \frac{\partial \big[ \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_{a_j}) \big]}{\partial f(s_u|t_v)} - \lambda_{t_v}
+&=& \frac{\varepsilon}{(l+1)^{m}} \cdot \frac{\partial \big[ \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \big]}{\partial f(s_u|t_v)} - \lambda_{t_v}
\label{eqC3.34-new}
\end{eqnarray}
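A sketch of the step implicit between eqC3.34-new and eqC3.39-new, using only the definitions already in this hunk: differentiating the product term with the product rule gives

\begin{eqnarray}
\frac{\partial \big[ \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \big]}{\partial f(s_u|t_v)}
&=& \sum_{j=1}^{m} \delta(s_j,s_u) \Big( \sum_{i=0}^{l} \delta(t_i,t_v) \Big) \prod_{j' \neq j} \sum_{i=0}^{l} f(s_{j'}|t_i) \nonumber
\end{eqnarray}

\noindent Multiplying and dividing each surviving term by $\sum_{i=0}^{l} f(s_u|t_i)$ recovers the $\delta$-sum form that appears in eqC3.39-new and eqC3.40-new below.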
...
@@ -844,13 +844,13 @@ f(s_u|t_v) = \frac{\lambda_{t_v}^{-1} \varepsilon}{(l+1)^{m}} \cdot \frac{\sum\l
\label{eqC3.39-new}
\end{eqnarray}
-\noindent\hspace{2em}将上式稍作调整得到下式,可以看出,这不是一个计算$f(s_u|t_v)$的解析式,因为等式右端仍含有$f(s_u|t_v)$。
+\noindent\hspace{2em}将上式稍作调整得到下式:
\begin{eqnarray}
f(s_u|t_v) = \lambda_{t_v}^{-1} \frac{\varepsilon}{(l+1)^{m}} \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \sum\limits_{j=1}^{m} \delta(s_j,s_u) \sum\limits_{i=0}^{l} \delta(t_i,t_v) \frac{f(s_u|t_v)}{\sum\limits_{i=0}^{l} f(s_u|t_i)}
\label{eqC3.40-new}
\end{eqnarray}
-\noindent\hspace{2em}通过采用一个非常经典的{\small\sffamily\bfseries{期望最大化}}(Expectation Maximization)方法,简称EM方法(或算法),仍可以利用上式迭代地计算$f(s_u|t_v)$,使其最终收敛到最优值。该方法的思想是:用当前的参数,求一个似然函数的期望,之后最大化这个期望同时得到新的一组参数的值。对于IBM模型来说,其迭代过程就是反复使用公式1.39,具体如下图。
+\noindent\hspace{2em}可以看出,这不是一个计算$f(s_u|t_v)$的解析式,因为等式右端仍含有$f(s_u|t_v)$。不过它蕴含着一种非常经典的方法\ $\dash$\ {\small\sffamily\bfseries{期望最大化}}(Expectation Maximization)方法,简称EM方法(或算法)。使用EM方法可以利用上式迭代地计算$f(s_u|t_v)$,使其最终收敛到最优值。EM方法的思想是:用当前的参数,求似然函数的期望,之后最大化这个期望同时得到新的一组参数的值。对于IBM模型来说,其迭代过程就是反复使用公式\ref{eqC3.40-new},具体如图\ref{fig:3-24}所示。
%----------------------------------------------
% 图3.28
\begin{figure}[htp]
...
@@ -861,25 +861,34 @@ f(s_u|t_v) = \lambda_{t_v}^{-1} \frac{\varepsilon}{(l+1)^{m}} \prod\limits_{j=1}
\end{figure}
%---------------------------
-\noindent\hspace{2em}为了化简$f(s_u|t_v)$的计算,在此对公式\ref{eqC3.40-new}进行了重新组织,见下图。红色部分表示翻译概率P$(\mathbf{s}|\mathbf{t})$;蓝色部分表示$(s_u,t_v)$在句对$(\mathbf{s},\mathbf{t})$中配对的总次数,即``$t_v$翻译为$s_u$''在所有对齐中出现的次数;绿色部分表示$f(s_u|t_v)$对于所有的$t_i$的相对值,即``$t_v$翻译为$s_u$''在所有对齐中出现的相对概率;蓝色与绿色部分相乘表示``$t_v$翻译为$s_u$''这个事件出现次数的期望的估计,称之为{\small\sffamily\bfseries{期望频次}}(expected count)。
+\noindent\hspace{2em}为了化简$f(s_u|t_v)$的计算,在此对公式\ref{eqC3.40-new}进行了重新组织,见图\ref{fig:3-25}。
%----------------------------------------------
% 图3.29
\begin{figure}[htp]
\centering
\input{./Chapter3/Figures/figure-a-more-detailed-explanation-of-formula-3.40}
-\caption{公式\ref{eqC3.40-new}的更详细解释}
+\caption{公式\ref{eqC3.40-new}的解释}
\label{fig:3-25}
\end{figure}
%---------------------------
-\noindent\hspace{2em}更具体的,期望频次是事件在其分布下出现的次数的期望。其计算公式为:$c_{\mathbb{E}}(X)=\sum_i c(x_i) \cdot \textrm{P}(x_i)$。其中$c(x_i)$表示$x_i$出现的次数,P$(x_i)$表示$x_i$出现的概率。图\ref{fig:3-26}展示了事件X的期望频次的详细计算过程。其中$x_1,x_2,x_3$分别表示事件X出现2次,1次和5次的情况。
+\noindent 其中,红色部分表示翻译概率P$(\mathbf{s}|\mathbf{t})$;蓝色部分表示$(s_u,t_v)$在句对$(\mathbf{s},\mathbf{t})$中配对的总次数,即``$t_v$翻译为$s_u$''在所有对齐中出现的次数;绿色部分表示$f(s_u|t_v)$对于所有的$t_i$的相对值,即``$t_v$翻译为$s_u$''在所有对齐中出现的相对概率;蓝色与绿色部分相乘表示``$t_v$翻译为$s_u$''这个事件出现次数的期望的估计,称之为{\small\sffamily\bfseries{期望频次}}(Expected Count)。
+\noindent\hspace{2em}期望频次是事件在其分布下出现次数的期望。另$c_{\mathbb{E}}(X)$为事件$X$的期望频次,其计算公式为:
+\begin{equation}
+c_{\mathbb{E}}(X)=\sum_i c(x_i) \cdot \textrm{P}(x_i)
+\end{equation}
+\noindent 其中$c(x_i)$表示$X$取$x_i$时出现的次数,P$(x_i)$表示$X=x_i$出现的概率。图\ref{fig:3-26}展示了事件$X$的期望频次的详细计算过程。其中$x_1$、$x_2$和$x_3$分别表示事件$X$出现2次、1次和5次的情况。
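A small worked instance of this formula (the probabilities below are assumed for illustration, not taken from the figure): if $X$ takes $x_1$ (observed 2 times) with P$(x_1)=0.1$, $x_2$ (1 time) with P$(x_2)=0.3$, and $x_3$ (5 times) with P$(x_3)=0.6$, then

\begin{equation*}
c_{\mathbb{E}}(X) = 2 \cdot 0.1 + 1 \cdot 0.3 + 5 \cdot 0.6 = 3.5
\end{equation*}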
%----------------------------------------------
% 图1.26
\begin{figure}[htp]
\centering
\subfigure{\input{./Chapter3/Figures/figure-calculation-of-the-expected-frequency-1}}
\subfigure{\input{./Chapter3/Figures/figure-calculation-of-the-expected-frequency-2}}
-\caption{期望频次的详细计算过程}
+\caption{频次(左)和期望频次(右)的计算过程}
\label{fig:3-26}
\end{figure}
%---------------------------
...
@@ -909,7 +918,7 @@ f(s_u|t_v) &= &\lambda_{t_v}^{-1} \cdot \textrm{P}(\mathbf{s}| \mathbf{t}) \cdot
\label{eqC3.44-new}
\end{eqnarray}
-\noindent\hspace{2em}为了满足$f(\cdot|\cdot)$的概率归一化约束,易知$\lambda_{t_v}^{'}$的计算为:
+\noindent\hspace{2em}为了满足$f(\cdot|\cdot)$的概率归一化约束,易知$\lambda_{t_v}^{'}$为:
\begin{eqnarray}
\lambda_{t_v}^{'} = \sum\limits_{s_u} c_{\mathbb{E}}(s_u|t_v;\mathbf{s},\mathbf{t})
\label{eqC3.45-new}
...
@@ -921,14 +930,14 @@ f(s_u|t_v)=\frac{c_{\mathbb{E}}(s_u|t_v;\mathbf{s},\mathbf{t})} { \sum\limits_{
\label{eqC3.46-new}
\end{eqnarray}
-\noindent\hspace{2em}总的来说,假设拥有的$N$个互译的句对(称作平行语料):${(\mathbf{s}^{[1]},\mathbf{t}^{[1]}),(\mathbf{s}^{[2]},\mathbf{t}^{[2]}),...,(\mathbf{s}^{[N]},\mathbf{t}^{[N]})}$来说,$f(s_u|t_v)$的期望频次为:
+\noindent\hspace{2em}进一步,假设有$N$个互译的句对(称作平行语料):$\{(\mathbf{s}^{[1]},\mathbf{t}^{[1]}),...,(\mathbf{s}^{[N]},\mathbf{t}^{[N]})\}$来说,$f(s_u|t_v)$的期望频次为:
\begin{eqnarray}
c_{\mathbb{E}}(s_u|t_v)=\sum\limits_{i=1}^{N} c_{\mathbb{E}}(s_u|t_v;s^{[i]},t^{[i]})
\label{eqC3.47-new}
\end{eqnarray}
-\noindent\hspace{2em}于是有$f(s_u|t_v)$的计算公式和迭代过程如下:
+\noindent\hspace{2em}于是有$f(s_u|t_v)$的计算公式和迭代过程图\ref{fig:3-27}所示。完整的EM算法如图\ref{fig:3-28}所示。其中E-Step对应4-5行,目的是计算$c_{\mathbb{E}}(\cdot)$;M-Step对应6-9行,目的是计算$f(\cdot)$。
%----------------------------------------------
% 图3.30
\begin{figure}[htp]
...
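To make the E-Step/M-Step loop concrete, the following is a minimal runnable sketch of IBM Model 1 training with EM, mirroring the updates in eqC3.40-new and eqC3.45/3.46-new; the toy corpus, initialization, and iteration count are illustrative assumptions rather than code from this repository:

from collections import defaultdict

# Toy parallel corpus: (source sentence, target sentence) pairs.
# NULL on the target side models source words aligned to nothing.
corpus = [("la maison".split(), "NULL the house".split()),
          ("la fleur".split(), "NULL the flower".split())]

# Uniform initialization of f(s|t).
src_vocab = {s for src, _ in corpus for s in src}
f = defaultdict(lambda: 1.0 / len(src_vocab))

for _ in range(10):
    count = defaultdict(float)   # expected counts c_E(s_u|t_v)
    total = defaultdict(float)   # lambda'_{t_v} = sum_u c_E(s_u|t_v)
    # E-Step: collect fractional counts under the current f.
    for src, tgt in corpus:
        for s in src:
            denom = sum(f[(s, t)] for t in tgt)   # sum_i f(s|t_i)
            for t in tgt:
                c = f[(s, t)] / denom             # fractional count
                count[(s, t)] += c
                total[t] += c
    # M-Step: renormalize the counts into probabilities.
    for (s, t), c in count.items():
        f[(s, t)] = c / total[t]

print(round(f[("maison", "house")], 3))  # converges toward 1.0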
@@ -939,7 +948,6 @@ c_{\mathbb{E}}(s_u|t_v)=\sum\limits_{i=1}^{N} c_{\mathbb{E}}(s_u|t_v;s^{[i]},t^
\end{figure}
%---------------------------
-\noindent\hspace{2em}完整的EM算法如下图所示。其中E-Step对应4-5行,目的是计算$c_{\mathbb{E}}(\cdot)$;M-Step对应6-9行,目的是计算$f(\cdot)$。
%----------------------------------------------
% 图3.31
\begin{figure}[htp]
...
@@ -953,7 +961,7 @@ c_{\mathbb{E}}(s_u|t_v)=\sum\limits_{i=1}^{N} c_{\mathbb{E}}(s_u|t_v;s^{[i]},t^
\noindent\hspace{2em}同样的,EM算法可以直接用于训练IBM模型2。对于句对$(\mathbf{s},\mathbf{t})$,$m=|\mathbf{s}|$,$l=|\mathbf{t}|$,E-Step的计算公式如下,其中参数$f(s_j|t_i)$与IBM模型1一样:
\begin{eqnarray}
-c_{\mathbb{E}}(s_u|t_v;\mathbf{s},\mathbf{t}) &=& \sum\limits_{j=1}^{m} \sum\limits_{i=0}^{l} \frac{f(s_u|t_v)a(i|j,m,l)\delta(s_j,s_u)\delta(t_i,t_v)}{\sum_{k=0}^{l} f(s_u|t_v)a(k|j,m,l)} \\
+c_{\mathbb{E}}(s_u|t_v;\mathbf{s},\mathbf{t}) &=& \sum\limits_{j=1}^{m} \sum\limits_{i=0}^{l} \frac{f(s_u|t_v)a(i|j,m,l)\delta(s_j,s_u)\delta(t_i,t_v)}{\sum_{k=0}^{l} f(s_u|t_k)a(k|j,m,l)} \\
c_{\mathbb{E}}(i|j,m,l;\mathbf{s},\mathbf{t}) &=& \frac{f(s_j|t_i)a(i|j,m,l)}{\sum_{k=0}^{l} f(s_j|t_k)a(k,j,m,l)}
\label{eqC3.49-new}
\end{eqnarray}
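Under the same caveat as the earlier sketch, here is how the Model 2 E-Step differs from Model 1 in code: each fractional count is weighted by the alignment probability a(i|j,m,l), and the denominator sums f(s_u|t_k)a(k|j,m,l) over target positions, matching the corrected denominator above (the dictionary layout of the a table is an assumed data structure):

# Model 2 E-Step for one sentence pair (src, tgt).
def estep_model2(src, tgt, f, a, count, total):
    # tgt[0] is the NULL word, so target positions run i = 0..l
    # with l = len(tgt) - 1; source positions run j = 1..m.
    m, l = len(src), len(tgt) - 1
    for j, s in enumerate(src, start=1):
        # denominator: sum_k f(s|t_k) * a(k|j,m,l)
        denom = sum(f[(s, tgt[i])] * a[(i, j, m, l)] for i in range(l + 1))
        for i, t in enumerate(tgt):
            c = f[(s, t)] * a[(i, j, m, l)] / denom   # fractional count
            count[(s, t)] += c
            total[t] += c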
...
@@ -971,9 +979,10 @@ a(i|j,m,l) &=\frac{\sum_{k=0}^{K}c_{\mathbb{E}}(i|j;\mathbf{s}^{[k]},\mathbf{t}^
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{基于产出率的翻译模型}\index{Chapter3.5.1}
-\parinterval 从前面的介绍可知,IBM模型1和模型2都把不同的源文单词都看作相互独立的单元来进行词对齐和翻译。换句话说,即使源语中的某个短语中的两个单词都对齐到同一个目标语单词,它们之间也是相互独立的。这样模型1和模型2对于多个源语单词对齐到同一个目标语单词的情况并不能很好的描述。
+\parinterval 从前面的介绍可知,IBM模型1和模型2把不同的源语言单词看作相互独立的单元来进行词对齐和翻译。换句话说,即使某个源语言短语中的两个单词都对齐到同一个目标语单词,它们之间也是相互独立的。这样模型1和模型2对于多个源语言单词对齐到同一个目标语单词的情况并不能很好的进行描述。
-\parinterval 这里将会给出另一个翻译模型,能在一定程度上解决上面提到的问题。该模型把译文生成源文的过程分解为如下几个步骤:首先,确定每个目标语言单词生成源语言单词的个数,这里把它称为{\small\sffamily\bfseries{产出率}}或{\small\sffamily\bfseries{繁衍率}}(Fertility);其次,决定译文中每个单词生成的源语言单词都是什么,即决定生成的第一个源语言单词是什么,生成的第二个源语言单词是什么,以此类推。这样每个目标语单词就对应了一个源语言单词列表;最后把各组源语言单词列表中的每个单词都放置到合适的位置上,完成目标语言译文到源语言句子的生成。
+\parinterval 这里将会给出另一个翻译模型,能在一定程度上解决上面提到的问题。该模型把目标语言译文生成源文的过程分解为如下几个步骤:首先,确定每个目标语言单词生成源语言单词的个数,这里把它称为{\small\sffamily\bfseries{产出率}}或{\small\sffamily\bfseries{繁衍率}}(fertility);其次,决定译文中每个单词生成的源语单词都是什么,即决定生成的第一个源语单词是什么,生成的第二个源语单词是什么,以此类推。这样每个目标语单词就对应了一个源语单词列表;最后把各组源语单词列表中的每个单词都放置到源语句子的某个位置上,完成目标语言译文到源语言句子的生成。
%----------------------------------------------
% 图3.5.1
\begin{figure}[htp]
...
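The three-step generative story described in this hunk (fertility, then word generation, then placement) can be sketched in a few lines; the words, fertility values, and identity placement below are invented for illustration and stand in for the model's learned distributions:

# Step 1: fertility — how many source words each target word generates.
fertility = {"NULL": 0, "science": 1, "and": 1, "technology": 2}

# Step 2: each target word generates its own list of source words.
translations = {"science": ["科学"], "and": ["与"], "technology": ["技", "术"]}

target = ["NULL", "science", "and", "technology"]
word_lists = [translations.get(t, [])[: fertility[t]] for t in target]

# Step 3: place every generated word into a source-sentence position
# (identity placement here; the real model learns a placement distribution).
source = [w for words in word_lists for w in words]
print(source)  # ['科学', '与', '技', '术']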
Book/mt-book-xelatex.idx
View file @ f9b5b4f4
-\indexentry{Chapter1.1|hyperpage}{13}
+\indexentry{Chapter3.1|hyperpage}{9}
-\indexentry{Chapter1.2|hyperpage}{16}
+\indexentry{Chapter3.2|hyperpage}{11}
-\indexentry{Chapter1.3|hyperpage}{21}
+\indexentry{Chapter3.2.1|hyperpage}{11}
-\indexentry{Chapter1.4|hyperpage}{22}
+\indexentry{Chapter3.2.1.1|hyperpage}{11}
-\indexentry{Chapter1.4.1|hyperpage}{22}
+\indexentry{Chapter3.2.1.2|hyperpage}{12}
-\indexentry{Chapter1.4.2|hyperpage}{24}
+\indexentry{Chapter3.2.1.3|hyperpage}{13}
-\indexentry{Chapter1.4.3|hyperpage}{25}
+\indexentry{Chapter3.2.2|hyperpage}{13}
-\indexentry{Chapter1.4.4|hyperpage}{26}
+\indexentry{Chapter3.2.3|hyperpage}{14}
-\indexentry{Chapter1.4.5|hyperpage}{27}
+\indexentry{Chapter3.2.3.1|hyperpage}{14}
-\indexentry{Chapter1.5|hyperpage}{28}
+\indexentry{Chapter3.2.3.2|hyperpage}{14}
-\indexentry{Chapter1.5.1|hyperpage}{28}
+\indexentry{Chapter3.2.3.3|hyperpage}{16}
-\indexentry{Chapter1.5.2|hyperpage}{29}
+\indexentry{Chapter3.2.4|hyperpage}{17}
-\indexentry{Chapter1.5.2.1|hyperpage}{29}
+\indexentry{Chapter3.2.4.1|hyperpage}{17}
-\indexentry{Chapter1.5.2.2|hyperpage}{31}
+\indexentry{Chapter3.2.4.2|hyperpage}{19}
-\indexentry{Chapter1.5.2.3|hyperpage}{31}
+\indexentry{Chapter3.2.5|hyperpage}{21}
-\indexentry{Chapter1.6|hyperpage}{32}
+\indexentry{Chapter3.3|hyperpage}{24}
-\indexentry{Chapter1.7|hyperpage}{34}
+\indexentry{Chapter3.3.1|hyperpage}{24}
-\indexentry{Chapter1.7.1|hyperpage}{34}
+\indexentry{Chapter3.3.2|hyperpage}{26}
-\indexentry{Chapter1.7.1.1|hyperpage}{34}
+\indexentry{Chapter3.3.2.1|hyperpage}{27}
-\indexentry{Chapter1.7.1.2|hyperpage}{36}
+\indexentry{Chapter3.3.2.2|hyperpage}{27}
-\indexentry{Chapter1.7.2|hyperpage}{38}
+\indexentry{Chapter3.3.2.3|hyperpage}{29}
-\indexentry{Chapter1.8|hyperpage}{40}
+\indexentry{Chapter3.4|hyperpage}{30}
-\indexentry{Chapter2.1|hyperpage}{46}
+\indexentry{Chapter3.4.1|hyperpage}{30}
-\indexentry{Chapter2.2|hyperpage}{47}
+\indexentry{Chapter3.4.2|hyperpage}{32}
-\indexentry{Chapter2.2.1|hyperpage}{47}
+\indexentry{Chapter3.4.3|hyperpage}{33}
-\indexentry{Chapter2.2.2|hyperpage}{49}
+\indexentry{Chapter3.4.4|hyperpage}{34}
-\indexentry{Chapter2.2.3|hyperpage}{50}
+\indexentry{Chapter3.4.4.1|hyperpage}{34}
-\indexentry{Chapter2.2.4|hyperpage}{51}
+\indexentry{Chapter3.4.4.2|hyperpage}{35}
-\indexentry{Chapter2.2.5|hyperpage}{53}
+\indexentry{Chapter3.5|hyperpage}{41}
-\indexentry{Chapter2.2.5.1|hyperpage}{53}
+\indexentry{Chapter3.5.1|hyperpage}{41}
-\indexentry{Chapter2.2.5.2|hyperpage}{54}
+\indexentry{Chapter3.5.2|hyperpage}{44}
-\indexentry{Chapter2.2.5.3|hyperpage}{54}
+\indexentry{Chapter3.5.3|hyperpage}{45}
-\indexentry{Chapter2.3|hyperpage}{55}
+\indexentry{Chapter3.5.4|hyperpage}{47}
-\indexentry{Chapter2.3.1|hyperpage}{56}
+\indexentry{Chapter3.5.5|hyperpage}{48}
-\indexentry{Chapter2.3.2|hyperpage}{57}
+\indexentry{Chapter3.5.5|hyperpage}{51}
-\indexentry{Chapter2.3.2.1|hyperpage}{57}
+\indexentry{Chapter3.6|hyperpage}{51}
-\indexentry{Chapter2.3.2.2|hyperpage}{58}
+\indexentry{Chapter3.6.1|hyperpage}{51}
-\indexentry{Chapter2.3.2.3|hyperpage}{60}
+\indexentry{Chapter3.6.2|hyperpage}{52}
-\indexentry{Chapter2.4|hyperpage}{62}
+\indexentry{Chapter3.6.4|hyperpage}{53}
-\indexentry{Chapter2.4.1|hyperpage}{63}
+\indexentry{Chapter3.6.5|hyperpage}{54}
-\indexentry{Chapter2.4.2|hyperpage}{65}
+\indexentry{Chapter3.7|hyperpage}{54}
-\indexentry{Chapter2.4.2.1|hyperpage}{66}
-\indexentry{Chapter2.4.2.2|hyperpage}{67}
-\indexentry{Chapter2.4.2.3|hyperpage}{68}
-\indexentry{Chapter2.5|hyperpage}{70}
-\indexentry{Chapter2.5.1|hyperpage}{70}
-\indexentry{Chapter2.5.2|hyperpage}{72}
-\indexentry{Chapter2.5.3|hyperpage}{76}
-\indexentry{Chapter2.6|hyperpage}{78}
-\indexentry{Chapter3.1|hyperpage}{83}
-\indexentry{Chapter3.2|hyperpage}{85}
-\indexentry{Chapter3.2.1|hyperpage}{85}
-\indexentry{Chapter3.2.1.1|hyperpage}{85}
-\indexentry{Chapter3.2.1.2|hyperpage}{86}
-\indexentry{Chapter3.2.1.3|hyperpage}{87}
-\indexentry{Chapter3.2.2|hyperpage}{87}
-\indexentry{Chapter3.2.3|hyperpage}{88}
-\indexentry{Chapter3.2.3.1|hyperpage}{88}
-\indexentry{Chapter3.2.3.2|hyperpage}{88}
-\indexentry{Chapter3.2.3.3|hyperpage}{90}
-\indexentry{Chapter3.2.4|hyperpage}{91}
-\indexentry{Chapter3.2.4.1|hyperpage}{91}
-\indexentry{Chapter3.2.4.2|hyperpage}{93}
-\indexentry{Chapter3.2.5|hyperpage}{95}
-\indexentry{Chapter3.3|hyperpage}{98}
-\indexentry{Chapter3.3.1|hyperpage}{98}
-\indexentry{Chapter3.3.2|hyperpage}{100}
-\indexentry{Chapter3.3.2.1|hyperpage}{101}
-\indexentry{Chapter3.3.2.2|hyperpage}{101}
-\indexentry{Chapter3.3.2.3|hyperpage}{103}
-\indexentry{Chapter3.4|hyperpage}{104}
-\indexentry{Chapter3.4.1|hyperpage}{104}
-\indexentry{Chapter3.4.2|hyperpage}{106}
-\indexentry{Chapter3.4.3|hyperpage}{107}
-\indexentry{Chapter3.4.4|hyperpage}{108}
-\indexentry{Chapter3.4.4.1|hyperpage}{108}
-\indexentry{Chapter3.4.4.2|hyperpage}{109}
-\indexentry{Chapter3.5|hyperpage}{114}
-\indexentry{Chapter3.5.1|hyperpage}{115}
-\indexentry{Chapter3.5.2|hyperpage}{117}
-\indexentry{Chapter3.5.3|hyperpage}{119}
-\indexentry{Chapter3.5.4|hyperpage}{120}
-\indexentry{Chapter3.5.5|hyperpage}{122}
-\indexentry{Chapter3.5.5|hyperpage}{124}
-\indexentry{Chapter3.6|hyperpage}{124}
-\indexentry{Chapter3.6.1|hyperpage}{125}
-\indexentry{Chapter3.6.2|hyperpage}{125}
-\indexentry{Chapter3.6.4|hyperpage}{126}
-\indexentry{Chapter3.6.5|hyperpage}{127}
-\indexentry{Chapter3.7|hyperpage}{127}
-\indexentry{Chapter4.1|hyperpage}{129}
-\indexentry{Chapter4.1.1|hyperpage}{130}
-\indexentry{Chapter4.1.2|hyperpage}{132}
-\indexentry{Chapter4.2|hyperpage}{134}
-\indexentry{Chapter4.2.1|hyperpage}{134}
-\indexentry{Chapter4.2.2|hyperpage}{137}
-\indexentry{Chapter4.2.2.1|hyperpage}{137}
-\indexentry{Chapter4.2.2.2|hyperpage}{138}
-\indexentry{Chapter4.2.2.3|hyperpage}{139}
-\indexentry{Chapter4.2.3|hyperpage}{140}
-\indexentry{Chapter4.2.3.1|hyperpage}{140}
-\indexentry{Chapter4.2.3.2|hyperpage}{141}
-\indexentry{Chapter4.2.3.3|hyperpage}{142}
-\indexentry{Chapter4.2.4|hyperpage}{144}
-\indexentry{Chapter4.2.4.1|hyperpage}{144}
-\indexentry{Chapter4.2.4.2|hyperpage}{145}
-\indexentry{Chapter4.2.4.3|hyperpage}{146}
-\indexentry{Chapter4.2.5|hyperpage}{147}
-\indexentry{Chapter4.2.6|hyperpage}{147}
-\indexentry{Chapter4.2.7|hyperpage}{151}
-\indexentry{Chapter4.2.7.1|hyperpage}{152}
-\indexentry{Chapter4.2.7.2|hyperpage}{152}
-\indexentry{Chapter4.2.7.3|hyperpage}{153}
-\indexentry{Chapter4.2.7.4|hyperpage}{154}
-\indexentry{Chapter4.3|hyperpage}{155}
-\indexentry{Chapter4.3.1|hyperpage}{157}
-\indexentry{Chapter4.3.1.1|hyperpage}{158}
-\indexentry{Chapter4.3.1.2|hyperpage}{159}
-\indexentry{Chapter4.3.1.3|hyperpage}{160}
-\indexentry{Chapter4.3.1.4|hyperpage}{161}
-\indexentry{Chapter4.3.2|hyperpage}{161}
-\indexentry{Chapter4.3.3|hyperpage}{163}
-\indexentry{Chapter4.3.4|hyperpage}{164}
-\indexentry{Chapter4.3.5|hyperpage}{167}
-\indexentry{Chapter4.4|hyperpage}{170}
-\indexentry{Chapter4.4.1|hyperpage}{171}
-\indexentry{Chapter4.4.2|hyperpage}{174}
-\indexentry{Chapter4.4.2.1|hyperpage}{175}
-\indexentry{Chapter4.4.2.2|hyperpage}{176}
-\indexentry{Chapter4.4.2.3|hyperpage}{178}
-\indexentry{Chapter4.4.3|hyperpage}{179}
-\indexentry{Chapter4.4.3.1|hyperpage}{180}
-\indexentry{Chapter4.4.3.2|hyperpage}{184}
-\indexentry{Chapter4.4.3.3|hyperpage}{184}
-\indexentry{Chapter4.4.3.4|hyperpage}{185}
-\indexentry{Chapter4.4.3.5|hyperpage}{186}
-\indexentry{Chapter4.4.4|hyperpage}{187}
-\indexentry{Chapter4.4.4.1|hyperpage}{188}
-\indexentry{Chapter4.4.4.2|hyperpage}{189}
-\indexentry{Chapter4.4.5|hyperpage}{189}
-\indexentry{Chapter4.4.5|hyperpage}{192}
-\indexentry{Chapter4.4.7|hyperpage}{195}
-\indexentry{Chapter4.4.7.1|hyperpage}{195}
-\indexentry{Chapter4.4.7.2|hyperpage}{196}
-\indexentry{Chapter4.5|hyperpage}{198}
-\indexentry{Chapter5.1|hyperpage}{204}
-\indexentry{Chapter5.1.1|hyperpage}{204}
-\indexentry{Chapter5.1.1.1|hyperpage}{204}
-\indexentry{Chapter5.1.1.2|hyperpage}{205}
-\indexentry{Chapter5.1.1.3|hyperpage}{206}
-\indexentry{Chapter5.1.2|hyperpage}{207}
-\indexentry{Chapter5.1.2.1|hyperpage}{207}
-\indexentry{Chapter5.1.2.2|hyperpage}{208}
-\indexentry{Chapter5.2|hyperpage}{208}
-\indexentry{Chapter5.2.1|hyperpage}{208}
-\indexentry{Chapter5.2.1.1|hyperpage}{209}
-\indexentry{Chapter5.2.1.2|hyperpage}{210}
-\indexentry{Chapter5.2.1.3|hyperpage}{210}
-\indexentry{Chapter5.2.1.4|hyperpage}{211}
-\indexentry{Chapter5.2.1.5|hyperpage}{212}
-\indexentry{Chapter5.2.1.6|hyperpage}{213}
-\indexentry{Chapter5.2.2|hyperpage}{214}
-\indexentry{Chapter5.2.2.1|hyperpage}{214}
-\indexentry{Chapter5.2.2.2|hyperpage}{216}
-\indexentry{Chapter5.2.2.3|hyperpage}{216}
-\indexentry{Chapter5.2.2.4|hyperpage}{217}
-\indexentry{Chapter5.2.3|hyperpage}{218}
-\indexentry{Chapter5.2.3.1|hyperpage}{218}
-\indexentry{Chapter5.2.3.2|hyperpage}{220}
-\indexentry{Chapter5.2.4|hyperpage}{220}
-\indexentry{Chapter5.3|hyperpage}{225}
-\indexentry{Chapter5.3.1|hyperpage}{226}
-\indexentry{Chapter5.3.1.1|hyperpage}{226}
-\indexentry{Chapter5.3.1.2|hyperpage}{228}
-\indexentry{Chapter5.3.1.3|hyperpage}{229}
-\indexentry{Chapter5.3.2|hyperpage}{230}
-\indexentry{Chapter5.3.3|hyperpage}{230}
-\indexentry{Chapter5.3.4|hyperpage}{234}
-\indexentry{Chapter5.3.5|hyperpage}{235}
-\indexentry{Chapter5.4|hyperpage}{236}
-\indexentry{Chapter5.4.1|hyperpage}{237}
-\indexentry{Chapter5.4.2|hyperpage}{238}
-\indexentry{Chapter5.4.2.1|hyperpage}{239}
-\indexentry{Chapter5.4.2.2|hyperpage}{241}
-\indexentry{Chapter5.4.2.3|hyperpage}{243}
-\indexentry{Chapter5.4.3|hyperpage}{246}
-\indexentry{Chapter5.4.4|hyperpage}{248}
-\indexentry{Chapter5.4.4.1|hyperpage}{248}
-\indexentry{Chapter5.4.4.2|hyperpage}{249}
-\indexentry{Chapter5.4.4.3|hyperpage}{250}
-\indexentry{Chapter5.4.5|hyperpage}{251}
-\indexentry{Chapter5.4.6|hyperpage}{252}
-\indexentry{Chapter5.4.6.1|hyperpage}{253}
-\indexentry{Chapter5.4.6.2|hyperpage}{255}
-\indexentry{Chapter5.4.6.3|hyperpage}{256}
-\indexentry{Chapter5.5|hyperpage}{257}
-\indexentry{Chapter5.5.1|hyperpage}{258}
-\indexentry{Chapter5.5.1.1|hyperpage}{259}
-\indexentry{Chapter5.5.1.2|hyperpage}{261}
-\indexentry{Chapter5.5.1.3|hyperpage}{262}
-\indexentry{Chapter5.5.1.4|hyperpage}{263}
-\indexentry{Chapter5.5.2|hyperpage}{264}
-\indexentry{Chapter5.5.2.1|hyperpage}{264}
-\indexentry{Chapter5.5.2.2|hyperpage}{264}
-\indexentry{Chapter5.5.3|hyperpage}{266}
-\indexentry{Chapter5.5.3.1|hyperpage}{266}
-\indexentry{Chapter5.5.3.2|hyperpage}{268}
-\indexentry{Chapter5.5.3.3|hyperpage}{269}
-\indexentry{Chapter5.5.3.4|hyperpage}{269}
-\indexentry{Chapter5.5.3.5|hyperpage}{270}
-\indexentry{Chapter5.6|hyperpage}{271}
-\indexentry{Chapter6.1|hyperpage}{273}
-\indexentry{Chapter6.1.1|hyperpage}{275}
-\indexentry{Chapter6.1.2|hyperpage}{277}
-\indexentry{Chapter6.1.3|hyperpage}{280}
-\indexentry{Chapter6.2|hyperpage}{282}
-\indexentry{Chapter6.2.1|hyperpage}{282}
-\indexentry{Chapter6.2.2|hyperpage}{283}
-\indexentry{Chapter6.2.3|hyperpage}{284}
-\indexentry{Chapter6.2.4|hyperpage}{285}
-\indexentry{Chapter6.3|hyperpage}{286}
-\indexentry{Chapter6.3.1|hyperpage}{288}
-\indexentry{Chapter6.3.2|hyperpage}{290}
-\indexentry{Chapter6.3.3|hyperpage}{294}
-\indexentry{Chapter6.3.3.1|hyperpage}{294}
-\indexentry{Chapter6.3.3.2|hyperpage}{294}
-\indexentry{Chapter6.3.3.3|hyperpage}{296}
Book/mt-book-xelatex.ptc
View file @ f9b5b4f4
\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax
\babel@toc {english}{}
\defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {I}{机器翻译基础}}{7}{part.1}%
\select@language {english}
\ttl@starttoc {default@1}
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {1}机器翻译简介}{9}{chapter.1}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.1}机器翻译的概念}{9}{section.1.1}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.2}机器翻译简史}{12}{section.1.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.2.1}人工翻译}{12}{subsection.1.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.2.2}机器翻译的萌芽}{13}{subsection.1.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.2.3}机器翻译的受挫}{14}{subsection.1.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.2.4}机器翻译的快速成长}{15}{subsection.1.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.2.5}机器翻译的爆发}{16}{subsection.1.2.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.3}机器翻译现状}{17}{section.1.3}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.4}机器翻译方法}{18}{section.1.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.1}基于规则的机器翻译}{18}{subsection.1.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.2}基于实例的机器翻译}{20}{subsection.1.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.3}统计机器翻译}{21}{subsection.1.4.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.4}神经机器翻译}{22}{subsection.1.4.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.4.5}对比分析}{23}{subsection.1.4.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.5}翻译质量评价}{24}{section.1.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.5.1}人工评价}{24}{subsection.1.5.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.5.2}自动评价}{25}{subsection.1.5.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{BLEU}{25}{section*.15}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{TER}{27}{section*.16}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于检测点的评价}{27}{section*.17}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.6}机器翻译应用}{28}{section.1.6}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.7}开源项目与评测}{30}{section.1.7}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.7.1}开源机器翻译系统}{30}{subsection.1.7.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{统计机器翻译开源系统}{30}{section*.19}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{神经机器翻译开源系统}{32}{section*.20}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {1.7.2}常用数据集及公开评测任务}{34}{subsection.1.7.2}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {1.8}推荐学习资源}{36}{section.1.8}%
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {2}词法、语法及统计建模基础}{41}{chapter.2}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.1}问题概述 }{42}{section.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.2}概率论基础}{43}{section.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.1}随机变量和概率}{43}{subsection.2.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.2}联合概率、条件概率和边缘概率}{45}{subsection.2.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.3}链式法则}{46}{subsection.2.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.4}贝叶斯法则}{47}{subsection.2.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.2.5}KL距离和熵}{49}{subsection.2.2.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{信息熵}{49}{section*.27}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{KL距离}{50}{section*.29}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{交叉熵}{50}{section*.30}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.3}中文分词}{51}{section.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.3.1}基于词典的分词方法}{52}{subsection.2.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.3.2}基于统计的分词方法}{53}{subsection.2.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{统计模型的学习与推断}{53}{section*.34}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{掷骰子游戏}{54}{section*.36}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{全概率分词方法}{56}{section*.40}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.4}$n$-gram语言模型 }{58}{section.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.4.1}建模}{59}{subsection.2.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.4.2}未登录词和平滑算法}{61}{subsection.2.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{加法平滑方法}{62}{section*.46}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{古德-图灵估计法}{63}{section*.48}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{Kneser-Ney平滑方法}{64}{section*.50}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.5}句法分析(短语结构分析)}{66}{section.2.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.1}句子的句法树表示}{66}{subsection.2.5.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.2}上下文无关文法}{68}{subsection.2.5.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {2.5.3}规则和推导的概率}{72}{subsection.2.5.3}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {2.6}小结及深入阅读}{74}{section.2.6}%
\defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {II}{统计机器翻译}}{77}{part.2}%
\ttl@stoptoc {default@1}
\ttl@starttoc {default@2}
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {3}基于词的机器翻译模型}{79}{chapter.3}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.1}什么是基于词的翻译模型}{79}{section.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.2}构建一个简单的机器翻译系统}{81}{section.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.1}如何进行翻译?}{81}{subsection.3.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{机器翻译流程}{82}{section*.63}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{人工翻译 vs. 机器翻译}{83}{section*.65}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.2}基本框架}{83}{subsection.3.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.3}单词翻译概率}{84}{subsection.3.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{什么是单词翻译概率?}{84}{section*.67}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{如何从一个双语平行数据中学习?}{84}{section*.69}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{如何从大量的双语平行数据中学习?}{86}{section*.70}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.4}句子级翻译模型}{87}{subsection.3.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基础模型}{87}{section*.72}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{生成流畅的译文}{89}{section*.74}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.2.5}解码}{91}{subsection.3.2.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.3}基于词的翻译建模}{94}{section.3.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.3.1}噪声信道模型}{94}{subsection.3.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.3.2}统计机器翻译的三个基本问题}{96}{subsection.3.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{词对齐}{97}{section*.83}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于词对齐的翻译模型}{97}{section*.86}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于词对齐的翻译实例}{99}{section*.88}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.4}IBM模型1-2}{100}{section.3.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.1}IBM模型1}{100}{subsection.3.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.2}IBM模型2}{102}{subsection.3.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.3}解码及计算优化}{103}{subsection.3.4.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.4.4}训练}{103}{subsection.3.4.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{目标函数}{104}{section*.93}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{优化}{105}{section*.95}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.5}IBM模型3-5及隐马尔可夫模型}{110}{section.3.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.1}基于产出率的翻译模型}{111}{subsection.3.5.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.2}IBM 模型3}{113}{subsection.3.5.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.3}IBM 模型4}{115}{subsection.3.5.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.4} IBM 模型5}{116}{subsection.3.5.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.5}隐马尔可夫模型}{118}{subsection.3.5.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{隐马尔可夫模型}{118}{section*.107}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{词对齐模型}{119}{section*.109}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.5.6}解码和训练}{120}{subsection.3.5.6}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.6}问题分析}{120}{section.3.6}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.1}词对齐及对称化}{121}{subsection.3.6.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.2}Deficiency}{121}{subsection.3.6.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.3}句子长度}{122}{subsection.3.6.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {3.6.4}其他问题}{123}{subsection.3.6.4}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {3.7}小结及深入阅读}{123}{section.3.7}%
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {4}基于短语和句法的机器翻译模型}{125}{chapter.4}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.1}翻译中的结构信息}{125}{section.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.1.1}更大粒度的翻译单元}{126}{subsection.4.1.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.1.2}句子的结构信息}{128}{subsection.4.1.2}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.2}基于短语的翻译模型}{130}{section.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.1}机器翻译中的短语}{130}{subsection.4.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.2}数学建模及判别式模型}{133}{subsection.4.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于翻译推导的建模}{133}{section*.121}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{对数线性模型}{134}{section*.122}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{搭建模型的基本流程}{135}{section*.123}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.3}短语抽取}{136}{subsection.4.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{与词对齐一致的短语}{136}{section*.126}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{获取词对齐}{137}{section*.130}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{度量双语短语质量}{138}{section*.132}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.4}调序}{140}{subsection.4.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于距离的调序}{140}{section*.136}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于方向的调序}{141}{section*.138}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于分类的调序}{142}{section*.141}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.5}特征}{143}{subsection.4.2.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.6}最小错误率训练}{143}{subsection.4.2.6}%
\defcounter {refsection}{0}\relax
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.2.7}栈解码}{147}{subsection.4.2.7}%
\contentsline {part}{\@mypartnumtocformat {I}{统计机器翻译}}{7}{part.1}
\defcounter {refsection}{0}\relax
\ttl@starttoc {default@1}
\contentsline {subsubsection}{翻译候选匹配}{148}{section*.146}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{翻译假设扩展}{148}{section*.148}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{剪枝}{149}{section*.150}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{解码中的栈结构}{150}{section*.152}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.3}基于层次短语的模型}{151}{section.4.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.1}同步上下文无关文法}{153}{subsection.4.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{文法定义}{154}{section*.157}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{推导}{155}{section*.158}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{胶水规则}{156}{section*.159}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{处理流程}{157}{section*.160}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.2}层次短语规则抽取}{157}{subsection.4.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.3}翻译模型及特征}{159}{subsection.4.3.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.4}CYK解码}{160}{subsection.4.3.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.3.5}立方剪枝}{163}{subsection.4.3.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.4}基于语言学句法的模型}{166}{section.4.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.1}基于句法的翻译模型分类}{167}{subsection.4.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.2}基于树结构的文法}{170}{subsection.4.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{树到树翻译规则}{171}{section*.176}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于树结构的翻译推导}{172}{section*.178}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{树到串翻译规则}{174}{section*.181}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.3}树到串翻译规则抽取}{175}{subsection.4.4.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{树的切割与最小规则}{176}{section*.183}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{空对齐处理}{180}{section*.189}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{组合规则}{180}{section*.191}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{SPMT规则}{181}{section*.193}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{句法树二叉化}{182}{section*.195}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.4}树到树翻译规则抽取}{183}{subsection.4.4.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于节点对齐的规则抽取}{184}{section*.199}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于对齐矩阵的规则抽取}{185}{section*.202}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.5}句法翻译模型的特征}{185}{subsection.4.4.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.6}基于超图的推导空间表示}{188}{subsection.4.4.6}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {4.4.7}基于树的解码 vs 基于串的解码}{191}{subsection.4.4.7}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于树的解码}{191}{section*.209}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{基于串的解码}{192}{section*.212}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {4.5}小结及深入阅读}{194}{section.4.5}%
\defcounter {refsection}{0}\relax
\contentsline {part}{\@mypartnumtocformat {III}{神经机器翻译}}{197}{part.3}%
\ttl@stoptoc {default@2}
\ttl@starttoc {default@3}
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {5}人工神经网络和神经语言建模}{199}{chapter.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.1}深度学习与人工神经网络}{200}{section.5.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.1.1}发展简史}{200}{subsection.5.1.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)早期的人工神经网络和第一次寒冬}{200}{section*.214}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)神经网络的第二次高潮和第二次寒冬}{201}{section*.215}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)深度学习和神经网络的崛起}{202}{section*.216}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.1.2}为什么需要深度学习}{203}{subsection.5.1.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)端到端学习和表示学习}{203}{section*.218}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)深度学习的效果}{204}{section*.220}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.2}神经网络基础}{204}{section.5.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.1}线性代数基础}{204}{subsection.5.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{标量、向量和矩阵}{205}{section*.222}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{矩阵的转置}{206}{section*.223}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{矩阵加法和数乘}{206}{section*.224}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{矩阵乘法和矩阵点乘}{207}{section*.225}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{线性映射}{208}{section*.226}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{范数}{209}{section*.227}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.2}人工神经元和感知机}{210}{subsection.5.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)感知机\ \raisebox {0.5mm}{------}\ 最简单的人工神经元模型}{210}{section*.230}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)神经元内部权重}{212}{section*.233}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)神经元的输入\ \raisebox {0.5mm}{------}\ 离散 vs 连续}{212}{section*.235}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)神经元内部的参数学习}{213}{section*.237}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.3}多层神经网络}{214}{subsection.5.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{线性变换和激活函数}{214}{section*.239}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{单层神经网络$\rightarrow $多层神经网络}{216}{section*.246}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.2.4}函数拟合能力}{216}{subsection.5.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.3}神经网络的张量实现}{221}{section.5.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.1} 张量及其计算}{222}{subsection.5.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{张量}{222}{section*.256}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{张量的矩阵乘法}{224}{section*.259}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{张量的单元操作}{225}{section*.261}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.2}张量的物理存储形式}{226}{subsection.5.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.3}使用开源框架实现张量计算}{226}{subsection.5.3.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.4}神经网络中的前向传播}{228}{subsection.5.3.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.3.5}神经网络实例}{231}{subsection.5.3.5}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.4}神经网络的参数训练}{232}{section.5.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.1}损失函数}{233}{subsection.5.4.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.2}基于梯度的参数优化}{234}{subsection.5.4.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)梯度下降}{234}{section*.279}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)梯度获取}{236}{section*.281}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)基于梯度的方法的变种和改进}{239}{section*.285}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.3}参数更新的并行化策略}{242}{subsection.5.4.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.4}梯度消失、梯度爆炸和稳定性训练}{242}{subsection.5.4.4}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)梯度消失现象及解决方法}{242}{section*.288}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)梯度爆炸现象及解决方法}{245}{section*.292}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)稳定性训练}{245}{section*.293}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.5}过拟合}{247}{subsection.5.4.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.4.6}反向传播}{247}{subsection.5.4.6}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)输出层的反向传播}{248}{section*.296}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)隐藏层的反向传播}{251}{section*.300}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)程序实现}{252}{section*.303}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.5}神经语言模型}{253}{section.5.5}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.1}基于神经网络的语言建模}{253}{subsection.5.5.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)基于前馈神经网络的语言模型}{254}{section*.306}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)基于循环神经网络的语言模型}{257}{section*.309}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)基于自注意力机制的语言模型}{258}{section*.311}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)语言模型的评价}{259}{section*.313}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.2}单词表示模型}{259}{subsection.5.5.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)One-hot编码}{259}{section*.314}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)分布式表示}{260}{section*.316}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {5.5.3}句子表示模型及预训练}{262}{subsection.5.5.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(一)简单的上下文表示模型}{262}{section*.320}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(二)ELMO模型}{263}{section*.323}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(三)GPT模型}{264}{section*.325}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(四)BERT模型}{265}{section*.327}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{(五)为什么要预训练?}{265}{section*.329}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {5.6}小结及深入阅读}{266}{section.5.6}%
\defcounter {refsection}{0}\relax
\contentsline {chapter}{\numberline {6}神经机器翻译模型}{269}{chapter.6}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {6.1}神经机器翻译的发展简史}{269}{section.6.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.1.1}神经机器翻译的起源}{271}{subsection.6.1.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.1.2}神经机器翻译的品质 }{273}{subsection.6.1.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.1.3}神经机器翻译的优势 }{276}{subsection.6.1.3}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {6.2}编码器-解码器框架}{278}{section.6.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.2.1}框架结构}{278}{subsection.6.2.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.2.2}表示学习}{279}{subsection.6.2.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.2.3}简单的运行实例}{280}{subsection.6.2.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.2.4}机器翻译范式的对比}{281}{subsection.6.2.4}%
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {6.3}基于循环神经网络的翻译模型及注意力机制}{282}{section.6.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.3.1}建模}{284}{subsection.6.3.1}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.3.2}输入(词嵌入)及输出(Softmax)}{286}{subsection.6.3.2}%
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {6.3.3}循环神经网络结构}{290}{subsection.6.3.3}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{循环神经单元(RNN)}{290}{section*.351}%
\defcounter {refsection}{0}\relax
\contentsline {subsubsection}{长短时记忆网络(LSTM)}{290}{section*.352}%
-\contentsline {subsubsection}{门控循环单元(GRU)}{292}{section*.355}%
-\contentsline {subsubsection}{双向模型}{293}{section*.357}%
-\contentsline {subsubsection}{多层循环神经网络}{295}{section*.359}%
-\contentsline {subsection}{\numberline {6.3.4}注意力机制}{295}{subsection.6.3.4}%
-\contentsline {subsubsection}{翻译中的注意力机制}{296}{section*.362}%
-\contentsline {subsubsection}{上下文向量的计算}{297}{section*.365}%
-\contentsline {subsubsection}{注意力机制的解读}{300}{section*.370}%
-\contentsline {subsection}{\numberline {6.3.5}训练}{302}{subsection.6.3.5}%
-\contentsline {subsubsection}{损失函数}{303}{section*.373}%
-\contentsline {subsubsection}{长参数初始化}{303}{section*.374}%
-\contentsline {subsubsection}{优化策略}{304}{section*.375}%
-\contentsline {subsubsection}{梯度裁剪}{304}{section*.377}%
-\contentsline {subsubsection}{学习率策略}{305}{section*.378}%
-\contentsline {subsubsection}{并行训练}{306}{section*.381}%
-\contentsline {subsection}{\numberline {6.3.6}推断}{307}{subsection.6.3.6}%
-\contentsline {subsubsection}{贪婪搜索}{309}{section*.385}%
-\contentsline {subsubsection}{束搜索}{310}{section*.388}%
-\contentsline {subsubsection}{长度惩罚}{311}{section*.390}%
-\contentsline {subsection}{\numberline {6.3.7}实例-GNMT}{312}{subsection.6.3.7}%
-\contentsline {section}{\numberline {6.4}Transformer}{314}{section.6.4}%
-\contentsline {subsection}{\numberline {6.4.1}自注意力模型}{315}{subsection.6.4.1}%
-\contentsline {subsection}{\numberline {6.4.2}Transformer架构}{316}{subsection.6.4.2}%
-\contentsline {subsection}{\numberline {6.4.3}位置编码}{318}{subsection.6.4.3}%
-\contentsline {subsection}{\numberline {6.4.4}基于点乘的注意力机制}{320}{subsection.6.4.4}%
-\contentsline {subsection}{\numberline {6.4.5}掩码操作}{322}{subsection.6.4.5}%
-\contentsline {subsection}{\numberline {6.4.6}多头注意力}{324}{subsection.6.4.6}%
-\contentsline {subsection}{\numberline {6.4.7}残差网络和层正则化}{325}{subsection.6.4.7}%
-\contentsline {subsection}{\numberline {6.4.8}前馈全连接网络子层}{326}{subsection.6.4.8}%
-\contentsline {subsection}{\numberline {6.4.9}训练}{327}{subsection.6.4.9}%
-\contentsline {subsection}{\numberline {6.4.10}推断}{330}{subsection.6.4.10}%
-\contentsline {section}{\numberline {6.5}序列到序列问题及应用}{330}{section.6.5}%
-\contentsline {subsection}{\numberline {6.5.1}自动问答}{331}{subsection.6.5.1}%
-\contentsline {subsection}{\numberline {6.5.2}自动文摘}{331}{subsection.6.5.2}%
-\contentsline {subsection}{\numberline {6.5.3}文言文翻译}{331}{subsection.6.5.3}%
-\contentsline {subsection}{\numberline {6.5.4}对联生成}{333}{subsection.6.5.4}%
-\contentsline {subsection}{\numberline {6.5.5}古诗生成}{333}{subsection.6.5.5}%
-\contentsline {section}{\numberline {6.6}小结及深入阅读}{333}{section.6.6}%
-\contentsline {part}{\@mypartnumtocformat {IV}{附录}}{337}{part.4}%
-\ttl@stoptoc {default@3}
-\ttl@starttoc {default@4}
-\contentsline {chapter}{\numberline {A}附录A}{339}{appendix.1.A}%
-\contentsline {chapter}{\numberline {B}附录B}{341}{appendix.2.B}%
-\contentsline {section}{\numberline {B.1}IBM模型3训练方法}{341}{section.2.B.1}%
-\contentsline {section}{\numberline {B.2}IBM模型4训练方法}{343}{section.2.B.2}%
-\contentsline {section}{\numberline {B.3}IBM模型5训练方法}{344}{section.2.B.3}%
+\contentsline {chapter}{\numberline {1}基于词的机器翻译模型}{9}{chapter.1}
+\contentsline {section}{\numberline {1.1}什么是基于词的翻译模型}{9}{section.1.1}
+\contentsline {section}{\numberline {1.2}构建一个简单的机器翻译系统}{11}{section.1.2}
+\contentsline {subsection}{\numberline {1.2.1}如何进行翻译?}{11}{subsection.1.2.1}
+\contentsline {subsubsection}{机器翻译流程}{12}{section*.6}
+\contentsline {subsubsection}{人工翻译 vs. 机器翻译}{13}{section*.8}
+\contentsline {subsection}{\numberline {1.2.2}基本框架}{13}{subsection.1.2.2}
+\contentsline {subsection}{\numberline {1.2.3}单词翻译概率}{14}{subsection.1.2.3}
+\contentsline {subsubsection}{什么是单词翻译概率?}{14}{section*.10}
+\contentsline {subsubsection}{如何从一个双语平行数据中学习?}{14}{section*.12}
+\contentsline {subsubsection}{如何从大量的双语平行数据中学习?}{16}{section*.13}
+\contentsline {subsection}{\numberline {1.2.4}句子级翻译模型}{17}{subsection.1.2.4}
+\contentsline {subsubsection}{基础模型}{17}{section*.15}
+\contentsline {subsubsection}{生成流畅的译文}{19}{section*.17}
+\contentsline {subsection}{\numberline {1.2.5}解码}{21}{subsection.1.2.5}
+\contentsline {section}{\numberline {1.3}基于词的翻译建模}{24}{section.1.3}
+\contentsline {subsection}{\numberline {1.3.1}噪声信道模型}{24}{subsection.1.3.1}
+\contentsline {subsection}{\numberline {1.3.2}统计机器翻译的三个基本问题}{26}{subsection.1.3.2}
+\contentsline {subsubsection}{词对齐}{27}{section*.26}
+\contentsline {subsubsection}{基于词对齐的翻译模型}{27}{section*.29}
+\contentsline {subsubsection}{基于词对齐的翻译实例}{29}{section*.31}
+\contentsline {section}{\numberline {1.4}IBM模型1-2}{30}{section.1.4}
+\contentsline {subsection}{\numberline {1.4.1}IBM模型1}{30}{subsection.1.4.1}
+\contentsline {subsection}{\numberline {1.4.2}IBM模型2}{32}{subsection.1.4.2}
+\contentsline {subsection}{\numberline {1.4.3}解码及计算优化}{33}{subsection.1.4.3}
+\contentsline {subsection}{\numberline {1.4.4}训练}{34}{subsection.1.4.4}
+\contentsline {subsubsection}{目标函数}{34}{section*.36}
+\contentsline {subsubsection}{优化}{35}{section*.38}
+\contentsline {section}{\numberline {1.5}IBM模型3-5及隐马尔可夫模型}{41}{section.1.5}
+\contentsline {subsection}{\numberline {1.5.1}基于产出率的翻译模型}{41}{subsection.1.5.1}
+\contentsline {subsection}{\numberline {1.5.2}IBM 模型3}{44}{subsection.1.5.2}
+\contentsline {subsection}{\numberline {1.5.3}IBM 模型4}{45}{subsection.1.5.3}
+\contentsline {subsection}{\numberline {1.5.4} IBM 模型5}{47}{subsection.1.5.4}
+\contentsline {subsection}{\numberline {1.5.5}隐马尔可夫模型}{48}{subsection.1.5.5}
+\contentsline {subsubsection}{隐马尔可夫模型}{48}{section*.50}
+\contentsline {subsubsection}{词对齐模型}{50}{section*.52}
+\contentsline {subsection}{\numberline {1.5.6}解码和训练}{51}{subsection.1.5.6}
+\contentsline {section}{\numberline {1.6}问题分析}{51}{section.1.6}
+\contentsline {subsection}{\numberline {1.6.1}词对齐及对称化}{51}{subsection.1.6.1}
+\contentsline {subsection}{\numberline {1.6.2}Deficiency}{52}{subsection.1.6.2}
+\contentsline {subsection}{\numberline {1.6.3}句子长度}{53}{subsection.1.6.3}
+\contentsline {subsection}{\numberline {1.6.4}其他问题}{54}{subsection.1.6.4}
+\contentsline {section}{\numberline {1.7}小结及深入阅读}{54}{section.1.7}
\contentsfinish
Book/mt-book-xelatex.tex
View file @ f9b5b4f4
...
@@ -112,13 +112,13 @@
% CHAPTERS
%----------------------------------------------------------------------------------------
-\include{Chapter1/chapter1}
+%\include{Chapter1/chapter1}
-\include{Chapter2/chapter2}
+%\include{Chapter2/chapter2}
\include{Chapter3/chapter3}
-\include{Chapter4/chapter4}
+%\include{Chapter4/chapter4}
-\include{Chapter5/chapter5}
+%\include{Chapter5/chapter5}
-\include{Chapter6/chapter6}
+%\include{Chapter6/chapter6}
-\include{ChapterAppend/chapterappend}
+%\include{ChapterAppend/chapterappend}
...
Section03-Word-Based-Models/section03.tex
View file @ f9b5b4f4
...
@@ -3824,7 +3824,7 @@ s.t. $\forall t_y: \sum_{s_x} f(s_x|t_y) =1 $ & \\
\begin{eqnarray}
\frac{\partial L(f,\lambda)}{\partial f(s_u|t_v)} &=& \frac{\partial \big[ \frac{\epsilon}{(l+1)^{m}} \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \big]}{\partial f(s_u|t_v)} - \nonumber \\
& & \frac{\partial \big[ \sum_{t_y} \lambda_{t_y} (\sum_{s_x} f(s_x|t_y) -1) \big]}{\partial f(s_u|t_v)} \nonumber \\
-&=& \frac{\epsilon}{(l+1)^{m}} \cdot \frac{\partial \big[ \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_{a_j}) \big]}{\partial f(s_u|t_v)} - \lambda_{t_v} \nonumber
+&=& \frac{\epsilon}{(l+1)^{m}} \cdot \frac{\partial \big[ \prod\limits_{j=1}^{m} \sum\limits_{i=0}^{l} f(s_j|t_i) \big]}{\partial f(s_u|t_v)} - \lambda_{t_v} \nonumber
\end{eqnarray}
\vspace{-0.3em}
...