Commit 41d5b8db by 曹润柘

Merge branch 'caorunzhe' into 'master'

Caorunzhe

See merge request !87
parents 29939616 4f05fac1
...@@ -111,7 +111,7 @@ ...@@ -111,7 +111,7 @@
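\parinterval Human translation has existed for thousands of years, but when did machine translation begin? Its turbulent history can be divided into four stages: an embryonic period, a period of setbacks, a period of rapid growth, and a period of explosive growth. \parinterval Human translation has existed for thousands of years, but when did machine translation begin? Its turbulent history can be divided into four stages: an embryonic period, a period of setbacks, a period of rapid growth, and a period of explosive growth.
\parinterval As early as the 17th century, thinkers such as Descartes proposed overcoming language barriers with a universal language, that is, using unified symbols to represent words that have the same meaning in different languages\upcite{knowlson1975universal}, an idea far ahead of its time. As linguistics, computer science, and related disciplines developed, the idea of using computational models for automatic translation began to take shape in the 1930s; for example, the French scientist Georges Artsrouni proposed using a machine to translate. At that time, however, there was still no suitable means of implementation, so the soundness of the idea could not be verified. \parinterval In the 17th century, Descartes proposed the concept of a universal language\upcite{knowlson1975universal}: he hoped to use unified symbols to represent words that have the same meaning in different languages and thereby overcome language barriers, an idea far ahead of its time. As linguistics, computer science, and related disciplines developed, the idea of using computational models for automatic translation began to take shape in the 1930s; for example, the French scientist Georges Artsrouni proposed using a machine to translate. At that time, however, there was still no suitable means of implementation, so the soundness of the idea could not be verified.
\parinterval With the outbreak of the Second World War, encrypting and decrypting text became an important military need, which greatly advanced mathematics and cryptography. In 1946, one year after the war ended, the world's first general-purpose electronic digital computer was completed (Figure \ref{fig:1-4}), which made machine translation a real possibility for the first time. \parinterval With the outbreak of the Second World War, encrypting and decrypting text became an important military need, which greatly advanced mathematics and cryptography. In 1946, one year after the war ended, the world's first general-purpose electronic digital computer was completed (Figure \ref{fig:1-4}), which made machine translation a real possibility for the first time.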
......
...@@ -9,10 +9,10 @@ ...@@ -9,10 +9,10 @@
% \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=blue!10] (l1) at ([yshift=-1em]eos.south){}; % \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=blue!10] (l1) at ([yshift=-1em]eos.south){};
% \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=red!10] (l2) at ([yshift=-0.5em]l1.south){}; % \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=red!10] (l2) at ([yshift=-0.5em]l1.south){};
\node[anchor=west,unit] (w1) at ([xshift=1.5em,yshift=7em]eos.east){$w_1$}; \node[anchor=north,unit] (w1) at ([xshift=0em,yshift=-1.8em]eos.south){$w_1$};
\node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$}; \node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$};
\node[anchor=west,unit,fill=red!20,opacity=0.3] (n24) at ([xshift=4.5em]n11.east){an}; \node[anchor=west,unit,fill=red!20,opacity=0.3] (n24) at ([xshift=6.5em,yshift=4.3em]n11.east){an};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt24) at (n24.east) {\small{{\color{white} \textbf{-1.4}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt24) at (n24.east) {\small{{\color{white} \textbf{-1.4}}}};
\node[anchor=south,unit,fill=red!20] (n23) at ([yshift=0.1em]n24.north){one}; \node[anchor=south,unit,fill=red!20] (n23) at ([yshift=0.1em]n24.north){one};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-0.6}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-0.6}}}};
...@@ -30,7 +30,7 @@ ...@@ -30,7 +30,7 @@
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt27) at (n27.east) {\small{{\color{white} \textbf{-7.2}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt27) at (n27.east) {\small{{\color{white} \textbf{-7.2}}}};
\node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$}; \node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$};
\node[anchor=west,unit,fill=red!20] (n31) at ([yshift=3em,xshift=6em]n21.east){is}; \node[anchor=west,unit,fill=red!20] (n31) at ([yshift=4.7em,xshift=8em]n21.east){is};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt31) at (n31.east) {\small{{\color{white} \textbf{-0.1}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt31) at (n31.east) {\small{{\color{white} \textbf{-0.1}}}};
\node[anchor=north,unit,fill=blue!10] (n32) at ([yshift=-0.1em]n31.south){$<$eos$>$}; \node[anchor=north,unit,fill=blue!10] (n32) at ([yshift=-0.1em]n31.south){$<$eos$>$};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt32) at (n32.east) {\small{{\color{white} \textbf{-0.6}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt32) at (n32.east) {\small{{\color{white} \textbf{-0.6}}}};
...@@ -49,7 +49,7 @@ ...@@ -49,7 +49,7 @@
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt41) at (n41.east) {\small{{\color{white} \textbf{-0.1}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt41) at (n41.east) {\small{{\color{white} \textbf{-0.1}}}};
\node[anchor=north,unit,fill=red!20,opacity=0.3,minimum width=3.5em,minimum height=2.5em] (n51) at ([yshift=-0.1em]n41.south){}; \node[anchor=north,unit,fill=red!20,opacity=0.3,minimum width=3.5em,minimum height=2.5em] (n51) at ([yshift=-0.1em]n41.south){};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2.5em,fill=black,opacity=0.3] (pt51) at (n51.east) {\small{{\color{white} \textbf{$<$-0.7}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2.5em,fill=black,opacity=0.3] (pt51) at (n51.east) {\small{{\color{white} \textbf{$<$-0.7}}}};
\node[anchor=south,unit] (w3) at ([yshift=0.5em]n31.north){$w_2$}; \node[anchor=south,unit] (w3) at ([yshift=0.5em]n31.north){$w_3$};
\draw[->,ublue,very thick] (n11.east) -- (n21.west); \draw[->,ublue,very thick] (n11.east) -- (n21.west);
\draw[->,ublue,very thick] (n11.east) -- (n22.west); \draw[->,ublue,very thick] (n11.east) -- (n22.west);
......
...@@ -6,17 +6,17 @@ ...@@ -6,17 +6,17 @@
\node[fill=blue!40,anchor=north,align=left,inner sep=2pt,minimum width=5em](spe)at(words.south){\color{white}{\small\bfnew{Special symbols}}}; \node[fill=blue!40,anchor=north,align=left,inner sep=2pt,minimum width=5em](spe)at(words.south){\color{white}{\small\bfnew{Special symbols}}};
\node[fill=blue!10,anchor=north,align=left,inner sep=3pt,minimum width=5em](eos)at(spe.south){$<$sos$>$\\[-0.5ex]$<$eos$>$}; \node[fill=blue!10,anchor=north,align=left,inner sep=3pt,minimum width=5em](eos)at(spe.south){$<$sos$>$\\[-0.5ex]$<$eos$>$};
\node[anchor=west,unit] (w1) at ([xshift=2em,yshift=4.5em]eos.east){$w_1$}; \node[anchor=north,unit] (w1) at ([xshift=2.5em,yshift=-1em]eos.south){$w_1$};
\node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$}; \node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$};
\node [anchor=north] (wtranslabel) at ([xshift=0em,yshift=-1em]n11.south) {\small{Generation order:}}; \node [anchor=north] (wtranslabel) at ([xshift=-2.5em,yshift=-3em]n11.south) {\small{Generation order:}};
\draw [->,ultra thick,red,line width=1.5pt,opacity=0.7] (wtranslabel.east) -- ([xshift=1.5em]wtranslabel.east); \draw [->,ultra thick,red,line width=1.5pt,opacity=0.7] (wtranslabel.east) -- ([xshift=1.5em]wtranslabel.east);
\node[anchor=west,unit,fill=red!20] (n22) at ([xshift=5em]n11.east){agree}; \node[anchor=west,unit,fill=red!20] (n22) at ([xshift=5em]n11.east){agree};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt22) at (n22.east) {\small{{\color{white} \textbf{-0.4}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt22) at (n22.east) {\small{{\color{white} \textbf{-0.4}}}};
\node[anchor=south,unit,fill=red!20] (n21) at ([yshift=0.3em]n22.north){I}; \node[anchor=south,unit,fill=red!20] (n21) at ([yshift=5.5em]n22.north){I};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt21) at (n21.east) {\small{{\color{white} \textbf{-0.5}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt21) at (n21.east) {\small{{\color{white} \textbf{-0.5}}}};
\node[anchor=north,unit,fill=blue!10] (n23) at ([yshift=-0.3em]n22.south){$<$eos$>$}; \node[anchor=north,unit,fill=blue!10] (n23) at ([yshift=-3em]n22.south){$<$eos$>$};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-2.2}}}}; \node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-2.2}}}};
\node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$}; \node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$};
......
...@@ -686,7 +686,7 @@ N & = & \sum_{r=0}^{\infty}{r^{*}n_r} \nonumber \\ ...@@ -686,7 +686,7 @@ N & = & \sum_{r=0}^{\infty}{r^{*}n_r} \nonumber \\
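\subsubsection{3. Kneser-Ney Smoothing} \subsubsection{3. Kneser-Ney Smoothing}
\parinterval Kneser-Ney smoothing, proposed by Reinhard Kneser and Hermann Ney in 1995, is a method for estimating $n$-gram probability distributions\upcite{kneser1995improved,chen1999empirical} and is widely regarded as one of the most effective smoothing methods. It improves on Absolute Discounting\upcite{ney1994on,ney1991on} in how the lower-order distribution that is combined with the higher-order distribution is computed, so that the distributions of different orders are fully exploited. The algorithm also draws on ideas from several other smoothing algorithms. \parinterval Kneser-Ney smoothing, proposed by Reinhard Kneser and Hermann Ney in 1995, is a method for estimating $n$-gram probability distributions\upcite{kneser1995improved,chen1999empirical} and is widely regarded as one of the most effective smoothing methods. It improves on Absolute Discounting\upcite{ney1994on,ney1991smoothing} in how the lower-order distribution that is combined with the higher-order distribution is computed, so that the distributions of different orders are fully exploited. The algorithm also draws on ideas from several other smoothing algorithms.
\parinterval We first introduce the Absolute Discounting smoothing algorithm, which is defined as follows: \parinterval We first introduce the Absolute Discounting smoothing algorithm, which is defined as follows: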
\begin{eqnarray} \begin{eqnarray}
...@@ -823,7 +823,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll} ...@@ -823,7 +823,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll}
\label{eq:2-40} \label{eq:2-40}
\end{eqnarray} \end{eqnarray}
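\noindent Here $\arg$ stands for argument, and $\argmax_x f(x)$ returns the $x$ that maximizes $f(x)$. $\argmax_{w \in \chi}\funp{P}(w)$ denotes finding the word sequence $w$ that maximizes the language model score $\funp{P}(w)$. $\chi$ is the solution space of the search problem, that is, the set of all possible word sequences $w$. $\hat{w}$ can be regarded as the ``optimal solution'' of this search problem, namely the word sequence with the highest probability. \noindent Here $\arg$ stands for argument, and $\argmax_x f(x)$ returns the $x$ that maximizes $f(x)$. $\argmax_{w \in \chi}$\\$\funp{P}(w)$ denotes finding the word sequence $w$ that maximizes the language model score $\funp{P}(w)$. $\chi$ is the solution space of the search problem, that is, the set of all possible word sequences $w$. $\hat{w}$ can be regarded as the ``optimal solution'' of this search problem, namely the word sequence with the highest probability.
\parinterval In sequence generation tasks, the simplest strategy is to combine the words in the vocabulary arbitrarily and enumerate all possible sequences. In many cases, however, the length of the sequence to be generated cannot be known in advance. In machine translation, for example, the target-language sequence can have any length. How, then, can we tell when a sequence has been completely generated? Here we borrow from the way people write Chinese and English: a sentence starts from a blank page and is generated word by word from left to right, and every word except the first depends on the words already generated. For ease of implementation, a word sequence is usually defined to start after a special symbol <sos>. Similarly, the end of a word sequence is marked by a special symbol <eos>. \parinterval In sequence generation tasks, the simplest strategy is to combine the words in the vocabulary arbitrarily and enumerate all possible sequences. In many cases, however, the length of the sequence to be generated cannot be known in advance. In machine translation, for example, the target-language sequence can have any length. How, then, can we tell when a sequence has been completely generated? Here we borrow from the way people write Chinese and English: a sentence starts from a blank page and is generated word by word from left to right, and every word except the first depends on the words already generated. For ease of implementation, a word sequence is usually defined to start after a special symbol <sos>. Similarly, the end of a word sequence is marked by a special symbol <eos>.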
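The enumeration strategy and the argmax over the solution space described above can be illustrated with a minimal Python sketch. It is not taken from the book's code: the two-word vocabulary, the length limit, and the unigram log-probabilities are hypothetical values chosen only for illustration, and the scoring function is a toy stand-in for a real language model.

```python
import itertools
import math

SOS, EOS = "<sos>", "<eos>"

def enumerate_sequences(vocab, max_len):
    """Yield every sequence <sos> w_1 ... w_m <eos> with 0 <= m <= max_len."""
    for m in range(max_len + 1):
        for words in itertools.product(vocab, repeat=m):
            yield (SOS, *words, EOS)

def log_score(seq, log_prob):
    """Sum per-token log-probabilities; <sos> is given, not generated."""
    return sum(log_prob[w] for w in seq[1:])

# Hypothetical toy vocabulary and unigram log-probabilities (assumptions).
vocab = ["I", "agree"]
log_prob = {"I": math.log(0.4), "agree": math.log(0.4), EOS: math.log(0.2)}

# The solution space: all sequences with at most two words.
candidates = list(enumerate_sequences(vocab, max_len=2))

# argmax over the solution space: the candidate with the highest score.
best = max(candidates, key=lambda s: log_score(s, log_prob))
```

With a vocabulary of size 2 and at most 2 words, the enumeration produces 1 + 2 + 4 = 7 candidates. The count grows exponentially with the length limit, which is why exhaustive enumeration is only practical for toy settings.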
...@@ -925,7 +925,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll} ...@@ -925,7 +925,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll}
\end{figure} \end{figure}
%------------------------------------------- %-------------------------------------------
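\parinterval In this way, language model scoring is integrated with the traversal of the solution space tree. The sequence generation problem can therefore be restated as finding the path with the largest total weight in the solution space tree formed by all word sequences. Under this definition, the two ways of enumerating word sequences mentioned earlier are the prototypes of the classic {\small\bfnew{depth-first search}}\index{Depth-first Search} and {\small\bfnew{breadth-first search}}\index{Breadth-first Search}. As later sections show, viewing the problem as a traversal of the solution space tree makes it possible to improve the efficiency of these basic search strategies. \parinterval In this way, language model scoring is integrated with the traversal of the solution space tree. The sequence generation problem can therefore be restated as finding the path with the largest total weight in the solution space tree formed by all word sequences. Under this definition, the two ways of enumerating word sequences mentioned earlier are the prototypes of the classic {\small\bfnew{depth-first search}}\upcite{even2011graph}\index{Depth-first Search} and {\small\bfnew{breadth-first search}}\upcite{lee1961an}\index{Breadth-first Search}. As later sections show, viewing the problem as a traversal of the solution space tree makes it possible to improve the efficiency of these basic search strategies.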
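The restated problem, finding the path with the largest total weight in the solution space tree, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the book's implementation: the bigram table `PROB`, the function names, and the limit `max_words` are hypothetical. Edge weights are log-probabilities, so the weight of a path is the log-probability of the corresponding sequence.

```python
import math
from collections import deque

SOS, EOS = "<sos>", "<eos>"

# Hypothetical bigram probabilities PROB[previous][next] (assumed values).
PROB = {
    SOS: {"I": 0.6, "agree": 0.3, EOS: 0.1},
    "I": {"I": 0.1, "agree": 0.7, EOS: 0.2},
    "agree": {"I": 0.2, "agree": 0.1, EOS: 0.7},
}

def dfs_best(prefix, score, max_words):
    """Depth-first traversal: return (score, sequence) of the best complete path."""
    if prefix[-1] == EOS:
        return score, prefix
    best = (-math.inf, None)
    generated = len(prefix) - 1  # words generated so far, excluding <sos>
    for word, p in PROB[prefix[-1]].items():
        if word != EOS and generated >= max_words:
            continue  # length limit reached: only <eos> may follow
        cand = dfs_best(prefix + (word,), score + math.log(p), max_words)
        if cand[0] > best[0]:
            best = cand
    return best

def bfs_best(max_words):
    """Breadth-first traversal of the same tree, level by level."""
    best = (-math.inf, None)
    queue = deque([((SOS,), 0.0)])
    while queue:
        prefix, score = queue.popleft()
        if prefix[-1] == EOS:
            if score > best[0]:
                best = (score, prefix)
            continue
        generated = len(prefix) - 1
        for word, p in PROB[prefix[-1]].items():
            if word != EOS and generated >= max_words:
                continue
            queue.append((prefix + (word,), score + math.log(p)))
    return best

dfs_score, dfs_seq = dfs_best((SOS,), 0.0, max_words=2)
bfs_score, bfs_seq = bfs_best(max_words=2)
```

Both traversals visit every complete path, so they return the same best sequence and differ only in the order of visiting nodes and in memory use. Neither prunes the tree, which is the inefficiency that the optimized search strategies in later sections address.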
%---------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------
% NEW SUB-SECTION % NEW SUB-SECTION
......
...@@ -1174,12 +1174,22 @@ ...@@ -1174,12 +1174,22 @@
biburl = {https://dblp.org/rec/books/mg/CormenLR89.bib}, biburl = {https://dblp.org/rec/books/mg/CormenLR89.bib},
bibsource = {dblp computer science bibliography, https://dblp.org} bibsource = {dblp computer science bibliography, https://dblp.org}
} }
% no publisher
@book{russell2003artificial, @article{DBLP:journals/ai/SabharwalS11,
title={Artificial Intelligence : A Modern Approach}, author = {Ashish Sabharwal and
author={Stuart J. {Russell} and Peter {Norvig}}, Bart Selman},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2122410182", title = {S. Russell, P. Norvig, Artificial Intelligence: {A} Modern Approach,
year={2003} Third Edition},
journal = {Artif. Intell.},
volume = {175},
number = {5-6},
pages = {935--937},
year = {2011},
url = {https://doi.org/10.1016/j.artint.2011.01.005},
doi = {10.1016/j.artint.2011.01.005},
timestamp = {Sat, 27 May 2017 14:24:41 +0200},
biburl = {https://dblp.org/rec/journals/ai/SabharwalS11.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
} }
@book{sahni1978fundamentals, @book{sahni1978fundamentals,
...@@ -1370,11 +1380,12 @@ ...@@ -1370,11 +1380,12 @@
number={3}, number={3},
year={1957}, year={1957},
} }
% no publisher
@book{lowerre1976the, @book{lowerre1976the,
title={The HARPY speech recognition system}, title={The HARPY speech recognition system},
author={Bruce T. {Lowerre}}, author={Bruce T. {Lowerre}},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2137095888", //notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2137095888",
publisher={Carnegie Mellon University},
year={1976} year={1976}
} }
...@@ -1419,13 +1430,13 @@ ...@@ -1419,13 +1430,13 @@
year={1994} year={1994}
} }
@inproceedings{ney1991on, @inproceedings{ney1991smoothing,
title={On smoothing techniques for bigram-based natural language modelling}, title={On smoothing techniques for bigram-based natural language modelling},
author={H. {Ney} and U. {Essen}}, author={Ney, Hermann and Essen, Ute},
booktitle={[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing}, booktitle={Acoustics, Speech, and Signal Processing, IEEE International Conference on},
pages={825--828}, pages={825--828},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2020749563}, year={1991},
year={1991} organization={IEEE Computer Society}
} }
@article{chen1999an, @article{chen1999an,
...@@ -1438,13 +1449,13 @@ ...@@ -1438,13 +1449,13 @@
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2158195707}, //notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2158195707},
year={1999} year={1999}
} }
% needs confirmation
@book{bell1990text, @book{bell1990text,
title={Text compression}, title={Text compression},
author={Timothy C. {Bell} and John G. {Cleary} and Ian H. {Witten}}, author={Timothy C. {Bell} and John G. {Cleary} and Ian H. {Witten}},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2611071497}, //notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2611071497},
year={1990}, year={1990},
publisher={Prentice-Hall, Inc.} publisher={Prentice Hall}
} }
@article{katz1987estimation, @article{katz1987estimation,
...@@ -1686,6 +1697,22 @@ ...@@ -1686,6 +1697,22 @@
year={2000} year={2000}
} }
@article{lee1961an,
title="An Algorithm for Path Connections and Its Applications",
author="C. Y. {Lee}",
journal="IRE Transactions on Electronic Computers",
volume="10",
number="3",
pages="346--365",
year="1961"
}
@book{even2011graph,
title={Graph algorithms},
author={Even, Shimon},
year={2011},
publisher={Cambridge University Press}
}
%%%%% chapter 2------------------------------------------------------ %%%%% chapter 2------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......