Commit 41d5b8db by 曹润柘

Merge branch 'caorunzhe' into 'master'

Caorunzhe

See merge request !87
parents 29939616 4f05fac1
......@@ -111,7 +111,7 @@
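\parinterval Human translation has existed for thousands of years, but when did machine translation begin? Its turbulent history can be divided into four stages: the embryonic period, the setback period, the rapid-growth period, and the boom period.
\parinterval As early as the 17th century, thinkers such as Descartes proposed the idea of a universal language, that is, using unified symbols to represent words that have the same meaning in different languages, in order to overcome language barriers\upcite{knowlson1975universal}; this idea was far ahead of its time. With the development of linguistics, computer science, and related disciplines, the idea of using computational models for automatic translation began to take shape in the 1930s; for example, the French scientist Georges Artsrouni proposed using a machine to translate. However, no suitable means of implementation existed at the time, so the soundness of the idea could not be verified.
\parinterval In the 17th century, Descartes proposed the concept of a universal language\upcite{knowlson1975universal}: he hoped to use unified symbols to represent words that have the same meaning in different languages and thereby overcome language barriers, an idea that was far ahead of its time. With the development of linguistics, computer science, and related disciplines, the idea of using computational models for automatic translation began to take shape in the 1930s; for example, the French scientist Georges Artsrouni proposed using a machine to translate. However, no suitable means of implementation existed at the time, so the soundness of the idea could not be verified.
\parinterval With the outbreak of the Second World War, encrypting and decrypting text became an important military need, which drove considerable advances in mathematics and cryptography. In 1946, one year after the war ended, the world's first general-purpose electronic digital computer was successfully built (Figure \ref{fig:1-4}), which made machine translation a real possibility for the first time.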
......
......@@ -9,10 +9,10 @@
% \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=blue!10] (l1) at ([yshift=-1em]eos.south){};
% \node[anchor=north,minimum width=1.8em,minimum height=1em,fill=red!10] (l2) at ([yshift=-0.5em]l1.south){};
\node[anchor=west,unit] (w1) at ([xshift=1.5em,yshift=7em]eos.east){$w_1$};
\node[anchor=north,unit] (w1) at ([xshift=0em,yshift=-1.8em]eos.south){$w_1$};
\node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$};
\node[anchor=west,unit,fill=red!20,opacity=0.3] (n24) at ([xshift=4.5em]n11.east){an};
\node[anchor=west,unit,fill=red!20,opacity=0.3] (n24) at ([xshift=6.5em,yshift=4.3em]n11.east){an};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt24) at (n24.east) {\small{{\color{white} \textbf{-1.4}}}};
\node[anchor=south,unit,fill=red!20] (n23) at ([yshift=0.1em]n24.north){one};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-0.6}}}};
......@@ -30,7 +30,7 @@
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black,opacity=0.3] (pt27) at (n27.east) {\small{{\color{white} \textbf{-7.2}}}};
\node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$};
\node[anchor=west,unit,fill=red!20] (n31) at ([yshift=3em,xshift=6em]n21.east){is};
\node[anchor=west,unit,fill=red!20] (n31) at ([yshift=4.7em,xshift=8em]n21.east){is};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt31) at (n31.east) {\small{{\color{white} \textbf{-0.1}}}};
\node[anchor=north,unit,fill=blue!10] (n32) at ([yshift=-0.1em]n31.south){$<$eos$>$};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt32) at (n32.east) {\small{{\color{white} \textbf{-0.6}}}};
......@@ -49,7 +49,7 @@
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt41) at (n41.east) {\small{{\color{white} \textbf{-0.1}}}};
\node[anchor=north,unit,fill=red!20,opacity=0.3,minimum width=3.5em,minimum height=2.5em] (n51) at ([yshift=-0.1em]n41.south){};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2.5em,fill=black,opacity=0.3] (pt51) at (n51.east) {\small{{\color{white} \textbf{$<$-0.7}}}};
\node[anchor=south,unit] (w3) at ([yshift=0.5em]n31.north){$w_2$};
\node[anchor=south,unit] (w3) at ([yshift=0.5em]n31.north){$w_3$};
\draw[->,ublue,very thick] (n11.east) -- (n21.west);
\draw[->,ublue,very thick] (n11.east) -- (n22.west);
......
......@@ -6,17 +6,17 @@
\node[fill=blue!40,anchor=north,align=left,inner sep=2pt,minimum width=5em](spe)at(words.south){\color{white}{\small\bfnew{Special symbols}}};
\node[fill=blue!10,anchor=north,align=left,inner sep=3pt,minimum width=5em](eos)at(spe.south){$<$sos$>$\\[-0.5ex]$<$eos$>$};
\node[anchor=west,unit] (w1) at ([xshift=2em,yshift=4.5em]eos.east){$w_1$};
\node[anchor=north,unit] (w1) at ([xshift=2.5em,yshift=-1em]eos.south){$w_1$};
\node[anchor=north,unit,fill=blue!10] (n11) at ([yshift=-0.5em]w1.south){$<$sos$>$};
\node [anchor=north] (wtranslabel) at ([xshift=0em,yshift=-1em]n11.south) {\small{Generation order:}};
\node [anchor=north] (wtranslabel) at ([xshift=-2.5em,yshift=-3em]n11.south) {\small{Generation order:}};
\draw [->,ultra thick,red,line width=1.5pt,opacity=0.7] (wtranslabel.east) -- ([xshift=1.5em]wtranslabel.east);
\node[anchor=west,unit,fill=red!20] (n22) at ([xshift=5em]n11.east){agree};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt22) at (n22.east) {\small{{\color{white} \textbf{-0.4}}}};
\node[anchor=south,unit,fill=red!20] (n21) at ([yshift=0.3em]n22.north){I};
\node[anchor=south,unit,fill=red!20] (n21) at ([yshift=5.5em]n22.north){I};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt21) at (n21.east) {\small{{\color{white} \textbf{-0.5}}}};
\node[anchor=north,unit,fill=blue!10] (n23) at ([yshift=-0.3em]n22.south){$<$eos$>$};
\node[anchor=north,unit,fill=blue!10] (n23) at ([yshift=-3em]n22.south){$<$eos$>$};
\node [anchor=north,rotate=90,inner sep=1pt,minimum width=2em,fill=black] (pt23) at (n23.east) {\small{{\color{white} \textbf{-2.2}}}};
\node[anchor=south,unit] (w2) at ([yshift=0.5em]n21.north){$w_2$};
......
......@@ -686,7 +686,7 @@ N & = & \sum_{r=0}^{\infty}{r^{*}n_r} \nonumber \\
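\subsubsection{3. Kneser-Ney Smoothing}
\parinterval Kneser-Ney smoothing, proposed by Reinhard Kneser and Hermann Ney in 1995, is a method for estimating $n$-gram probability distributions\upcite{kneser1995improved,chen1999empirical}, and it is widely regarded as one of the most effective smoothing methods. It improves on Absolute Discounting\upcite{ney1994on,ney1991on} in how the lower-order distribution that is combined with the higher-order distribution is computed, so that distributions of different orders are fully exploited. The algorithm also draws on ideas from several other smoothing algorithms.
\parinterval Kneser-Ney smoothing, proposed by Reinhard Kneser and Hermann Ney in 1995, is a method for estimating $n$-gram probability distributions\upcite{kneser1995improved,chen1999empirical}, and it is widely regarded as one of the most effective smoothing methods. It improves on Absolute Discounting\upcite{ney1994on,ney1991smoothing} in how the lower-order distribution that is combined with the higher-order distribution is computed, so that distributions of different orders are fully exploited. The algorithm also draws on ideas from several other smoothing algorithms.
\parinterval We first introduce the Absolute Discounting smoothing algorithm, which is defined as follows: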
\begin{eqnarray}
......@@ -823,7 +823,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll}
\label{eq:2-40}
\end{eqnarray}
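\noindent Here $\arg$ stands for argument, and $\argmax_x f(x)$ returns the $x$ that maximizes $f(x)$. $\argmax_{w \in \chi}\funp{P}(w)$ finds the word sequence $w$ that maximizes the language model score $\funp{P}(w)$. $\chi$ is the solution space of the search problem, that is, the set of all possible word sequences $w$. $\hat{w}$ can be regarded as the ``optimal solution'' of this search problem, namely the word sequence with the highest probability.
\noindent Here $\arg$ stands for argument, and $\argmax_x f(x)$ returns the $x$ that maximizes $f(x)$. $\argmax_{w \in \chi}$\\$\funp{P}(w)$ finds the word sequence $w$ that maximizes the language model score $\funp{P}(w)$. $\chi$ is the solution space of the search problem, that is, the set of all possible word sequences $w$. $\hat{w}$ can be regarded as the ``optimal solution'' of this search problem, namely the word sequence with the highest probability.
\parinterval In sequence generation tasks, the simplest strategy is to combine the words in the vocabulary in every possible way and thereby enumerate all possible sequences. However, the length of the sequence to be generated is often not known in advance. For example, in machine translation the target-language sequence can be of any length. How, then, can one tell when a sequence is complete? Here we borrow from the way people write Chinese and English: a sentence starts from a blank page and is produced word by word from left to right, and every word except the first depends on the words already generated. To make this convenient to implement, a word sequence is usually defined to start after a special symbol <sos>. Likewise, the end of a word sequence is marked by a special symbol <eos>.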
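The enumeration strategy described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the book's implementation: the bigram table `P`, the two-word vocabulary `WORDS`, and the cap `max_words` on the number of words are all made up for this example. The cap is needed because, as noted above, the true length is not known in advance, and without it the enumeration would never terminate.

```python
import itertools
import math

# Hypothetical bigram probabilities P[prev][next]; each row sums to 1.
P = {
    "<sos>": {"I": 0.6, "agree": 0.3, "<eos>": 0.1},
    "I":     {"I": 0.1, "agree": 0.7, "<eos>": 0.2},
    "agree": {"I": 0.1, "agree": 0.1, "<eos>": 0.8},
}
WORDS = ["I", "agree"]  # ordinary words; <sos>/<eos> are added by the enumerator


def enumerate_sequences(max_words):
    """Yield every sequence <sos> w_1 ... w_m <eos> with 0 <= m <= max_words."""
    for m in range(max_words + 1):
        for words in itertools.product(WORDS, repeat=m):
            yield ("<sos>",) + words + ("<eos>",)


def log_score(seq):
    """Log-probability of a complete sequence under the toy bigram model."""
    return sum(math.log(P[prev][cur]) for prev, cur in zip(seq, seq[1:]))


def best_sequence(max_words):
    """Brute-force argmax over the enumerated solution space."""
    return max(enumerate_sequences(max_words), key=log_score)
```

With these toy values and `max_words = 2`, the enumerator produces 7 candidates, and `best_sequence(2)` selects `<sos> I agree <eos>`, whose probability is 0.6 × 0.7 × 0.8 = 0.336. The number of candidates grows exponentially with `max_words`, which is why exhaustive enumeration is impractical for a real vocabulary.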
......@@ -925,7 +925,7 @@ c_{\textrm{KN}}(\cdot) = \left\{\begin{array}{ll}
\end{figure}
%-------------------------------------------
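\parinterval In this way, language model scoring is merged with the traversal of the solution-space tree. Sequence generation can therefore be restated as finding the path with the largest total weight in the solution-space tree formed by all word sequences. Under this definition, the two ways of enumerating word sequences described earlier are rudimentary forms of the classic {\small\bfnew{Depth-first Search}}\index{Depth-first Search} and {\small\bfnew{Breadth-first Search}}\index{Breadth-first Search}. As later sections show, viewing the problem as a traversal of the solution-space tree makes it possible to improve the efficiency of these basic search strategies.
\parinterval In this way, language model scoring is merged with the traversal of the solution-space tree. Sequence generation can therefore be restated as finding the path with the largest total weight in the solution-space tree formed by all word sequences. Under this definition, the two ways of enumerating word sequences described earlier are rudimentary forms of the classic {\small\bfnew{Depth-first Search}}\upcite{even2011graph}\index{Depth-first Search} and {\small\bfnew{Breadth-first Search}}\upcite{lee1961an}\index{Breadth-first Search}. As later sections show, viewing the problem as a traversal of the solution-space tree makes it possible to improve the efficiency of these basic search strategies.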
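The two traversal orders can be compared on a tiny solution-space tree. The sketch below is illustrative only: the bigram table `P`, the vocabulary, and the `max_words` cap are hypothetical values chosen for this example, and edge weights are taken to be log-probabilities so that the best path is the one with the largest weight sum.

```python
import math
from collections import deque

# Hypothetical bigram probabilities P[prev][next]; each row sums to 1.
P = {
    "<sos>": {"I": 0.6, "agree": 0.3, "<eos>": 0.1},
    "I":     {"I": 0.1, "agree": 0.7, "<eos>": 0.2},
    "agree": {"I": 0.1, "agree": 0.1, "<eos>": 0.8},
}
VOCAB = ["I", "agree", "<eos>"]


def children(prefix, max_words):
    """Tokens that may follow prefix; after max_words words only <eos> is allowed."""
    return VOCAB if len(prefix) - 1 < max_words else ["<eos>"]


def dfs(prefix=("<sos>",), logp=0.0, max_words=2):
    """Depth-first: follow one path down to <eos> before trying its siblings."""
    last = prefix[-1]
    if last == "<eos>":
        return logp, prefix
    return max(
        dfs(prefix + (w,), logp + math.log(P[last][w]), max_words)
        for w in children(prefix, max_words)
    )


def bfs(max_words=2):
    """Breadth-first: expand all prefixes of length k before any of length k+1."""
    queue = deque([(0.0, ("<sos>",))])
    best = (-math.inf, ())
    while queue:
        logp, prefix = queue.popleft()
        last = prefix[-1]
        if last == "<eos>":
            best = max(best, (logp, prefix))  # complete path: compare total weight
            continue
        for w in children(prefix, max_words):
            queue.append((logp + math.log(P[last][w]), prefix + (w,)))
    return best
```

Both functions visit every complete path and return the same result, `<sos> I agree <eos>` with probability 0.336. They differ only in the order in which nodes are expanded, and that order is what the pruning strategies discussed later build on.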
%----------------------------------------------------------------------------------------
% NEW SUB-SECTION
......
......@@ -1174,12 +1174,22 @@
biburl = {https://dblp.org/rec/books/mg/CormenLR89.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
% no publisher
@book{russell2003artificial,
title={Artificial Intelligence : A Modern Approach},
author={Stuart J. {Russell} and Peter {Norvig}},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2122410182",
year={2003}
@article{DBLP:journals/ai/SabharwalS11,
author = {Ashish Sabharwal and
Bart Selman},
title = {S. Russell, P. Norvig, Artificial Intelligence: {A} Modern Approach,
Third Edition},
journal = {Artif. Intell.},
volume = {175},
number = {5-6},
pages = {935--937},
year = {2011},
url = {https://doi.org/10.1016/j.artint.2011.01.005},
doi = {10.1016/j.artint.2011.01.005},
timestamp = {Sat, 27 May 2017 14:24:41 +0200},
biburl = {https://dblp.org/rec/journals/ai/SabharwalS11.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@book{sahni1978fundamentals,
......@@ -1370,11 +1380,12 @@
number={3},
year={1957},
}
% no publisher
@book{lowerre1976the,
title={The HARPY speech recognition system},
author={Bruce T. {Lowerre}},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2137095888",
publisher={Carnegie Mellon University},
year={1976}
}
......@@ -1419,13 +1430,13 @@
year={1994}
}
@inproceedings{ney1991on,
@inproceedings{ney1991smoothing,
title={On smoothing techniques for bigram-based natural language modelling},
author={H. {Ney} and U. {Essen}},
booktitle={[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing},
author={Ney, Hermann and Essen, Ute},
booktitle={Acoustics, Speech, and Signal Processing, IEEE International Conference on},
pages={825--828},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2020749563},
year={1991}
year={1991},
organization={IEEE Computer Society}
}
@article{chen1999an,
......@@ -1438,13 +1449,13 @@
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2158195707},
year={1999}
}
% needs confirmation
@book{bell1990text,
title={Text compression},
author={Timothy C. {Bell} and John G. {Cleary} and Ian H. {Witten}},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2611071497},
year={1990},
publisher={Prentice-Hall, Inc.}
publisher={Prentice Hall}
}
@article{katz1987estimation,
......@@ -1686,6 +1697,22 @@
year={2000}
}
@article{lee1961an,
title="An Algorithm for Path Connections and Its Applications",
author="C. Y. {Lee}",
journal="IRE Transactions on Electronic Computers",
volume="10",
number="3",
pages="346--365",
year="1961"
}
@book{even2011graph,
title={Graph algorithms},
author={Even, Shimon},
year={2011},
publisher={Cambridge University Press}
}
%%%%% chapter 2------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......
......@@ -18,6 +18,13 @@
year ={2019},
}
@book{knowlson1975universal,
title={Universal Language Schemes in England and France 1600-1800},
author={James {Knowlson}},
year={1975},
publisher={University of Toronto Press}
}
@article{DBLP:journals/bstj/Shannon48,
author = {Claude E. Shannon},
title = {A mathematical theory of communication},
......@@ -25,12 +32,7 @@
volume = {27},
number = {3},
pages = {379--423},
year = {1948},
//url = {https://doi.org/10.1002/j.1538-7305.1948.tb01338.x},
//doi = {10.1002/j.1538-7305.1948.tb01338.x},
//timestamp = {Sat, 30 May 2020 20:01:09 +0200},
//biburl = {https://dblp.org/rec/journals/bstj/Shannon48.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {1948}
}
@article{shannon1949the,
......@@ -38,10 +40,20 @@
author={Claude E. {Shannon} and Warren {Weaver}},
journal={IEEE Transactions on Instrumentation and Measurement},
volume={13},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2993383518},
year={1949}
}
@article{weaver1955translation,
title={Translation},
author={Weaver, Warren},
journal={Machine translation of languages},
volume={14},
number={15-23},
pages={10},
year={1955},
publisher={Cambridge: Technology Press, MIT}
}
@article{Chomsky1957Syntactic,
title={Syntactic Structures},
author={Chomsky, Noam},
......@@ -51,24 +63,22 @@
year={1957},
}
@article{DBLP:journals/coling/BrownCPPJLMR90,
author = {Peter F. Brown and
John Cocke and
Stephen Della Pietra and
Vincent J. Della Pietra and
Frederick Jelinek and
John D. Lafferty and
Robert L. Mercer and
Paul S. Roossin},
title = {A Statistical Approach to Machine Translation},
journal = {Computational Linguistics},
volume = {16},
number = {2},
pages = {79--85},
year = {1990},
//timestamp = {Mon, 11 May 2020 15:46:08 +0200},
//biburl = {https://dblp.org/rec/journals/coling/BrownCPPJLMR90.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
@inproceedings{DBLP:conf/coling/SatoN90,
author = {Satoshi Sato and
Makoto Nagao},
title = {Toward Memory-based Translation},
booktitle = {13th International Conference on Computational Linguistics, {COLING}
1990, University of Helsinki, Finland, August 20-25, 1990},
pages = {247--252},
year = {1990}
}
@article{nagao1984framework,
title={A framework of a mechanical translation between Japanese and English by analogy principle},
author={Nagao, Makoto},
journal={Artificial and human intelligence},
pages={351--354},
year={1984}
}
@article{DBLP:journals/coling/BrownPPM94,
......@@ -81,32 +91,7 @@
volume = {19},
number = {2},
pages = {263--311},
year = {1993},
//timestamp = {Mon, 11 May 2020 15:46:10 +0200},
//biburl = {https://dblp.org/rec/journals/coling/BrownPPM94.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
}
@article{nagao1984framework,
title={A framework of a mechanical translation between Japanese and English by analogy principle},
author={Nagao, Makoto},
journal={Artificial and human intelligence},
pages={351--354},
year={1984}
}
@inproceedings{DBLP:conf/coling/SatoN90,
author = {Satoshi Sato and
Makoto Nagao},
title = {Toward Memory-based Translation},
booktitle = {13th International Conference on Computational Linguistics, {COLING}
1990, University of Helsinki, Finland, August 20-25, 1990},
pages = {247--252},
year = {1990},
//url = {https://www.aclweb.org/anthology/C90-3044/},
//timestamp = {Mon, 16 Sep 2019 17:08:53 +0200},
//biburl = {https://dblp.org/rec/conf/coling/SatoN90.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {1993}
}
@article{DBLP:journals/coling/BrownCPPJLMR90,
......@@ -119,14 +104,11 @@
Robert L. Mercer and
Paul S. Roossin},
title = {A Statistical Approach to Machine Translation},
journal = {Comput. Linguistics},
journal = {Computational Linguistics},
volume = {16},
number = {2},
pages = {79--85},
year = {1990},
//timestamp = {Mon, 11 May 2020 15:46:08 +0200},
//biburl = {https://dblp.org/rec/journals/coling/BrownCPPJLMR90.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {1990}
}
@article{nirenburg1989knowledge,
......@@ -154,7 +136,6 @@
volume={26},
number={4},
pages={638--641},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/1579838312",
year={2000}
}
......@@ -192,28 +173,67 @@
volume={19},
number={1},
pages={75--102},
//notes="Sourced from Microsoft Academic - https://academic.microsoft.com/paper/1489181569",
year={1993}
}
@article{brown1990statistical,
author = {Peter F. Brown and
John Cocke and
Stephen Della Pietra and
Vincent J. Della Pietra and
Frederick Jelinek and
John D. Lafferty and
Robert L. Mercer and
Paul S. Roossin},
title = {A Statistical Approach to Machine Translation},
journal = {Computational Linguistics},
volume = {16},
number = {2},
pages = {79--85},
year = {1990},
//timestamp = {Wed, 13 Feb 2002 09:26:36 +0100},
//biburl = {https://dblp.org/rec/journals/coling/BrownCPPJLMR90.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
@inproceedings{DBLP:journals/corr/LuongPM15,
author = {Thang Luong and
Hieu Pham and
Christopher D. Manning},
//editor = {Llu{\'{\i}}s M{\`{a}}rquez and
Chris Callison{-}Burch and
Jian Su and
Daniele Pighin and
Yuval Marton},
title = {Effective Approaches to Attention-based Neural Machine Translation},
booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural
Language Processing, {EMNLP} 2015, Lisbon, Portugal, September 17-21,
2015},
pages = {1412--1421},
publisher = {The Association for Computational Linguistics},
year = {2015}
}
@inproceedings{DBLP:journals/corr/GehringAGYD17,
author = {Jonas Gehring and
Michael Auli and
David Grangier and
Denis Yarats and
Yann N. Dauphin},
//editor = {Doina Precup and
Yee Whye Teh},
title = {Convolutional Sequence to Sequence Learning},
booktitle = {Proceedings of the 34th International Conference on Machine Learning,
{ICML} 2017, Sydney, NSW, Australia, 6-11 August 2017},
series = {Proceedings of Machine Learning Research},
volume = {70},
pages = {1243--1252},
publisher = {{PMLR}},
year = {2017}
}
@inproceedings{NIPS2017_7181,
author = {Ashish Vaswani and
Noam Shazeer and
Niki Parmar and
Jakob Uszkoreit and
Llion Jones and
Aidan N. Gomez and
Lukasz Kaiser and
Illia Polosukhin},
//editor = {Isabelle Guyon and
Ulrike von Luxburg and
Samy Bengio and
Hanna M. Wallach and
Rob Fergus and
S. V. N. Vishwanathan and
Roman Garnett},
title = {Attention is All you Need},
booktitle = {Advances in Neural Information Processing Systems 30: Annual Conference
on Neural Information Processing Systems 2017, 4-9 December 2017,
Long Beach, CA, {USA}},
pages = {5998--6008},
year = {2017}
}
@inproceedings{bahdanau2014neural,
......@@ -225,11 +245,7 @@
title = {Neural Machine Translation by Jointly Learning to Align and Translate},
booktitle = {3rd International Conference on Learning Representations, {ICLR} 2015,
San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings},
year = {2015},
//url = {http://arxiv.org/abs/1409.0473},
//timestamp = {Wed, 17 Jul 2019 10:40:54 +0200},
//biburl = {https://dblp.org/rec/journals/corr/BahdanauCB14.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {2015}
}
@inproceedings{NIPS2014_5346,
......@@ -246,23 +262,14 @@
on Neural Information Processing Systems 2014, December 8-13 2014,
Montreal, Quebec, Canada},
pages = {3104--3112},
year = {2014},
//url = {http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks},
//timestamp = {Fri, 06 Mar 2020 16:58:11 +0100},
//biburl = {https://dblp.org/rec/conf/nips/SutskeverVL14.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {2014}
}
@book{koehn2009statistical,
author = {Philipp Koehn},
title = {Statistical Machine Translation},
publisher = {Cambridge University Press},
year = {2010},
//url = {http://www.statmt.org/book/},
//isbn = {978-0-521-87415-1},
//timestamp = {Tue, 25 Jun 2019 09:00:29 +0200},
//biburl = {https://dblp.org/rec/books/daglib/0032677.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {2010}
}
@article{DBLP:journals/corr/abs-1709-07809,
......@@ -270,12 +277,7 @@
title = {Neural Machine Translation},
journal = {CoRR},
volume = {abs/1709.07809},
year = {2017},
//url = {http://arxiv.org/abs/1709.07809},
//eprint = {1709.07809},
//timestamp = {Mon, 13 Aug 2018 16:47:37 +0200},
//biburl = {https://dblp.org/rec/journals/corr/abs-1709-07809.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {2017}
}
@book{manning1999foundations,
......@@ -299,12 +301,7 @@
title = {Deep Learning},
series = {Adaptive computation and machine learning},
publisher = {{MIT} Press},
year = {2016},
//url = {http://www.deeplearningbook.org/},
//isbn = {978-0-262-03561-3},
//timestamp = {Sat, 25 Mar 2017 20:16:59 +0100},
//biburl = {https://dblp.org/rec/books/daglib/0040158.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
year = {2016}
}
@article{goldberg2017neural,
......@@ -338,27 +335,55 @@
journal ={中文信息学报},
volume ={34},
pages ={4},
year ={2020},
//note ={\url{https://nndl.github.io/}}
year ={2020}
}
@book{knowlson1975universal,
title={Universal Language Schemes in England and France 1600-1800},
author={James {Knowlson}},
//notes={Sourced from Microsoft Academic - https://academic.microsoft.com/paper/2088082035},
year={1975},
publisher={University of Toronto Press}
%%%%% chapter 1------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% chapter 2------------------------------------------------------
@book{kolmogorov2018foundations,
title ={Foundations of the theory of probability: Second English Edition},
author ={Kolmogorov, Andre Nikolaevich and Bharucha-Reid, Albert T},
year ={2018},
publisher ={Courier Dover Publications}
}
@article{weaver1955translation,
title={Translation},
author={Weaver, Warren},
journal={Machine translation of languages},
volume={14},
number={15-23},
pages={10},
year={1955},
publisher={Cambridge: Technology Press, MIT}
@book{mao-prob-book-2011,
title ={概率论与数理统计教程: 第二版},
author ={魏宗舒},
year ={2011},
publisher ={北京: 高等教育出版社}
}
@article{resnick1992adventures,
author = {Barbour, A. and Resnick, Sidney},
year = {1993},
month = {12},
pages = {1474},
title = {Adventures in Stochastic Processes},
volume = {88},
journal = {Journal of the American Statistical Association}
}
@book{liuke-markov-2004,
title ={实用马尔可夫决策过程},
author ={刘克},
year ={2004},
publisher ={清华大学出版社}
}
@article{gale1995good,
author = {William A. Gale and
Geoffrey Sampson},
title = {Good-Turing Frequency Estimation Without Tears},
journal = {Journal of Quantitative Linguistics},
volume = {2},
number = {3},
pages = {217--237},
year = {1995}
}
@article{good1953population,
......@@ -372,26 +397,427 @@
publisher ={Oxford University Press}
}
@article{gale1995good,
author = {William A. Gale and
Geoffrey Sampson},
title = {Good-Turing Frequency Estimation Without Tears},
journal = {Journal of Quantitative Linguistics},
volume = {2},
number = {3},
pages = {217--237},
year = {1995},
//url = {https://doi.org/10.1080/09296179508590051},
//doi = {10.1080/09296179508590051},
//timestamp = {Sat, 20 May 2017 00:22:46 +0200},
//biburl = {https://dblp.org/rec/journals/jql/GaleS95.bib},
//bibsource = {dblp computer science bibliography, https://dblp.org}
@inproceedings{kneser1995improved,
author = {Reinhard Kneser and
Hermann Ney},
title = {Improved backing-off for M-gram language modeling},
booktitle = {1995 International Conference on Acoustics, Speech, and Signal Processing,
{ICASSP} '95, Detroit, Michigan, USA, May 08-12, 1995},
pages = {181--184},
publisher = {{IEEE} Computer Society},
year = {1995}
}
%%%%% chapter 1------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% chapter 2------------------------------------------------------
@inproceedings{ney1991smoothing,
title={On smoothing techniques for bigram-based natural language modelling},
author={Ney, Hermann and Essen, Ute},
booktitle={Acoustics, Speech, and Signal Processing, IEEE International Conference on},
pages={825--828},
year={1991},
organization={IEEE Computer Society}
}
@article{ney1994on,
title={On structuring probabilistic dependences in stochastic language modelling},
author={Hermann {Ney} and Ute {Essen} and Reinhard {Kneser}},
journal={Computer Speech \& Language},
volume={8},
number={1},
pages={1--38},
year={1994}
}
@inproceedings{stolcke2002srilm,
author = {Andreas Stolcke},
//editor = {John H. L. Hansen and
Bryan L. Pellom},
title = {{SRILM} - an extensible language modeling toolkit},
booktitle = {7th International Conference on Spoken Language Processing, {ICSLP2002}
- {INTERSPEECH} 2002, Denver, Colorado, USA, September 16-20, 2002},
publisher = {{ISCA}},
year = {2002}
}
@inproceedings{heafield-2011-kenlm,
author = {Kenneth Heafield},
//editor = {Chris Callison{-}Burch and
Philipp Koehn and
Christof Monz and
Omar Zaidan},
title = {KenLM: Faster and Smaller Language Model Queries},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation,
WMT@EMNLP 2011, Edinburgh, Scotland, UK, July 30-31, 2011},
pages = {187--197},
publisher = {Association for Computational Linguistics},
year = {2011}
}
@article{chen1999empirical,
author = {Stanley F. Chen and
Joshua Goodman},
title = {An empirical study of smoothing techniques for language modeling},
journal = {Computer Speech \& Language},
volume = {13},
number = {4},
pages = {359--393},
year = {1999}
}
@article{ney1994structuring,
author = {Hermann Ney and
Ute Essen and
Reinhard Kneser},
title = {On structuring probabilistic dependences in stochastic language modelling},
journal = {Computer Speech \& Language},
volume = {8},
number = {1},
pages = {1--38},
year = {1994}
}
@book{parsing2009speech,
author = {Dan Jurafsky and
James H. Martin},
title = {Speech and language processing: an introduction to natural language
processing, computational linguistics, and speech recognition, 2nd
Edition},
series = {Prentice Hall series in artificial intelligence},
publisher = {Prentice Hall, Pearson Education International},
year = {2009}
}
@book{DBLP:books/mg/CormenLR89,
author = {Thomas H. Cormen and
Charles E. Leiserson and
Ronald L. Rivest},
title = {Introduction to Algorithms},
publisher = {The {MIT} Press and McGraw-Hill Book Company},
year = {1989}
}
@book{even2011graph,
title={Graph algorithms},
author={Even, Shimon},
year={2011},
publisher={Cambridge University Press}
}
@article{lee1961an,
title="An Algorithm for Path Connections and Its Applications",
author="C. Y. {Lee}",
journal="IRE Transactions on Electronic Computers",
volume="10",
number="3",
pages="346--365",
year="1961"
}
@article{DBLP:journals/ai/SabharwalS11,
author = {Ashish Sabharwal and
Bart Selman},
title = {S. Russell, P. Norvig, Artificial Intelligence: {A} Modern Approach,
Third Edition},
journal = {Artificial Intelligence},
volume = {175},
number = {5-6},
pages = {935--937},
year = {2011}
}
@book{sahni1978fundamentals,
title={Fundamentals of Computer Algorithms},
author={Sartaj {Sahni} and Ellis {Horowitz}},
year={1978},
publisher={Computer Science Press}
}
@article{hart1968a,
title={A Formal Basis for the Heuristic Determination of Minimum Cost Paths},
author={Peter E. {Hart} and Nils J. {Nilsson} and Bertram {Raphael}},
journal={IEEE Transactions on Systems Science and Cybernetics},
volume={4},
number={2},
pages={100--107},
year={1968}
}
@book{lowerre1976the,
title={The HARPY speech recognition system},
author={Bruce T. {Lowerre}},
publisher={Carnegie Mellon University},
year={1976}
}
@book{bishop1995neural,
title={Neural networks for pattern recognition},
author={Christopher M. {Bishop}},
year={1995},
publisher={Oxford University Press}
}
@article{åström1965optimal,
title={Optimal control of Markov processes with incomplete state information},
author={Karl Johan {Åström}},
journal={Journal of Mathematical Analysis and Applications},
volume={10},
number={1},
pages={174--205},
year={1965}
}
@article{korf1990real,
title={Real-time heuristic search},
author={Richard E. {Korf}},
journal={Artificial Intelligence},
volume={42},
number={2},
pages={189--211},
year={1990}
}
% abbreviation
@article{jelinek1980interpolated,
title={Interpolated estimation of Markov source parameters from sparse data},
author={F. {Jelinek} and R. L. {Mercer}},
journal={Proc. Workshop on Pattern Recognition in Practice, 1980},
pages={381--397},
year={1980}
}
@article{katz1987estimation,
title={Estimation of probabilities from sparse data for the language model component of a speech recognizer},
author={S. {Katz}},
journal={IEEE Transactions on Acoustics, Speech, and Signal Processing},
volume={35},
number={3},
pages={400--401},
year={1987}
}
@article{witten1991the,
title={The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression},
author={I.H. {Witten} and T.C. {Bell}},
journal={IEEE Transactions on Information Theory},
volume={37},
number={4},
pages={1085--1094},
year={1991}
}
@book{bell1990text,
title={Text compression},
author={Timothy C. {Bell} and John G. {Cleary} and Ian H. {Witten}},
year={1990},
publisher={Prentice Hall}
}
@article{goodman2001a,
title={A bit of progress in language modeling},
author={Joshua T. {Goodman}},
journal={Computer Speech \& Language},
volume={15},
number={4},
pages={403--434},
year={2001}
}
@article{chen1999an,
title={An empirical study of smoothing techniques for language modeling},
author={Stanley F. {Chen} and Joshua {Goodman}},
journal={Computer Speech \& Language},
volume={13},
number={4},
pages={359--394},
year={1999}
}
@inproceedings{kirchhoff2005improved,
title={Improved Language Modeling for Statistical Machine Translation},
author={Katrin {Kirchhoff} and Mei {Yang}},
booktitle={Proceedings of the ACL Workshop on Building and Using Parallel Texts},
pages={125--128},
year={2005}
}
@inproceedings{koehn2007factored,
title={Factored Translation Models},
author={Philipp {Koehn} and Hieu {Hoang}},
booktitle={Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages={868--876},
year={2007}
}
@inproceedings{sarikaya2007joint,
title={Joint Morphological-Lexical Language Modeling for Machine Translation},
author={Ruhi {Sarikaya} and Yonggang {Deng}},
booktitle={Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers},
pages={145--148},
year={2007}
}
@inproceedings{heafield2011kenlm,
title={KenLM: Faster and Smaller Language Model Queries},
author={Kenneth {Heafield}},
booktitle={Proceedings of the Sixth Workshop on Statistical Machine Translation},
pages={187--197},
year={2011}
}
@inproceedings{federico2006how,
title={How Many Bits Are Needed To Store Probabilities for Phrase-Based Translation?},
author={Marcello {Federico} and Nicola {Bertoldi}},
booktitle={Proceedings on the Workshop on Statistical Machine Translation},
pages={94--101},
year={2006}
}
@inproceedings{federico2007efficient,
title={Efficient Handling of N-gram Language Models for Statistical Machine Translation},
author={Marcello {Federico} and Mauro {Cettolo}},
booktitle={Proceedings of the Second Workshop on Statistical Machine Translation},
pages={88--95},
year={2007}
}
@inproceedings{talbot2007randomised,
title={Randomised Language Modelling for Statistical Machine Translation},
author={David {Talbot} and Miles {Osborne}},
booktitle={Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
pages={512--519},
year={2007}
}
@inproceedings{talbot2007smoothed,
title={Smoothed Bloom Filter Language Models: Tera-Scale LMs on the Cheap},
author={David {Talbot} and Miles {Osborne}},
booktitle={Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages={468--476},
year={2007}
}
@article{jing2019a,
title={A Survey on Neural Network Language Models},
author={Kun {Jing} and Jungang {Xu}},
journal={arXiv preprint arXiv:1906.03591},
year={2019}
}
@article{bengio2003a,
title={A neural probabilistic language model},
author={Yoshua {Bengio} and Réjean {Ducharme} and Pascal {Vincent} and Christian {Janvin}},
journal={Journal of Machine Learning Research},
volume={3},
number={6},
pages={1137--1155},
year={2003}
}
@inproceedings{mikolov2010recurrent,
author = {Tomas Mikolov and
Martin Karafi{\'{a}}t and
Luk{\'{a}}s Burget and
Jan Cernock{\'{y}} and
Sanjeev Khudanpur},
//editor = {Takao Kobayashi and
Keikichi Hirose and
Satoshi Nakamura},
title = {Recurrent neural network based language model},
booktitle = {{INTERSPEECH} 2010, 11th Annual Conference of the International Speech
Communication Association, Makuhari, Chiba, Japan, September 26-30,
2010},
pages = {1045--1048},
publisher = {{ISCA}},
year = {2010}
}
@inproceedings{sundermeyer2012lstm,
title={LSTM Neural Networks for Language Modeling},
author={Martin {Sundermeyer} and Ralf {Schlüter} and Hermann {Ney}},
booktitle={INTERSPEECH},
pages={194--197},
year={2012}
}
@inproceedings{vaswani2017attention,
title={Attention is All You Need},
author={Ashish {Vaswani} and Noam {Shazeer} and Niki {Parmar} and Jakob {Uszkoreit} and Llion {Jones} and Aidan N. {Gomez} and Lukasz {Kaiser} and Illia {Polosukhin}},
booktitle={Proceedings of the 31st International Conference on Neural Information Processing Systems},
pages={5998--6008},
year={2017}
}
@inproceedings{tillmann1997a,
title={A DP-based Search Using Monotone Alignments in Statistical Translation},
author={Christoph {Tillmann} and Stephan {Vogel} and Hermann {Ney} and Alex {Zubiaga}},
booktitle={Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics},
pages={289--296},
year={1997}
}
@inproceedings{DBLP:conf/acl/WangW97,
author = {Ye{-}Yi Wang and
Alex Waibel},
editor = {Philip R. Cohen and
Wolfgang Wahlster},
title = {Decoding Algorithm in Statistical Machine Translation},
booktitle = {35th Annual Meeting of the Association for Computational Linguistics
and 8th Conference of the European Chapter of the Association for
Computational Linguistics, Proceedings of the Conference, 7-12 July
1997, Universidad Nacional de Educaci{\'{o}}n a Distancia (UNED),
Madrid, Spain},
pages = {366--372},
publisher = {Morgan Kaufmann Publishers / {ACL}},
year = {1997}
}
@inproceedings{DBLP:conf/acl/OchUN01,
author = {Franz Josef Och and
Nicola Ueffing and
Hermann Ney},
title = {An Efficient A* Search Algorithm for Statistical Machine Translation},
booktitle = {Proceedings of the {ACL} Workshop on Data-Driven Methods in Machine
Translation, Toulouse, France, July 7, 2001},
year = {2001}
}
@inproceedings{germann2001fast,
title={Fast Decoding and Optimal Decoding for Machine Translation},
author={Ulrich {Germann} and Michael {Jahr} and Kevin {Knight} and Daniel {Marcu} and Kenji {Yamada}},
booktitle={Proceedings of 39th Annual Meeting of the Association for Computational Linguistics},
pages={228--235},
year={2001}
}
@inproceedings{germann2003greedy,
title={Greedy decoding for statistical machine translation in almost linear time},
author={Ulrich {Germann}},
booktitle={NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1},
pages={1--8},
year={2003}
}
@inproceedings{bangalore2001a,
title={A finite-state approach to machine translation},
author={S. {Bangalore} and G. {Riccardi}},
booktitle={IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.},
pages={381--388},
year={2001}
}
@inproceedings{bangalore2000stochastic,
title={Stochastic finite-state models for spoken language machine translation},
author={Srinivas {Bangalore} and Giuseppe {Riccardi}},
booktitle={NAACL-ANLP-EMTS '00 Proceedings of the 2000 NAACL-ANLP Workshop on Embedded machine translation systems - Volume 5},
pages={52--59},
year={2000}
}
@inproceedings{venugopal2007an,
title={An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT},
author={Ashish {Venugopal} and Andreas {Zollmann} and Stephan {Vogel}},
booktitle={Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference},
pages={500--507},
year={2007}
}
%%%%% chapter 2------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......