new pages of phrase-based modeling

Commit 7b8b7e9a authored Nov 21, 2019 by xiaotong
--- a/Section04-Phrasal-and-Syntactic-Models/section04-test.tex
+++ b/Section04-Phrasal-and-Syntactic-Models/section04-test.tex
@@ -96,148 +96,54 @@
 \section{Using Larger Translation Units}

 %%%------------------------------------------------------------------------------------------------------------
-%%% What is a phrase
-\begin{frame}{What Is a Phrase?}
-
+%%% Mathematical model
+\begin{frame}{Mathematical Model}
 \begin{itemize}
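-\item A sentence pair can be represented as a combination of phrase pairs. For example, the figure below contains three phrase translations: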
-	\begin{itemize}
-	\item 进口 $\leftrightarrow$ the imports have
-	\item 大幅度 $\leftrightarrow$ drastically
-	\item 下降 了 $\leftrightarrow$ fallen
-	\end{itemize}
-
-\begin{center}
-\begin{tikzpicture}
-
-\begin{scope}[minimum height = 18pt]
-
-\node[anchor=east] (s0) at (-0.5em, 0) {Source:};
-\node[anchor=west] (s1) at (0, 0) {进口};
-\node[anchor=west] (s2) at (3.5em, 0) {大幅度};
-\node[anchor=west] (s3) at (7.9em, 0) {下降 了};
-
-\node[anchor=west,fill=ugreen!50] (s1) at (0, 0) {进口};
-\node [anchor=west,fill=red!50] (s2) at (3.5em, 0) {大幅度};
-\node[anchor=west,fill=blue!50] (s3) at (7.9em, 0) {下降 了};
-
-\node[anchor=east] (t0) at (-0.5em, -1) {Target:};
-\node[anchor=west] (t1) at (0, -1) {the imports have};
-\node[anchor=west] (t2) at (8.4em, -1) {drastically};
-\node[anchor=west] (t3) at (14.0em, -1) {fallen};
-
-\node[anchor=west,fill=ugreen!50] (t1) at (0, -1) {the imports have};
-\node[anchor=west,fill=red!50] (t2) at (8.4em, -1) {drastically};
-\node[anchor=west,fill=blue!50] (t3) at (14.0em, -1) {fallen};
-
-\path[<->, thick] (s1.south) edge (t1.north);
-\path[<->, thick] (s2.south) edge (t2.north);
-\path[<->, thick] (s3.south) edge (t3.north);
+\item \textbf{Machine translation}: given an input source-language sentence $\textbf{s}$, find the best translation $\hat{\textbf{t}}$

-\end{scope}
+\begin{displaymath}
+\hat{\textbf{t}} = \argmax_{\textbf{t}} \textrm{P}(\textbf{t}|\textbf{s})
+\end{displaymath}

-\end{tikzpicture}
-\end{center}
-
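-\item<2-> Clearly, the phrases in the figure above are not phrases in the linguistic sense. Here we have:\\
+where $\textrm{P}(\textbf{t}|\textbf{s})$ denotes the probability of translating $\textbf{s}$ into $\textbf{t}$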

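+\item Three basic problems (recall Chapter 3)
+    \begin{enumerate}
+    \item How to define $\textrm{P}(\textbf{t}|\textbf{s})$ - the modeling problem
+    \item How to learn a statistical model of $\textrm{P}(\textbf{t}|\textbf{s})$ - the training problem
+    \item How to find the best translation - the decoding problem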
+    \end{enumerate}
 \vspace{0.3em}
-\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{Definition - Phrase}
-For a sentence $\textbf{w} = w_1...w_n$, any substring $w_i...w_j$ ($i \le j$, $0 \le i$, $j \le n$) is a \alert{phrase} of the sentence $\textbf{w}$
-\end{beamerboxesrounded}
+\item<2-> Consider the modeling problem first. $\textrm{P}(\textbf{t}|\textbf{s})$ can be written as the sum of the probabilities of all translation derivations

-	\begin{itemize}
-	\item A sentence of $n$ words has $\frac{n(n+1)}{2}$ phrases
-	\end{itemize}
+\begin{displaymath}
+\textrm{P}(\textbf{t}|\textbf{s}) = \sum_{d} \textrm{P}(d,\textbf{t}|\textbf{s})
+\end{displaymath}
+
+$d$ is a phrase-based translation derivation over $(\textbf{s},\textbf{t})$, and $\textrm{P}(d,\textbf{t}|\textbf{s})$ is the probability of the derivation $d$

 \end{itemize}
 \end{frame}

 %%%------------------------------------------------------------------------------------------------------------
-%%% What is a phrase-based translation derivation
-\begin{frame}{Bilingual Phrases}
+%%% Modeling translation derivations
+\begin{frame}{Modeling Translation Derivations}
 \begin{itemize}
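-\item Going one step further, we can define \\
-\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{Definition - Phrase Segmentation of a Sentence}
-If a sentence $\textbf{w} = w_1...w_n$ can be split into $m$ substrings, we say that $\textbf{w}$ consists of $m$ phrases, written $\textbf{w} = p_1...p_m$, where each $p_i$ is a phrase of $\textbf{w}$. $p_1...p_m$ is also called a \alert{phrase segmentation} of the sentence $\textbf{w}$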
-\end{beamerboxesrounded}
-
-\vspace{0.5em}
-\item<2-> In the bilingual case \\
-\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{Definition - Bilingual Phrase (or Phrase Pair)}
-For a source and target sentence pair ($\textbf{s}, \textbf{t}$), a phrase $\tilde{s}_i$ in $\textbf{s}$ and a phrase $\tilde{t}_j$ in $\textbf{t}$ can form a bilingual phrase pair $(\tilde{s}_i,\tilde{t}_j)$, or simply a \alert{phrase pair} $(\tilde{s}_i,\tilde{t}_j)$
-\end{beamerboxesrounded}
-
-	\begin{itemize}
-	\item For example, the sentence pair ``进口 大幅度 下降 了 $\leftrightarrow$ the imports have drastically fallen'' contains many phrase pairs, such as
+\item $\textrm{P}(\textbf{t}|\textbf{s}) = \sum_{d} \textrm{P}(d,\textbf{t}|\textbf{s})$ raises new problems:
    \begin{itemize}
-		\item 大幅度 $\leftrightarrow$ drastically
-		\item 大幅度 下降 $\leftrightarrow$ have drastically fallen
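+    \item \textbf{Phrase extraction}: how to obtain the bilingual phrases that make up $d$
+    \item \textbf{Translation modeling}: how to define $\textrm{P}(d,\textbf{t}|\textbf{s})$
+    \item \textbf{Model simplification}: how to sum $\textrm{P}(d,\textbf{t}|\textbf{s})$ over all $d$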
    \end{itemize}
+    Each of these is discussed in turn below
+\item Back to the original question: given $\textbf{s}$ and $\textbf{t}$, how do we obtain bilingual phrases
+    \begin{itemize}
+    \item Without any constraints, any mapping between a substring of $\textbf{s}$ and a substring of $\textbf{t}$ can be regarded as a bilingual phrase
    \end{itemize}
-
 \end{itemize}
 \end{frame}

 %%%------------------------------------------------------------------------------------------------------------
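-%%% Putting it together: describing translation with bilingual phrases
-\begin{frame}{Phrase-Based Translation Derivation}
-\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{Definition - Phrase-Based Translation Derivation}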
-{\small
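-For a source and target sentence pair ($\textbf{s}, \textbf{t}$) with $l$ phrase pairs $\{(\tilde{s}_i,\tilde{t}_j)\}$, if all source phrases $\{\tilde{s}_i\}$ and all target phrases $\{\tilde{t}_j\}$ form segmentations of $\textbf{s}$ and $\textbf{t}$ respectively, then the phrase pairs $\{(\tilde{s}_i,\tilde{t}_j)\}$ form a \alert{phrase-based translation derivation} (derivation for short) from $\textbf{s}$ to $\textbf{t}$, written $d(\{(\tilde{s}_i,\tilde{t}_j)\},\textbf{s},\textbf{t})$ (abbreviated as $d(\{(\tilde{s}_i,\tilde{t}_j)\})$ or $d$).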
-}
-\end{beamerboxesrounded}
-
-\vspace{-0.5em}
-\begin{center}
-\begin{tikzpicture}
- 
-\begin{scope}[minimum height = 18pt]
- 
-\node[anchor=east] (s0) at (-0.5em, 0) {$\textbf{s}$:};
-\node[anchor=west] (s1) at (0, 0) {进口};
-\node[anchor=west] (s2) at (3.5em, 0) {大幅度};
-\node[anchor=west] (s3) at (7.9em, 0) {下降 了};
- 
-\node[anchor=west,fill=ugreen!50] (s1) at (0, 0) {进口};
-\node[anchor=west,fill=red!50] (s2) at (3.5em, 0) {大幅度};
-\node[anchor=west,fill=blue!50] (s3) at (7.9em, 0) {下降 了};
- 
-\node[anchor=east] (t0) at (-0.5em, -1) {$\textbf{t}$:};
-\node[anchor=west] (t1) at (0, -1) {the imports have};
-\node[anchor=west] (t2) at (8.4em, -1) {drastically};
-\node[anchor=west] (t3) at (14.0em, -1) {fallen};
- 
-\node[anchor=west,fill=ugreen!50] (t1) at (0, -1) {the imports have};
-\node[anchor=west,fill=red!50] (t2) at (8.4em, -1) {drastically};
-\node[anchor=west,fill=blue!50] (t3) at (14.0em, -1) {fallen};
- 
-\path[<->, thick] (s1.south) edge (t1.north);
-\path[<->, thick] (s2.south) edge (t2.north);
-\path[<->, thick] (s3.south) edge (t3.north);
-
-\node[anchor=south,inner sep=0pt,yshift=-0.3em] (sp1) at (s1.north) {\scriptsize{$\tilde{s}_1$}};
-\node[anchor=south,inner sep=0pt,yshift=-0.3em] (sp2) at (s2.north) {\scriptsize{$\tilde{s}_2$}};
-\node[anchor=south,inner sep=0pt,yshift=-0.3em] (sp3) at (s3.north) {\scriptsize{$\tilde{s}_3$}};
-\node[anchor=north,inner sep=0pt,yshift=0.3em] (tp1) at (t1.south) {\scriptsize{$\tilde{t}_1$}};
-\node[anchor=north,inner sep=0pt,yshift=0.3em] (tp2) at (t2.south) {\scriptsize{$\tilde{t}_2$}};
-\node[anchor=north,inner sep=0pt,yshift=0.3em] (tp3) at (t3.south) {\scriptsize{$\tilde{t}_3$}};
- 
-\end{scope}
-\end{tikzpicture}
-\end{center}
-
-\vspace{-1.5em}
-
-\begin{itemize}
-\item $\{\tilde{s}_1,\tilde{s}_2,\tilde{s}_3\}$ is a phrase segmentation of $\textbf{s}$
-\item $\{\tilde{t}_1,\tilde{t}_2,\tilde{t}_3\}$ is a phrase segmentation of $\textbf{t}$
-\item $\{(\tilde{s}_k,\tilde{t}_k)\}$ forms a phrase-based translation derivation of $(\textbf{s},\textbf{t})$
-\end{itemize}
-
-\end{frame}
-
-%%%------------------------------------------------------------------------------------------------------------
 \section{Phrase-Based Models}

 %%%------------------------------------------------------------------------------------------------------------

--- a/Section04-Phrasal-and-Syntactic-Models/section04.tex
+++ b/Section04-Phrasal-and-Syntactic-Models/section04.tex
@@ -1024,7 +1024,7 @@
 \end{tikzpicture}
 \end{center}

-\vspace{-1.5em}
+\vspace{-1.0em}

 \begin{itemize}
 \item $\{\tilde{s}_1,\tilde{s}_2,\tilde{s}_3\}$ is a phrase segmentation of $\textbf{s}$