Merge branch 'caorunzhe' into 'master'

Caorunzhe: see merge request !897

Commit cf17ccf5 authored Jan 14, 2021 by 曹润柘
--- a/Chapter12/chapter12.tex
+++ b/Chapter12/chapter12.tex
@@ -325,11 +325,11 @@
 \begin{itemize}
 \vspace{0.5em}
-\item 首先，将$\mathbi{Q}$、$\mathbi{K}$、$\mathbi{V}$分别通过线性（Linear）变换的方式映射为$h$个子集。即$\mathbi{Q}_i = \mathbi{Q}\mathbi{W}_i^{\,Q} $、$\mathbi{K}_i = \mathbi{K}\mathbi{W}_i^{\,K} $、$\mathbi{V}_i = \mathbi{V}\mathbi{W}_i^{\,V} $，其中$i$表示第$i$个头， $\mathbi{W}_i^{\,Q}  \in \mathbb{R}^{d_{model} \times d_k}$,  $\mathbi{W}_i^{\,K}  \in \mathbb{R}^{d_{model} \times d_k}$,  $\mathbi{W}_i^{\,V}  \in \mathbb{R}^{d_{model} \times d_v}$是参数矩阵; $d_k=d_v=d_{model} / h$，对于不同的头采用不同的变换矩阵，这里$d_{model}$表示每个隐层向量的维度；
+\item 首先，将$\mathbi{Q}$、$\mathbi{K}$、$\mathbi{V}$分别通过线性（Linear）变换的方式映射为$h$个子集。即$\mathbi{Q}_i = \mathbi{Q}\mathbi{W}_i^{\,Q} $、$\mathbi{K}_i = \mathbi{K}\mathbi{W}_i^{\,K} $、$\mathbi{V}_i = \mathbi{V}\mathbi{W}_i^{\,V} $，其中$i$表示第$i$个头， $\mathbi{W}_i^{\,Q}  \in \mathbb{R}^{d_{\textrm{model}} \times d_k}$,  $\mathbi{W}_i^{\,K}  \in \mathbb{R}^{d_{\textrm{model}} \times d_k}$,  $\mathbi{W}_i^{\,V}  \in \mathbb{R}^{d_{\textrm{model}} \times d_v}$是参数矩阵; $d_k=d_v=d_{\textrm{model}} / h$，对于不同的头采用不同的变换矩阵，这里$d_{\textrm{model}}$表示每个隐层向量的维度；
 \vspace{0.5em}
 \item 其次，对每个头分别执行点乘注意力操作，并得到每个头的注意力操作的输出$\mathbi{head}_i$；
 \vspace{0.5em}
-\item 最后，将$h$个头的注意力输出在最后一维$d_v$进行拼接（Concat）重新得到维度为$hd_v$的输出，并通过对其右乘一个权重矩阵$\mathbi{W}^{\,o}$进行线性变换，从而对多头计算得到的信息进行融合，且将多头注意力输出的维度映射为模型的隐层大小（即$d_{model}$），这里参数矩阵$\mathbi{W}^{\,o} \in \mathbb{R}^{h d_v \times d_{model}}$。
+\item 最后，将$h$个头的注意力输出在最后一维$d_v$进行拼接（Concat）重新得到维度为$hd_v$的输出，并通过对其右乘一个权重矩阵$\mathbi{W}^{\,o}$进行线性变换，从而对多头计算得到的信息进行融合，且将多头注意力输出的维度映射为模型的隐层大小（即$d_{\textrm{model}}$），这里参数矩阵$\mathbi{W}^{\,o} \in \mathbb{R}^{h d_v \times d_{\textrm{model}}}$。
 \vspace{0.5em}
 \end{itemize}
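
The three itemized steps in the hunk above can be summarized in two equations. This is only a sketch: it reuses the book's own `\mathbi` macro and the symbols from the diff, while the operator names `\textrm{Attention}` and `\textrm{MultiHead}` are labels assumed here for illustration and may differ from what the chapter uses elsewhere.

```latex
% Sketch of the multi-head attention steps described in the itemize above.
% Assumption: \mathbi is the bold-italic macro defined in the book's preamble.
% Assumption: Attention(.) denotes the per-head dot-product attention of step 2.
\begin{eqnarray}
\mathbi{head}_i & = & \textrm{Attention}(\mathbi{Q}\mathbi{W}_i^{\,Q},\ \mathbi{K}\mathbi{W}_i^{\,K},\ \mathbi{V}\mathbi{W}_i^{\,V}) \\
\textrm{MultiHead}(\mathbi{Q},\mathbi{K},\mathbi{V}) & = & \textrm{Concat}(\mathbi{head}_1,...,\mathbi{head}_h)\,\mathbi{W}^{\,o}
\end{eqnarray}
% Dimension check with illustrative values (not taken from this diff):
% d_{\textrm{model}} = 512, h = 8  =>  d_k = d_v = 512 / 8 = 64,
% Concat output has dimension h d_v = 8 \times 64 = 512,
% so \mathbi{W}^{\,o} \in \mathbb{R}^{512 \times 512} maps back to d_{\textrm{model}}.
```

Because $d_k = d_v = d_{\textrm{model}} / h$, the concatenated dimension $h d_v$ always equals $d_{\textrm{model}}$, which is why $\mathbi{W}^{\,o}$ is square under this setting.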

--- a/Chapter17/Figures/figure-cascading-speech-translation.tex
+++ b/Chapter17/Figures/figure-cascading-speech-translation.tex
@@ -10,7 +10,7 @@
 \node(process_2)[process,fill=blue!20,right of = process_1,xshift=7.0cm,text width=4cm,align=center]{\baselineskip=4pt\LARGE{[[0.2,...,0.3], \qquad ..., \qquad  0.3,...,0.5]]}\par};
 \node(text_2)[below of = process_2,yshift=-2cm,scale=1.5]{语音特征};
 \node(process_3)[process,fill=orange!20,minimum width=6cm,minimum height=5cm,right of = process_2,xshift=8.2cm,text width=4cm,align=center]{};
-\node(text_3)[below of = process_3,yshift=-3cm,scale=1.5]{源语文本及其词格};
+\node(text_3)[below of = process_3,yshift=-3cm,scale=1.5]{源语言文本及其词格};
 \node(cir_s)[cir,very thick, below of = process_3,xshift=-2.2cm,yshift=1.1cm]{\LARGE S};
 \node(cir_a)[cir,right of = cir_s,xshift=1cm,yshift=0.8cm]{\LARGE a};
 \node(cir_c)[cir,right of = cir_a,xshift=1.2cm,yshift=0cm]{\LARGE c};

--- a/Chapter17/chapter17.tex
+++ b/Chapter17/chapter17.tex
--- a/bibliography.bib
+++ b/bibliography.bib