Commit 90ed014e by 姜雨帆

Update Transformer

parent 1485b49e
...@@ -2758,34 +2758,63 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$ ...@@ -2758,34 +2758,63 @@ $\textrm{``you''} = \argmax_{y} \textrm{P}(y|\textbf{s}_1, \alert{\textbf{C}})$
\begin{itemize} \begin{itemize}
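\item With an NMT model in hand, how should we use gradient descent to train a translation model? In other words, which factors affect RNN training? \item With an NMT model in hand, how should we use gradient descent to train a translation model? In other words, which factors affect RNN training?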
\end{itemize} \end{itemize}
\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{\small{\textbf{Parameter Initialization}}} \begin{center}
{\footnotesize \begin{tikzpicture}
\begin{spacing}{0.9} \begin{scope}
Given a model architecture, the quality of initialization determines the model's final performance.
\end{spacing} \node [anchor=south west,draw,inner sep=0.7em,minimum width=3em,fill=blue!20!white] (c1) at (0,0) {Parameter Initialization};
} \node [anchor=north,draw,inner sep=0.7em,minimum width=3em,fill=yellow!20!white] (c2) at ([yshift=-1em]c1.south) {Optimizer Selection};
\end{beamerboxesrounded} \node [anchor=north,draw,inner sep=0.7em,minimum width=3em,fill=red!20!white] (c3) at ([yshift=-1em]c2.south) {Learning Rate Scheduling};
\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{\small{\textbf{Optimizer Selection}}} \node [anchor=north,draw,inner sep=0.7em,minimum width=3em,fill=ugreen!20!white] (c4) at ([yshift=-1em]c3.south) {Multi-Device Acceleration};
{\footnotesize
\begin{spacing}{0.9}
Choosing an optimizer means trading off ease of use against effectiveness.
\end{spacing} \node [anchor=east] (line1) at ([xshift=-1.5em,yshift=0em]c1.west) {Given a model architecture,};
} \node [anchor=north west] (line2) at ([yshift=0.3em]line1.south west) {initialization quality determines};
\end{beamerboxesrounded} \node [anchor=north west] (line3) at ([yshift=0.3em]line2.south west) {the model's final performance};
\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{\small{\textbf{Learning Rate Scheduling}}}
{\footnotesize
\begin{spacing}{0.9} \node [anchor=west] (line11) at ([xshift=1.5em,yshift=0em]c1.east) {Choosing an optimizer};
A suitable learning rate schedule makes training both better and faster. \node [anchor=north west] (line12) at ([yshift=0.3em]line11.south west) {means trading off ease};
\end{spacing} \node [anchor=north west] (line13) at ([yshift=0.3em]line12.south west) {of use against effectiveness};
}
\end{beamerboxesrounded}
\begin{beamerboxesrounded}[upper=uppercolblue,lower=lowercolblue,shadow=true]{\small{\textbf{Multi-Device Acceleration}}} \node [anchor=west] (line21) at ([yshift=-7em]line1.west) {A suitable learning rate};
{\footnotesize \node [anchor=north west] (line22) at ([yshift=0.3em]line21.south west) {schedule makes training};
\begin{spacing}{0.9} \node [anchor=north west] (line23) at ([yshift=0.3em]line22.south west) {both better and faster};
When training is very slow, parallel computation across multiple devices can speed it up.
\end{spacing}
} \node [anchor=west] (line31) at ([yshift=-7em]line11.west) {When training is very slow,};
\end{beamerboxesrounded} \node [anchor=north west] (line32) at ([yshift=0.3em]line31.south west) {parallel computation across};
\node [anchor=north west] (line33) at ([yshift=0.3em]line32.south west) {multiple devices can speed it up};
\draw [->,very thick] ([yshift=-0.1em]c1.south) -- ([yshift=0.1em]c2.north);
\draw [->,very thick] ([yshift=-0.1em]c2.south) -- ([yshift=0.1em]c3.north);
\draw [->,very thick] ([yshift=-0.1em]c3.south) -- ([yshift=0.1em]c4.north);
\begin{pgfonlayer}{background}
\node [rectangle,inner sep=0.2em,rounded corners=1pt,fill=blue!10,drop shadow,draw=blue] [fit = (line1) (line2) (line3)] (box1) {};
\draw [->,dotted,very thick,blue] ([yshift=1.5em,xshift=1em]box1.east) -- ([yshift=1.5em,xshift=0.1em]box1.east);
\node [rectangle,inner sep=0.2em,rounded corners=1pt,fill=yellow!20!white,drop shadow,draw=black] [fit = (line11) (line12) (line13)] (box2) {};
\draw [->,dotted,very thick,black] ([yshift=-1.5em,xshift=-1em]box2.west) -- ([yshift=-1.5em,xshift=-0.1em]box2.west);
\node [rectangle,inner sep=0.2em,rounded corners=1pt,fill=red!20,drop shadow,draw=red] [fit = (line21) (line22) (line23)] (box3) {};
\draw [->,dotted,very thick,red] ([xshift=1em,yshift=1.5em]box3.east) -- ([xshift=0.1em,yshift=1.5em]box3.east) ;
\node [rectangle,inner sep=0.2em,rounded corners=1pt,fill=ugreen!10,drop shadow,draw=ugreen] [fit = (line31) (line32) (line33)] (box4) {};
\draw [->,dotted,very thick,ugreen] ([yshift=-1.5em,xshift=-1em]box4.west) -- ([yshift=-1.5em,xshift=-0.1em]box4.west);
\end{pgfonlayer}
\end{scope}
\end{tikzpicture}
\end{center}
\end{frame} \end{frame}
\begin{frame}{Training - Initialization} \begin{frame}{Training - Initialization}
......