Commit 9eb63ef7 by xiaotong

bug fixes and new pages

parent ec8c84cc
......@@ -4747,8 +4747,10 @@ GPT-2 (Transformer) & Radford et al. & 2019 & 35.7
\end{displaymath}
Here, $\vv{\textrm{word}}$ denotes the distributed vector representation of a word
\item Visualizing more words: similar words cluster together
\end{itemize}
\begin{center}
\includegraphics[scale=0.4]{./Figures/word-graph.png}
\end{center}
\end{frame}
%%%------------------------------------------------------------------------------------------------------------
......
......@@ -121,156 +121,19 @@
\subsection{Word Embeddings}
%%%------------------------------------------------------------------------------------------------------------
%%% new insights brought by pre-training
\begin{frame}{New Insights Brought by Pre-training}
\begin{itemize}
\item While pre-trained models sweep the leaderboards of task after task, a question arises:\\
what exactly does pre-training give us?
\begin{itemize}
\item Labeled data is limited; pre-training offers a way to exploit ultra-large-scale data
\item It learns general knowledge from large-scale unlabeled data, improving generalization
\item Neural networks are complex and hard to train; pre-training lets the model focus on high-density regions of good solutions
\end{itemize}
\item \textbf{A famous example}: king $\to$ queen\\
\begin{displaymath}
\vv{\textrm{king}} - \vv{\textrm{man}} + \vv{\textrm{woman}} = \vv{\textrm{queen}}
\end{displaymath}
Here, $\vv{\textrm{word}}$ denotes the distributed vector representation of a word
\item Visualizing more words: similar words cluster together
\end{itemize}
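The king $-$ man $+$ woman analogy above can be checked numerically with a nearest-neighbor search under cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real embeddings such as word2vec's are learned from large corpora and have hundreds of dimensions):

```python
import numpy as np

# Hypothetical toy vectors chosen so that the analogy holds by construction.
vectors = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.1, 0.9]),
    "man":   np.array([0.3, 0.9, 0.1]),
    "woman": np.array([0.3, 0.1, 0.9]),
    "apple": np.array([0.1, 0.5, 0.5]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman should land closest to queen.
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max((w for w in vectors if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(vectors[w], target))
print(best)  # queen
```

The same cosine-similarity search is also what makes "similar words cluster together" visible: nearby vectors are exactly those with high pairwise similarity.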
\visible<2->{
\begin{center}
\begin{tikzpicture}
\draw[name path=ellipse,thick] (0,0) circle[x radius = 2, y radius = 1];
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p1) at (0.2,0.5) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p2) at (0.3,0.6) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p3) at (0.1,-0.1) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p4) at (0.4,0) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p5) at (0.5,0.3) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p6) at (0.6,0.1) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p7) at (0.7,-0.1) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p8) at (-1.2,0.4) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p9) at (-1.0,-0.3) {};
\node[rectangle,minimum size=0.1em,inner sep=2pt,fill=red] (p10) at (-0.1,-0.8) {};
\begin{pgfonlayer}{background}
\visible<4->{
\node [rectangle,inner sep=0.4em,draw,blue] [fit = (p1) (p2) (p3) (p4) (p5) (p6)] (area) {};
}
\end{pgfonlayer}
\draw [->] (2.5,-0.7) -- (1.8,-0.5) node [pos=0,right] {\scriptsize{model parameter solution space}};
\visible<4->{
\draw [->] (2.0,0.7) -- (area.20) node [pos=0,right] {\scriptsize{high-density region of good solutions (pre-training)}};
}
\visible<3->{
\draw [->] (-2.0,0.7) -- (p8.west) node [pos=0,left] {\scriptsize{stray solutions}};
}
\end{tikzpicture}
\end{center}
}
\begin{itemize}
\item<5-> Pre-training in machine translation
\begin{itemize}
\item Pre-training has not yet dominated machine translation leaderboards: for one thing, many MT tasks already have sizable training data; for another, this reflects that the bilingual modeling of translation places new demands on pre-training
\end{itemize}
\end{itemize}
\end{frame}
%%%------------------------------------------------------------------------------------------------------------
%%% 总结
\begin{frame}{Summary - A Sigh of Relief}
\begin{itemize}
\item We have covered a lot; let us recap the main points
\begin{itemize}
\item Neural networks are not that complicated; getting started is not hard
\item Simple network structures can be combined into powerful models
\item Language models can be implemented well with neural networks; recent paradigms such as pre-training demonstrate the potential of neural language models
\end{itemize}
\item<2-> Many questions still deserve discussion
\begin{itemize}
\item Common neural network structures for NLP\\
google LSTM, GRU, and CNN
\item Deep models and how to train them. What makes deep learning ``deep''?\\
What can deep networks bring?\\
How can deep models be trained effectively?
\item How to apply neural networks to other NLP tasks, including machine translation?\\
e.g., the encoder-decoder framework
\item Practical tricks of deep learning\\
a bit of ``alchemy'': tuning and model design take task-specific skill\\
...
\end{itemize}
\end{itemize}
\end{frame}
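The encoder-decoder framework mentioned in the summary can be sketched as a forward pass with toy vanilla-RNN cells; all parameters here are hypothetical random stand-ins (a real system learns them, and modern systems add attention):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # toy hidden/embedding size

# Hypothetical randomly initialized parameters (learned in a real system).
W_enc, U_enc = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim))
W_dec, U_dec = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim))

def rnn_step(W, U, x, h):
    """One vanilla RNN step: next state from input x and previous state h."""
    return np.tanh(W @ x + U @ h)

# Encoder: read source-word embeddings, compress them into one context vector.
src = [rng.normal(size=dim) for _ in range(3)]  # stand-in source embeddings
h = np.zeros(dim)
for x in src:
    h = rnn_step(W_enc, U_enc, x, h)
context = h

# Decoder: start from the context and unroll; each state would feed a
# softmax over the target vocabulary to pick the next target word.
s = context
for t in range(2):
    s = rnn_step(W_dec, U_dec, np.zeros(dim), s)  # zero input as placeholder
print(s.shape)  # (4,)
```

The design point: the encoder turns a variable-length source sentence into a fixed-size state, and the decoder is simply a language model conditioned on that state.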
%%%------------------------------------------------------------------------------------------------------------
%%% last slide
\begin{frame}{Another Chapter Comes to an End~}
\vspace{2em}
\begin{center}
\textbf{There is much more; this was only a start}\\
\textbf{Learning deep learning takes practice and accumulated experience!}
\vspace{2em}
\begin{tikzpicture}
\tikzstyle{rnnnode} = [draw,inner sep=5pt,minimum width=4em,minimum height=1.5em,fill=green!30!white,blur shadow={shadow xshift=1pt,shadow yshift=-1pt}]
\node [anchor=west,rnnnode] (node11) at (0,0) {\tiny{RNN Cell}};
\node [anchor=west,rnnnode] (node12) at ([xshift=2em]node11.east) {\tiny{RNN Cell}};
\node [anchor=west,rnnnode] (node13) at ([xshift=2em]node12.east) {\tiny{RNN Cell}};
\node [anchor=west,rnnnode] (node14) at ([xshift=2em]node13.east) {\tiny{RNN Cell}};
\node [anchor=north,rnnnode,fill=red!30!white] (e1) at ([yshift=-1.2em]node11.south) {\tiny{embedding}};
\node [anchor=north,rnnnode,fill=red!30!white] (e2) at ([yshift=-1.2em]node12.south) {\tiny{embedding}};
\node [anchor=north,rnnnode,fill=red!30!white] (e3) at ([yshift=-1.2em]node13.south) {\tiny{embedding}};
\node [anchor=north,rnnnode,fill=red!30!white] (e4) at ([yshift=-1.2em]node14.south) {\tiny{embedding}};
\node [anchor=north] (w1) at ([yshift=-1em]e1.south) {\footnotesize{$<$s$>$}};
\node [anchor=north] (w2) at ([yshift=-1em]e2.south) {\footnotesize{Thanks}};
\node [anchor=north] (w3) at ([yshift=-1em]e3.south) {\footnotesize{everyone}};
\node [anchor=north] (w4) at ([yshift=-1em]e4.south) {\footnotesize{for listening}};
\draw [->,thick] ([yshift=0.1em]w1.north)--([yshift=-0.1em]e1.south);
\draw [->,thick] ([yshift=0.1em]w2.north)--([yshift=-0.1em]e2.south);
\draw [->,thick] ([yshift=0.1em]w3.north)--([yshift=-0.1em]e3.south);
\draw [->,thick] ([yshift=0.1em]w4.north)--([yshift=-0.1em]e4.south);
\draw [->,thick] ([yshift=0.1em]e1.north)--([yshift=-0.1em]node11.south);
\draw [->,thick] ([yshift=0.1em]e2.north)--([yshift=-0.1em]node12.south);
\draw [->,thick] ([yshift=0.1em]e3.north)--([yshift=-0.1em]node13.south);
\draw [->,thick] ([yshift=0.1em]e4.north)--([yshift=-0.1em]node14.south);
\node [anchor=south,rnnnode,fill=red!30!white] (node21) at ([yshift=1.0em]node11.north) {\tiny{Softmax($\cdot$)}};
\node [anchor=south,rnnnode,fill=red!30!white] (node22) at ([yshift=1.0em]node12.north) {\tiny{Softmax($\cdot$)}};
\node [anchor=south,rnnnode,fill=red!30!white] (node23) at ([yshift=1.0em]node13.north) {\tiny{Softmax($\cdot$)}};
\node [anchor=south,rnnnode,fill=red!30!white] (node24) at ([yshift=1.0em]node14.north) {\tiny{Softmax($\cdot$)}};
\node [anchor=south] (output1) at ([yshift=1em]node21.north) {\Large{\textbf{Thanks}}};
\node [anchor=south] (output2) at ([yshift=1em]node22.north) {\Large{\textbf{everyone}}};
\node [anchor=south] (output3) at ([yshift=1em]node23.north) {\Large{\textbf{for listening}}};
\node [anchor=south] (output4) at ([yshift=1em]node24.north) {\Large{\textbf{$<$/s$>$}}};
\draw [->,thick] ([yshift=0.1em]node21.north)--([yshift=-0.1em]output1.south);
\draw [->,thick] ([yshift=0.1em]node22.north)--([yshift=-0.1em]output2.south);
\draw [->,thick] ([yshift=0.1em]node23.north)--([yshift=-0.1em]output3.south);
\draw [->,thick] ([yshift=0.1em]node24.north)--([yshift=-0.1em]output4.south);
\draw [->,thick] ([yshift=0.1em]node11.north)--([yshift=-0.1em]node21.south);
\draw [->,thick] ([yshift=0.1em]node12.north)--([yshift=-0.1em]node22.south);
\draw [->,thick] ([yshift=0.1em]node13.north)--([yshift=-0.1em]node23.south);
\draw [->,thick] ([yshift=0.1em]node14.north)--([yshift=-0.1em]node24.south);
\draw [->,thick] ([xshift=-1em]node11.west)--([xshift=-0.1em]node11.west);
\draw [->,thick] ([xshift=0.1em]node11.east)--([xshift=-0.1em]node12.west);
\draw [->,thick] ([xshift=0.1em]node12.east)--([xshift=-0.1em]node13.west);
\draw [->,thick] ([xshift=0.1em]node13.east)--([xshift=-0.1em]node14.west);
\draw [->,thick] ([xshift=0.1em]node14.east)--([xshift=1em]node14.east);
\end{tikzpicture}
\includegraphics[scale=0.4]{./Figures/word-graph.png}
\end{center}
\end{frame}
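The pipeline in the closing diagram (word $\to$ embedding $\to$ RNN Cell $\to$ Softmax $\to$ next word) can be sketched as one forward pass of an RNN language model. Parameters below are hypothetical random stand-ins; a trained model would learn them from data:

```python
import numpy as np

rng = np.random.default_rng(42)
vocab = ["<s>", "Thanks", "everyone", "for-listening", "</s>"]
V, E, H = len(vocab), 4, 4  # vocab, embedding, hidden sizes (toy)

# Hypothetical random parameters (learned during training in practice).
emb = rng.normal(scale=0.5, size=(V, E))                 # embedding layer
W, U = rng.normal(size=(H, E)), rng.normal(size=(H, H))  # RNN cell
Wo = rng.normal(size=(V, H))                             # output projection

def softmax(z):
    """Numerically stable softmax over a 1-d array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

h = np.zeros(H)
for word in ["<s>", "Thanks", "everyone", "for-listening"]:
    x = emb[vocab.index(word)]   # embedding lookup
    h = np.tanh(W @ x + U @ h)   # RNN cell updates the hidden state
    p = softmax(Wo @ h)          # distribution over the next word
print(float(p.sum()))  # ~1.0: a valid probability distribution at each step
```

At each position the softmax output is a distribution over the whole vocabulary; the arrows in the diagram from one RNN cell to the next correspond to carrying `h` forward.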
%%%------------------------------------------------------------------------------------------------------------
......
......@@ -4750,8 +4750,10 @@ GPT-2 (Transformer) & Radford et al. & 2019 & 35.7
\end{displaymath}
Here, $\vv{\textrm{word}}$ denotes the distributed vector representation of a word
\item Visualizing more words: similar words cluster together
\end{itemize}
\begin{center}
\includegraphics[scale=0.4]{./Figures/word-graph.png}
\end{center}
\end{frame}
%%%------------------------------------------------------------------------------------------------------------
......