Merge branch 'master' into 'caorunzhe'

Master See merge request !190

Commit d93b88a7 authored Sep 12, 2020 by 曹润柘
--- a/Chapter9/Figures/fig-corresponence-between-matrix-element-and-output.tex
+++ b/Chapter9/Figures/fig-corresponence-between-matrix-element-and-output.tex
@@ -28,10 +28,10 @@

 \node [anchor=west] (y2) at ([xshift=4em]neuron02.east) {$y_2$：\scriptsize{风力}};

-\draw [->,purple!40,line width=0.4mm] (x0.east) -- (neuron02.140) node [pos=0.1,below,yshift=-0.2em] {\tiny{$w_{02}$}};
-\draw [->,purple!40,line width=0.4mm] (x1.east) -- (neuron02.160) node [pos=0.1,below] {\tiny{$w_{12}$}};
-\draw [->,purple!40,line width=0.4mm] (x2.east) -- (neuron02.180) node [pos=0.3,below] {\tiny{$b_{2}$}};
-\draw [->,purple!30,line width=0.4mm] (neuron02.east) -- (y2.west);
+\draw [->,ugreen!50,line width=0.4mm] (x0.east) -- (neuron02.140) node [pos=0.1,below,yshift=-0.2em] {\tiny{$w_{02}$}};
+\draw [->,ugreen!50,line width=0.4mm] (x1.east) -- (neuron02.160) node [pos=0.1,below] {\tiny{$w_{12}$}};
+\draw [->,ugreen!50,line width=0.4mm] (x2.east) -- (neuron02.180) node [pos=0.3,below] {\tiny{$b_{2}$}};
+\draw [->,ugreen!30,line width=0.4mm] (neuron02.east) -- (y2.west);

 \end{scope}
 \end{tikzpicture}

--- a/Chapter9/Figures/fig-perceptron-mode.tex
+++ b/Chapter9/Figures/fig-perceptron-mode.tex
@@ -6,7 +6,7 @@
 \node [anchor=center] (x0) at ([yshift=3em]x1.center) {\Large{$x_0$}};
 \node [anchor=center] (x2) at ([yshift=-3em]x1.center) {\Large{$x_2$}};
 \node [anchor=west] (y) at ([xshift=6em]neuron.east) {\Large{$y$}};
-\node [anchor=center] (neuronmath) at (neuron.center) {\red{\small{$\sum \ge \sigma$}}};
+\node [anchor=center] (neuronmath) at (neuron.center) {\small{$\sum \ge \sigma$}};

 \draw [->,thick] (x0.east) -- (neuron.150) node [pos=0.5,above] {$w_0$};
 \draw [->,thick] (x1.east) -- (neuron.180) node [pos=0.5,above] {$w_1$};

--- a/Chapter9/Figures/fig-perceptron-to-predict-2.tex
+++ b/Chapter9/Figures/fig-perceptron-to-predict-2.tex
@@ -14,7 +14,7 @@
 \draw [->,thick] (neuron.east) -- (y.west);

 \node [anchor=center] (neuronmath) at (neuron.center) {\small{$\sum \ge \sigma$}};
-\node [anchor=south] (ylabel) at (y.north) {\textbf{不去了！}};
+\node [anchor=south] (ylabel) at (y.north) {\textbf{}};


 \end{scope}

--- a/Chapter9/Figures/fig-translation.tex
+++ b/Chapter9/Figures/fig-translation.tex
@@ -19,11 +19,11 @@
 \node[above] at ([xshift=2em,yshift=1em]a2.west){1};
 \node[below] at ([xshift=-0.5em,yshift=0em]a2.west){-1};
 \node [anchor=west] (x) at ([xshift=-3.5cm,yshift=2em]a2.north) {\scriptsize{
-    $w=\begin{bmatrix}
+    $\mathbf{w}=\begin{pmatrix}
    1&0&0\\
    0&-1&0\\
    0&0&1
-    \end{bmatrix}$}
+    \end{pmatrix}$}
    };

 \node [anchor=west,rotate = 180] (x) at ([xshift=0.7em,yshift=1em]a2.south) {\Large{$\textbf{F}$}};
@@ -44,11 +44,11 @@


 \node [anchor=west] (x) at ([xshift=-4cm,yshift=2em]a3.north) {\scriptsize{
-    $b=\begin{bmatrix}
+    $\mathbf{b}=\begin{pmatrix}
    0.5&0&0\\
    0&0&0\\
    0&0&0
-    \end{bmatrix}$}
+    \end{pmatrix}$}
    };
 \draw[-stealth, line width=2pt,dashed] ([xshift=3em,yshift=1em]a2.east) to ([xshift=-3em,yshift=1em]a3.west);
 }

--- a/Chapter9/chapter9.tex
+++ b/Chapter9/chapter9.tex
@@ -514,7 +514,7 @@ l_p(\mathbf x) & = & {\Vert{\mathbf x}\Vert}_p \nonumber \\
 \end{figure}
 %-------------------------------------------

-\parinterval 同样，人工神经元是人工神经网络的基本单元。在人们的想象中，人工神经元应该与生物神经元类似。但事实上，二者在形态上是有明显差别的。如图\ref{fig:5-4} 是一个典型的人工神经元，其本质是一个形似$ y=f(\mathbf x\cdot \mathbf w+b) $的函数。显而易见，一个神经元主要由$ \mathbf x $，$ \mathbf w $，$ b $，$ f $四个部分构成。其中$ \mathbf x $是一个形如$ (x_0,x_1,\dots,x_n) $ 的实数向量，在一个神经元中担任``输入''的角色。$ \mathbf w $是一个权重矩阵，其中的每一个元素都对应着一个输入和一个输出，代表着``某输入对某输出的贡献程度''，通常也被理解为神经元连接的{\small\sffamily\bfseries{权重}}\index{权重}（weight）\index{weight}。$ b $被称作偏置，是一个实数。$ f $被称作激活函数，用于对输入向量各项加权和后进行某种变换。可见，一个人工神经元的功能是将输入向量与权重矩阵右乘（做内积）后，加上偏置量，经过一个非线性激活函数得到一个标量结果。
+\parinterval 同样，人工神经元是人工神经网络的基本单元。在人们的想象中，人工神经元应该与生物神经元类似。但事实上，二者在形态上是有明显差别的。如图\ref{fig:5-4} 是一个典型的人工神经元，其本质是一个形似$ y=f(\mathbf x\cdot \mathbf w+b) $的函数。显而易见，一个神经元主要由$ \mathbf x $，$ \mathbf w $，$ b $，$ f $四个部分构成。其中$ \mathbf x $是一个形如$ (x_0,x_1,\dots,x_n) $ 的实数向量，在一个神经元中担任``输入''的角色。$ \mathbf w $是一个权重矩阵，其中的每一个元素都对应着一个输入和一个输出，代表着``某输入对某输出的贡献程度''，通常也被理解为神经元连接的{\small\sffamily\bfseries{权重}}\index{权重}（weight）\index{weight}。$ b $被称作偏置，是一个实数。$ f $被称作激活函数，用于对输入向量各项加权和后进行某种变换。可见，一个人工神经元的功能是将输入向量与权重矩阵右乘（做内积）后，加上偏置量，经过一个激活函数得到一个标量结果。

 %----------------------------------------------
 \begin{figure}[htp]
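
Reading aid for the hunk above: the paragraph describes a neuron as a function of the form $y=f(\mathbf x\cdot \mathbf w+b)$, and the figures label the neuron with $\sum \ge \sigma$. A minimal Python sketch of that idea follows. The names `neuron` and `step`, and the threshold value `sigma = 1.0`, are assumptions made for illustration and do not come from the book's source; the inputs and weights are the ones visible in the next hunk header.

```python
from typing import Callable, Sequence


def step(z: float, sigma: float = 1.0) -> int:
    """Threshold activation: output 1 when the weighted sum reaches sigma.

    sigma = 1.0 is an assumed value chosen only for this illustration.
    """
    return 1 if z >= sigma else 0


def neuron(
    x: Sequence[float],
    w: Sequence[float],
    b: float = 0.0,
    f: Callable[[float], float] = step,
) -> float:
    """Compute y = f(x . w + b) for a single neuron."""
    # Weighted sum of the inputs (inner product of x and w), plus the bias.
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    # Apply the activation function to obtain a scalar output.
    return f(z)


# Inputs and weights as in the hunk header: 0*1 + 0*1 + 1*1 = 1.
y = neuron([0, 0, 1], [1, 1, 1])
print(y)
```

Passing a different `f` (for example a smooth function instead of the threshold) changes only the last step; the weighted sum stays the same.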
@@ -602,7 +602,7 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe
 \end{figure}
 %-------------------------------------------

-\parinterval 当然，结果是女友对这个决定非常不满意，让你跪键盘上反思一下自己。
+\parinterval 当然，结果是女友对这个决定非常不满意。

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -610,11 +610,11 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe

 \subsubsection{3. 神经元的输入\ \dash \ 离散 vs 连续}

-\parinterval 在遭受了女友一万点伤害之后，你意识到决策考虑的因素（即输入）不应该只是非0即1，而应该把``程度''考虑进来，于是你改变了三个输入的形式：
+\parinterval 在受到了女友``批评教育''之后，你意识到决策考虑的因素（即输入）不应该只是非0即1，而应该把``程度''考虑进来，于是你改变了三个输入的形式：

-\parinterval $ x_0 $：10/距离
+\parinterval $ x_0 $：10/距离（km）

-\parinterval $ x_1 $：150/票价
+\parinterval $ x_1 $：150/票价（元）

 \parinterval $ x_2 $：女朋友是否喜欢

@@ -668,7 +668,7 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe
 \vspace{0.5em}
 \item 设计有效的决策模型，即定义$ y $；
 \vspace{0.5em}
-\item 决定模型所涉及的参数（如权重$ \{w_i\} $）的最优值。
+\item 得到模型参数（如权重$ \{w_i\} $）的最优值。
 \vspace{0.5em}
 \end{itemize}

@@ -680,7 +680,7 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe

 \subsection{多层神经网络}

-\parinterval 感知机是一种最简单的单层神经网络。一个非常自然的问题是：能否把多个这样的网络叠加在一起，获得建模更复杂问题的能力？如果可以，那么在多层神经网络的每一层，神经元之间是怎么组织、工作的呢？单层网络又是通过什么方式构造成多层的呢？
+\parinterval 感知机是一种最简单的单层神经网络。一个很自然的问题是：能否把多个这样的网络叠加在一起，获得建模更复杂问题的能力？如果可以，那么在多层神经网络的每一层，神经元之间是怎么组织、工作的呢？单层网络又是通过什么方式构造成多层的呢？

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -690,6 +690,13 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe

 \parinterval 为了建立多层神经网络，首先需要把前面提到的简单的神经元进行扩展，把多个神经元组成一``层''神经元。比如，很多实际问题需要同时有多个输出，这时可以把多个相同的神经元并列起来，每个神经元都会有一个单独的输出，这就构成一``层''，形成了单层神经网络。单层神经网络中的每一个神经元都对应着一组权重和一个输出，可以把单层神经网络中的不同输出看作一个事物不同角度的描述。

+
+\parinterval 举个简单的例子，预报天气时，往往需要预测温度、湿度和风力，这就意味着如果使用单层神经网络进行预测，需要设置3个神经元。如图\ref{fig:5-10}，权重矩阵为：
+
+\begin{eqnarray}
+\mathbf w=\begin{pmatrix} w_{00} & w_{01} & w_{02}\\ w_{10} & w_{11} & w_{12}\end{pmatrix}
+\end{eqnarray}
+
 %----------------------------------------------
 \begin{figure}[htp]
 \centering
@@ -699,11 +706,6 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe
 \end{figure}
 %-------------------------------------------

-\parinterval 举个简单的例子，预报天气时，往往需要预测温度、湿度和风力，这就意味着如果使用单层神经网络进行预测，需要设置3个神经元。如图\ref{fig:5-10}，权重矩阵为：
-\begin{eqnarray}
-\mathbf w=\begin{pmatrix} w_{00} & w_{01} & w_{02}\\ w_{10} & w_{11} & w_{12}\end{pmatrix}
-\end{eqnarray}
-
 \noindent 它的第一列元素$ \begin{pmatrix} w_{00}\\ w_{10}\end{pmatrix} $是输入相对第一个输出$ y_0 $ 的权重，参数向量$ \mathbf b=(b_0,b_1,b_2) $的第一个元素$ b_0 $是对应于第一个输出$ y_0 $ 的偏置量；类似的，可以得到$ y_1 $和$ y_2 $。预测天气的单层模型如图\ref{fig:5-11}所示（在本例中，假设输入$ \mathbf x=(x_0,x_1) $）。

 %----------------------------------------------
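
Reading aid for the hunk above: in the weather example the input is $\mathbf x=(x_0,x_1)$, the weight matrix $\mathbf w$ is $2\times 3$, and column $j$ of $\mathbf w$ together with $b_j$ produces output $y_j$. The sketch below shows one way this could be computed in plain Python. The function name `layer`, the sigmoid activation, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(z: float) -> float:
    """One common activation function; the choice here is an assumption."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(
    x: Sequence[float],
    w: Sequence[Sequence[float]],
    b: Sequence[float],
    f: Callable[[float], float] = sigmoid,
) -> List[float]:
    """Compute y = f(x . w + b) for a single layer.

    w[i][j] is the weight of input x[i] for neuron j, so column j of w
    holds every weight that contributes to output y[j].
    """
    outputs = []
    for j in range(len(b)):
        # Weighted sum over all inputs for neuron j, plus its bias b[j].
        z = sum(x[i] * w[i][j] for i in range(len(x))) + b[j]
        outputs.append(f(z))
    return outputs


# Hypothetical numbers: two inputs, three outputs (temperature, humidity, wind).
x = [1.0, 2.0]
w = [[1.0, 0.0, -1.0],
     [0.5, 2.0, 0.0]]
b = [0.0, 1.0, 0.5]

print(layer(x, w, b))                   # with the sigmoid activation
print(layer(x, w, b, f=lambda z: z))    # linear part only: x . w + b
```

Every neuron receives the same input vector, which is why a single matrix product covers the whole layer.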
@@ -717,31 +719,22 @@ x_0\cdot w_0+x_1\cdot w_1+x_2\cdot w_2 & = & 0\cdot 1+0\cdot 1+1\cdot 1 \nonumbe

 \parinterval 在神经网络中，对于输入向量$ \mathbf x\in R^m $，一层神经网络首先将其经过线性变换映射到$ R^n $，再经过激活函数变成$  \mathbf y\in R^n $。还是上面天气预测的例子，每个神经元获得相同的输入，权重矩阵$ \mathbf w $是一个$ 2\times 3 $矩阵，矩阵中每个元素$ w_{ij} $代表第$ j $个神经元中$ x_{i} $对应的权重值，假设编号为0的神经元负责预测温度，则$ w_{i0} $含义为预测温度时，输入$ x_{i} $对其影响程度。此外所有神经元的偏置$ b_{0} $，$ b_{1} $，$ b_{2} $组成了最终的偏置向量$ \mathbf b $。在该例中则有，权重矩阵$ \mathbf w=\begin{pmatrix} w_{00} & w_{01} & w_{02}\\ w_{10} & w_{11} & w_{12}\end{pmatrix} $，偏置向量$ \mathbf b=(b_0,b_1,b_2) $。

-%----------------------------------------------
-\begin{figure}[htp]
-\centering
-\input{./Chapter9/Figures/fig-rotation}
-\caption{ $ \mathbf w $对$ \mathbf x $的旋转作用}
-\label{fig:5-12}
-\end{figure}
-%-------------------------------------------
-
 \parinterval 那么，线性变换的本质是什么？

 \begin{itemize}
 \vspace{0.5em}
 \item 从代数角度看，对于线性空间$ \rm V $，任意$ a,b\in {\rm V} $和数域中的任意$ \alpha $，线性变换$ T(\cdot) $需满足：$ T(a+b)=T(a)+T(b) $，且$ T(\alpha a)=\alpha T(a) $；
 \vspace{0.5em}
-\item 从几何角度上看，公式中的$ \mathbf x\cdot \mathbf w+\mathbf b $将$ \mathbf x $右乘$ \mathbf w $相当于对$ \mathbf x $进行旋转变换，如图\ref{fig:5-12}所示，对三个点$ (0,0) $，$ (0,1) $，$ (1,0) $及其围成的矩形区域右乘如下矩阵：
+\item 从几何角度看，公式中的$ \mathbf x\cdot \mathbf w+\mathbf b $将$ \mathbf x $右乘$ \mathbf w $相当于对$ \mathbf x $进行旋转变换。例如，对三个点$ (0,0) $，$ (0,1) $，$ (1,0) $及其围成的矩形区域右乘如下矩阵：
+    
    \begin{eqnarray}
    \mathbf w=\begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 1\end{pmatrix}
    \end{eqnarray}
-    这样，矩形区域由第一象限旋转90度到了第四象限。
+    
+    这样，矩形区域由第一象限旋转90度到了第四象限，如图\ref{fig:5-13}第一步所示。公式$ \mathbf x\cdot \mathbf w+\mathbf b $中的$ \mathbf b $相当于对其进行平移变换。其过程如图\ref{fig:5-13}第二步所示，偏置矩阵$ \mathbf b=\begin{pmatrix} 0.5 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{pmatrix} $将矩形区域沿x轴向右平移了一段距离。
 \vspace{0.5em}
 \end{itemize}

-\parinterval 公式$ \mathbf x\cdot \mathbf w+\mathbf b $中的公式中的$ \mathbf b $相当于对其进行平移变换。其过程如图\ref{fig:5-13}所示，偏置矩阵$ \mathbf b=\begin{pmatrix} 0.5 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{pmatrix} $将矩形区域沿x轴向右平移了一段距离。
-
 %----------------------------------------------
 \begin{figure}[htp]
 \centering