update 17.2

Commit be0790f5 authored Dec 19, 2020 by 曹润柘
--- a/Chapter17/chapter17.tex
+++ b/Chapter17/chapter17.tex
@@ -253,7 +253,8 @@
 \subsection{Image-Enhanced Text Translation}

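 \parinterval Introducing image information into text translation is the most typical multimodal machine translation task. Multimodal machine translation is still a conversion from source-language text to target-language text, but information from other modalities is incorporated during the conversion to reduce ambiguity. Take the earlier example: with an image related to the source sentence, “bank” in “A medium sized child jumps off of a dusty bank” is translated as “河岸” (river bank) rather than “银行” (financial bank). Given a relevant picture, the machine translation model can use visual information to better understand the ambiguous word and avoid a wrong reading. In other words, for the same image or visual scene, the source-language and target-language descriptions share the same essential meaning and differ only in how it is expressed linguistically. The image therefore provides implicit alignment “constraints” between the source and target languages. Integrating these “constraints” into a machine translation system deepens the model's understanding of the context of certain ambiguous words and thereby further improves translation quality.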
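-\parinterval In 2016, the WMT machine translation evaluation introduced multimodal machine translation, which combines images and text, for the first time as a shared task on machine translation and cross-lingual image description[2], and the task has since been widely studied[5-6]. How to incorporate visual information and better understand multimodal contextual semantics is a focus of multimodal machine translation research. The main research directions include methods based on feature fusion[7, 15, 17] and methods based on multi-task learning[18, 21]. The following introduces research on multimodal machine translation along these two directions.
+
+\parinterval In 2016, the WMT machine translation evaluation introduced multimodal machine translation, which combines images and text, for the first time as a shared task on machine translation and cross-lingual image description\upcite{DBLP:conf/wmt/SpeciaFSE16}, and the task has since been widely studied\upcite{DBLP:conf/wmt/CaglayanABGBBMH17,DBLP:conf/wmt/LibovickyHTBP16}. How to incorporate visual information and better understand multimodal contextual semantics is a focus of multimodal machine translation research. The main research directions include methods based on feature fusion\upcite{DBLP:conf/emnlp/CalixtoL17,DBLP:journals/corr/abs-1712-03449,DBLP:conf/wmt/HelclLV18} and methods based on multi-task learning\upcite{DBLP:conf/ijcnlp/ElliottK17,DBLP:conf/acl/YinMSZYZL20}. The following introduces research on multimodal machine translation along these two directions.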

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -261,7 +262,7 @@

 \subsubsection{1. Methods Based on Feature Fusion}

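-\parinterval Earlier work usually treats image information as part of the input sentence[7-8], or uses it to initialize the states of the encoder and decoder[7, 9-10]. As shown in Figure 2, image features are usually extracted with a convolutional neural network; see {\chaptereleven} for details on convolutional neural networks. The global visual feature obtained from the convolutional neural network is passed through a dimension transformation and then introduced into the model either as part of the source-language input or as an initial state. However, this way of introducing image information has the following two drawbacks:
+\parinterval Earlier work usually treats image information as part of the input sentence\upcite{DBLP:conf/emnlp/CalixtoL17,DBLP:conf/wmt/HuangLSOD16}, or uses it to initialize the states of the encoder and decoder\upcite{DBLP:conf/emnlp/CalixtoL17,Elliott2015MultilingualID,DBLP:conf/wmt/MadhyasthaWS17}. As shown in Figure 2, image features are usually extracted with a convolutional neural network; see {\chaptereleven} for details on convolutional neural networks. The global visual feature obtained from the convolutional neural network is passed through a dimension transformation and then introduced into the model either as part of the source-language input or as an initial state. However, this way of introducing image information has the following two drawbacks: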

 \begin{itemize}
    \vspace{0.5em}
@@ -305,7 +306,7 @@

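 \noindent where ${\alpha}_{i,j}$ is the attention weight, which indicates how relevant position $j$ of the target language is to position $i$ of the encoded image state sequence. It is computed in the same way as the attention function described in {\chapterten}.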

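-\parinterval Here, the encoder output $\mathbi{h}_{i}$ at each time step is regarded as the representation of position $i$ in the source image sequence. Figure 3 visualizes how much attention the model pays to each image region when it generates the target word “man”. After the attention mechanism is applied, the model focuses more on the image regions related to the target word. Of course, the input of multimodal machine translation also includes the source-language word sequence. In general, the source-language text contributes more to translation than the image does[23]. From this perspective, image information serves more as a supplement to textual information than as a replacement for it. Beyond this, attention mechanisms have been studied extensively in multimodal machine translation. Some work feeds attention-weighted textual and visual features into the decoder as part of the decoding input, while other work models attention between the source language and the image on the encoder side[22, 23] to obtain better source-language representations.
+\parinterval Here, the encoder output $\mathbi{h}_{i}$ at each time step is regarded as the representation of position $i$ in the source image sequence. Figure 3 visualizes how much attention the model pays to each image region when it generates the target word “man”. After the attention mechanism is applied, the model focuses more on the image regions related to the target word. Of course, the input of multimodal machine translation also includes the source-language word sequence. In general, the source-language text contributes more to translation than the image does\upcite{DBLP:conf/acl/YaoW20}. From this perspective, image information serves more as a supplement to textual information than as a replacement for it. Beyond this, attention mechanisms have been studied extensively in multimodal machine translation. Some work feeds attention-weighted textual and visual features into the decoder as part of the decoding input, while other work models attention between the source language and the image on the encoder side\upcite{DBLP:journals/corr/abs-1712-03449,DBLP:conf/acl/YaoW20} to obtain better source-language representations.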

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
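The attention step touched by the hunk above can be illustrated with a minimal Python sketch. It assumes dot-product scoring followed by a softmax, which is one common choice and not necessarily the scoring function used in the chapter; the names `softmax`, `attend`, `regions`, and `query` are hypothetical. Each region vector plays the role of $\mathbi{h}_{i}$, the returned weights play the role of ${\alpha}_{i,j}$ for a single target position $j$, and the context vector is their weighted sum.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, regions):
    """Return (weights, context) for one decoder state over image regions.

    query   -- decoder state at target position j (list of floats)
    regions -- image region encodings h_i (list of equal-length lists)
    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [sum(q * h for q, h in zip(query, region)) for region in regions]
    weights = softmax(scores)  # alpha_{i,j} for a fixed j
    dim = len(regions[0])
    # Context vector: attention-weighted sum of the region encodings.
    context = [sum(w * region[d] for w, region in zip(weights, regions))
               for d in range(dim)]
    return weights, context


# Toy example: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend(query, regions)
```

In this toy input the first and third regions score equally against the query and the second scores lowest, so the second region contributes least to the context vector. A real model would compute the scores with learned parameters.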
@@ -315,7 +316,7 @@

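 \parinterval Methods based on multi-task learning usually combine the translation task with other vision tasks for joint training. Multi-task learning has already been mentioned in {\chapterfifteen} and {\chaptersixteen}. A common multi-task learning framework shares part of the model parameters across several related tasks to learn what the tasks have in common, and uses task-specific modules to learn what is unique to each task. In multimodal machine translation, the main strategy for applying multi-task learning is to treat translation as the main task and to set up several sub-tasks related to other modalities, which help the model understand the linguistic knowledge of the source language.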

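-\parinterval As shown in Figure 4, the multimodal machine translation task can be decomposed into two sub-tasks: machine translation and image generation[18]. Machine translation is the main task and image generation is the sub-task. Image generation here means generating the corresponding image from an image description; the image generation task is discussed later. A single encoder models the source-language data, and two decoders (a translation decoder and an image decoder) learn the translation task and the image generation task. The top task-specific layers learn the features unique to each task, while the shared parameter layers at the bottom can learn richer textual representations. In addition, research in visual question answering suggests[24] that stacking multiple attention layers is inadvisable in multimodal tasks because it causes severe overfitting. From another angle, improving generalization through multi-task learning is also an effective way to prevent overfitting. Similar ideas are widely used in multimodal natural language processing, for example in image captioning and visual question answering[42].
+\parinterval As shown in Figure 4, the multimodal machine translation task can be decomposed into two sub-tasks: machine translation and image generation\upcite{DBLP:conf/ijcnlp/ElliottK17}. Machine translation is the main task and image generation is the sub-task. Image generation here means generating the corresponding image from an image description; the image generation task is discussed later. A single encoder models the source-language data, and two decoders (a translation decoder and an image decoder) learn the translation task and the image generation task. The top task-specific layers learn the features unique to each task, while the shared parameter layers at the bottom can learn richer textual representations. In addition, research in visual question answering suggests\upcite{DBLP:conf/nips/LuYBP16} that stacking multiple attention layers is inadvisable in multimodal tasks because it causes severe overfitting. From another angle, improving generalization through multi-task learning is also an effective way to prevent overfitting. Similar ideas are widely used in multimodal natural language processing, for example in image captioning and visual question answering\upcite{DBLP:conf/iccv/AntolALMBZP15}.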

 %----------------------------------------------------------------------------------------------------
 \begin{table}[htp]
@@ -341,7 +342,7 @@
 \end{table}
 %----------------------------------------------------------------------------------------------------

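-\parinterval Traditional image captioning follows two paradigms: retrieval-based methods and template-based methods. A retrieval-based method (Figure 5, left) selects a sentence from a given set of candidate captions as the description of the image. Its drawback is that the selected sentence may match the image poorly. A template-based method (Figure 5, right) detects visual features in the image and fills the detected content into a predesigned template. Its drawback is that the generated captions are too rigid, as if all “cast from the same mold”. In recent years, convolutional neural networks have proved highly effective in computer vision and recurrent neural networks in natural language processing. Inspired by the encoder-decoder framework of machine translation, an encoder-decoder framework that encodes the image with a convolutional neural network and decodes the caption with a recurrent neural network has gradually become the basic paradigm for image captioning. This section starts from this basic encoder-decoder paradigm[25,26] and then introduces improvements to the encoder and to the decoder.
+\parinterval Traditional image captioning follows two paradigms: retrieval-based methods and template-based methods. A retrieval-based method (Figure 5, left) selects a sentence from a given set of candidate captions as the description of the image. Its drawback is that the selected sentence may match the image poorly. A template-based method (Figure 5, right) detects visual features in the image and fills the detected content into a predesigned template. Its drawback is that the generated captions are too rigid, as if all “cast from the same mold”. In recent years, convolutional neural networks have proved highly effective in computer vision and recurrent neural networks in natural language processing. Inspired by the encoder-decoder framework of machine translation, an encoder-decoder framework that encodes the image with a convolutional neural network and decodes the caption with a recurrent neural network has gradually become the basic paradigm for image captioning. This section starts from this basic encoder-decoder paradigm\upcite{DBLP:conf/cvpr/VinyalsTBE15,DBLP:conf/icml/XuBKCCSZB15} and then introduces improvements to the encoder and to the decoder.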

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -349,7 +350,7 @@

 \subsubsection{1. Basic Framework}

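-\parinterval Inspired by neural machine translation, the encoder-decoder framework has also been applied to image captioning. The encoder converts the input image into a new “representation” that contains all the information of the input image, and the decoder then converts this “representation” into the output caption. Figure XX (top) shows the encoder-decoder framework applied to image caption generation[25]. First, a convolutional neural network extracts image features into a vector representation of a suitable length. Then a long short-term memory network (LSTM) decodes this representation into a text description, in a process similar to decoding in machine translation. This modeling approach has a shortcoming: generating a given caption word does not necessarily require all of the image information, and feeding the global image information into the model may introduce noise and make the “representation” inaccurate. To address this limitation, the model in Figure XX (bottom)[26] introduces an attention mechanism. With attention, the model no longer attends only to the global image feature when generating different words, but to the image features it “should” attend to.
+\parinterval Inspired by neural machine translation, the encoder-decoder framework has also been applied to image captioning. The encoder converts the input image into a new “representation” that contains all the information of the input image, and the decoder then converts this “representation” into the output caption. Figure XX (top) shows the encoder-decoder framework applied to image caption generation\upcite{DBLP:conf/cvpr/VinyalsTBE15}. First, a convolutional neural network extracts image features into a vector representation of a suitable length. Then a long short-term memory network (LSTM) decodes this representation into a text description, in a process similar to decoding in machine translation. This modeling approach has a shortcoming: generating a given caption word does not necessarily require all of the image information, and feeding the global image information into the model may introduce noise and make the “representation” inaccurate. To address this limitation, the model in Figure XX (bottom)\upcite{DBLP:conf/icml/XuBKCCSZB15} introduces an attention mechanism. With attention, the model no longer attends only to the global image feature when generating different words, but to the image features it “should” attend to.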

 %----------------------------------------------------------------------------------------------------
 \begin{table}[htp]
@@ -375,9 +376,9 @@

 \subsubsection{2. Improvements to the Encoder}

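-\parinterval For the encoder-decoder framework to be fully effective in image captioning, the encoder must represent image information better. Most improvements to the encoder start from this point. Typically, they add semantic information[27,28,29] and positional information[28,31] about the image to the encoder.
+\parinterval For the encoder-decoder framework to be fully effective in image captioning, the encoder must represent image information better. Most improvements to the encoder start from this point. Typically, they add semantic information\upcite{DBLP:conf/cvpr/YouJWFL16,DBLP:conf/cvpr/ChenZXNSLC17,DBLP:journals/pami/FuJCSZ17} and positional information\upcite{DBLP:conf/cvpr/ChenZXNSLC17,DBLP:conf/ijcai/LiuSWWY17} about the image to the encoder.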

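-\parinterval The semantic information of an image generally refers to the entities, attributes, scenes, and so on that appear in it. As shown in Figure XX, attribute or entity detectors extract attribute words and entity words such as “child”, “river”, and “bank” from the image as its semantic information. The global image feature is extracted to initialize the recurrent neural network, and an attention mechanism then computes attention weights between the target word and the attribute or entity words. A context vector is computed from these weights, so that the encoded semantic information is passed to the decoder[27]. When decoding the word ‘bank’, the model pays more attention to ‘bank’ in the image's semantic information. Besides entities and attributes, scene information of the image can also be added to the encoder[29]. Detecting attributes, entities, and scenes involves work on object detection, such as Faster-RCNN[32] and YOLO[33,34], which is not covered in detail here.
+\parinterval The semantic information of an image generally refers to the entities, attributes, scenes, and so on that appear in it. As shown in Figure XX, attribute or entity detectors extract attribute words and entity words such as “child”, “river”, and “bank” from the image as its semantic information. The global image feature is extracted to initialize the recurrent neural network, and an attention mechanism then computes attention weights between the target word and the attribute or entity words. A context vector is computed from these weights, so that the encoded semantic information is passed to the decoder\upcite{DBLP:conf/cvpr/YouJWFL16}. When decoding the word ‘bank’, the model pays more attention to ‘bank’ in the image's semantic information. Besides entities and attributes, scene information of the image can also be added to the encoder\upcite{DBLP:journals/pami/FuJCSZ17}. Detecting attributes, entities, and scenes involves work on object detection, such as Faster-RCNN\upcite{DBLP:journals/pami/RenHG017} and YOLO\upcite{DBLP:journals/corr/abs-1804-02767,DBLP:journals/corr/abs-2004-10934}, which is not covered in detail here.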

 %----------------------------------------------------------------------------------------------------
 \begin{table}[htp]
@@ -387,7 +388,7 @@
 \end{table}
 %----------------------------------------------------------------------------------------------------

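-\parinterval Most of the methods above map the entities, attributes, scenes, and so on in an image to words and add this information to the encoder explicitly. Another approach applies the semantic features of the image to the encoder implicitly[28]. For example, image data can be decomposed into three channels (red, green, and blue). Put simply, each pixel of the image is split into a red part, a green part, and a blue part, which divides the image into three channels. In many images, different channels carry different features, and these can be applied on the encoder side. Yet another approach enhances the encoder with positional information. Positional information refers to the positions of the objects in the image. An object detection system is used to obtain the objects in the image and their corresponding features, which determines where the objects are located. Clearly, this information can also be added to the encoder to strengthen its representational power[30].
+\parinterval Most of the methods above map the entities, attributes, scenes, and so on in an image to words and add this information to the encoder explicitly. Another approach applies the semantic features of the image to the encoder implicitly\upcite{DBLP:conf/cvpr/ChenZXNSLC17}. For example, image data can be decomposed into three channels (red, green, and blue). Put simply, each pixel of the image is split into a red part, a green part, and a blue part, which divides the image into three channels. In many images, different channels carry different features, and these can be applied on the encoder side. Yet another approach enhances the encoder with positional information. Positional information refers to the positions of the objects in the image. An object detection system is used to obtain the objects in the image and their corresponding features, which determines where the objects are located. Clearly, this information can also be added to the encoder to strengthen its representational power\upcite{DBLP:conf/eccv/YaoPLM18}.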

 %----------------------------------------------------------------------------------------
 %    NEW SUBSUB-SECTION
@@ -395,8 +396,8 @@

 \subsubsection{3. Improvements to the Decoder}

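-\parinterval Because the decoder outputs a sequence of natural-language words, it can be improved by taking the characteristics of language into account. For example, during decoding, prepositions and articles such as “the”, “on”, and “at” have little correlation with the image, so introducing image information at these steps has a negative effect[35]. A structure such as a gate can therefore be used to control how strongly the visual signal influences text generation. In addition, each generated word may correspond to a different region of the image during decoding. More effective attention mechanisms can therefore be designed to capture how much the decoder attends to different local parts of the image[36].
-\parinterval Besides making the generated text interact better with image features on the decoder side, there are other directions for improving the decoder. For example, the recurrent neural network in the decoder can be replaced with other structures, such as a convolutional neural network or a Transformer[39]. Alternatively, a deeper neural network can be used to learn words that are hard to express visually, such as verbs or nouns[38] (this reference is from a fairly low-tier venue, and I am worried that citing it may be a problem. The idea is still interesting, though, so we could first verify that the reference is legitimate, or check whether a top conference paper does something similar). The idea has much in common with deep neural machine translation models ({\chapterfifteen}).
+\parinterval Because the decoder outputs a sequence of natural-language words, it can be improved by taking the characteristics of language into account. For example, during decoding, prepositions and articles such as “the”, “on”, and “at” have little correlation with the image, so introducing image information at these steps has a negative effect\upcite{DBLP:conf/cvpr/LuXPS17}. A structure such as a gate can therefore be used to control how strongly the visual signal influences text generation. In addition, each generated word may correspond to a different region of the image during decoding. More effective attention mechanisms can therefore be designed to capture how much the decoder attends to different local parts of the image\upcite{DBLP:conf/cvpr/00010BT0GZ18}.
+\parinterval Besides making the generated text interact better with image features on the decoder side, there are other directions for improving the decoder. For example, the recurrent neural network in the decoder can be replaced with other structures, such as a convolutional neural network or a Transformer\upcite{DBLP:conf/cvpr/AnejaDS18}. Alternatively, a deeper neural network can be used to learn words that are hard to express visually, such as verbs or nouns\upcite{DBLP:journals/mta/FangWCT18}. The idea has much in common with deep neural machine translation models ({\chapterfifteen}).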

 %----------------------------------------------------------------------------------------
 %    NEW SUB-SECTION
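The gating idea in the decoder hunk above (limiting how much the visual signal influences words such as articles and prepositions) can be sketched as follows. This is a minimal illustration, assuming a scalar sigmoid gate computed from the decoder state; the names `visual_gate`, `gated_context`, `gate_weights`, and `gate_bias` are hypothetical, and the cited work may define its gate differently.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def visual_gate(decoder_state, gate_weights, gate_bias=0.0):
    """Scalar gate in (0, 1) computed from the current decoder state."""
    score = sum(w * s for w, s in zip(gate_weights, decoder_state)) + gate_bias
    return sigmoid(score)


def gated_context(decoder_state, visual_context, gate_weights, gate_bias=0.0):
    """Scale the visual context vector by the gate before it reaches the decoder."""
    g = visual_gate(decoder_state, gate_weights, gate_bias)
    return g, [g * v for v in visual_context]


# Toy example: the same visual context under two different decoder states.
visual_context = [0.5, -1.0, 2.0]
gate_weights = [1.0, -1.0]  # hypothetical learned parameters
g_open, ctx_open = gated_context([4.0, 0.0], visual_context, gate_weights)
g_closed, ctx_closed = gated_context([0.0, 4.0], visual_context, gate_weights)
```

With these toy parameters the first decoder state opens the gate almost fully and the second nearly closes it, so the same visual context is passed through in one case and suppressed in the other.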
@@ -408,9 +409,9 @@

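 \parinterval In computer vision, tasks such as image style transfer, semantic segmentation, and image super-resolution can all be viewed as {\small\bfnew{image-to-image translation}}\index{图像到图像的翻译}\index{Image-to-Image Translation} problems. As in machine translation, the common goal of these problems is to learn a mapping from one object to another, except that the objects here are images rather than the text of machine translation. Examples include generating a photo of a real object from its outline, or generating a night-time photo from a daytime photo. Image-to-image translation has a wide range of applications, such as image completion and style transfer.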

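-\parinterval Generative adversarial networks are widely used in image-to-image translation[53,54,55]. In fact, this class of methods is well suited to image generation tasks. Put simply, a generative adversarial network consists of two parts: a generator and a discriminator. The generator produces a result from the input, and the discriminator judges whether the generated result is the same as the real one. The idea of adversarial training is to strengthen both the generator's ability to generate and the discriminator's ability to discriminate. When the generated result can “fool” the discriminator, that is, when the discriminator can no longer tell real results from generated ones, the model is considered to have learned the mapping. In image-to-image translation, the generator produces a predicted image from the input image and the discriminator judges whether it is the target image. After many iterations, once the generated image is judged to be the target image, the model has learned the ability to “translate”. The work above is supervised, that is, it relies on datasets of aligned image pairs. Annotating such data is extremely time-consuming and labor-intensive, so much work has also been based on unsupervised methods[57,58,59], which is not covered in detail here.
+\parinterval Generative adversarial networks are widely used in image-to-image translation\upcite{DBLP:conf/nips/GoodfellowPMXWOCB14,DBLP:conf/nips/ZhuZPDEWS17,DBLP:journals/corr/abs-1908-06616}. In fact, this class of methods is well suited to image generation tasks. Put simply, a generative adversarial network consists of two parts: a generator and a discriminator. The generator produces a result from the input, and the discriminator judges whether the generated result is the same as the real one. The idea of adversarial training is to strengthen both the generator's ability to generate and the discriminator's ability to discriminate. When the generated result can “fool” the discriminator, that is, when the discriminator can no longer tell real results from generated ones, the model is considered to have learned the mapping. In image-to-image translation, the generator produces a predicted image from the input image and the discriminator judges whether it is the target image. After many iterations, once the generated image is judged to be the target image, the model has learned the ability to “translate”. The work above is supervised, that is, it relies on datasets of aligned image pairs. Annotating such data is extremely time-consuming and labor-intensive, so much work has also been based on unsupervised methods\upcite{DBLP:conf/iccv/ZhuPIE17,DBLP:conf/iccv/YiZTG17,DBLP:conf/nips/LiuBK17}, which is not covered in detail here.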

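-\parinterval {\small\bfnew{Text-to-image translation}}\index{文本到图像的翻译}\index{Text-to-Image Translation} means generating the corresponding image from a piece of natural-language text that describes details such as the color and shape of an object. The task can also be regarded as the inverse of image captioning. Most current methods are based on generative adversarial networks[61,62,63]. The basic pipeline is as follows: natural language processing techniques first extract the textual information, and the text features then serve as constraints for the subsequent image generation. Within the generative adversarial network, the generator produces an image under the constraints of the text features, and the discriminator then judges the quality of the generated image.
+\parinterval {\small\bfnew{Text-to-image translation}}\index{文本到图像的翻译}\index{Text-to-Image Translation} means generating the corresponding image from a piece of natural-language text that describes details such as the color and shape of an object. The task can also be regarded as the inverse of image captioning. Most current methods are based on generative adversarial networks\upcite{DBLP:conf/icml/ReedAYLSL16,DBLP:journals/corr/DashGALA17,DBLP:conf/nips/ReedAMTSL16}. The basic pipeline is as follows: natural language processing techniques first extract the textual information, and the text features then serve as constraints for the subsequent image generation. Within the generative adversarial network, the generator produces an image under the constraints of the text features, and the discriminator then judges the quality of the generated image.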

 %----------------------------------------------------------------------------------------
 %    NEW SECTION
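For reference, the game between the generator and the discriminator described in the hunk above is commonly written as a minimax objective. The form below is a sketch adapted from the standard generative adversarial network objective, with the generator taking an input image in place of random noise. The symbols $G$ (generator), $D$ (discriminator), $x$ (input image), and $y$ (real target image) are introduced only for this illustration and do not appear in the chapter.

```latex
\begin{eqnarray}
\min_{G}\max_{D} & & \mathbb{E}_{y}\big[\log D(y)\big] + \mathbb{E}_{x}\big[\log\big(1 - D(G(x))\big)\big]
\end{eqnarray}
```

The discriminator $D$ tries to increase this value by scoring real images high and generated images low, while the generator $G$ tries to decrease it by producing images that $D$ scores as real.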
@@ -638,11 +639,11 @@ D_i&\subseteq&\{X_{-i},Y_{-i}\} \label{eq:17-3-2}

 \section{Summary and Further Reading}

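-\parinterval This chapter gives only a brief introduction to audio processing and speech recognition. For details, readers can refer to some classic books, for example on the fundamentals of signal processing\upcite{[Discrete-Time Signal Processing (3rd version)][ Discrete-Time Speech Signal Processing: Principles and Practice]}, on traditional methods for speech recognition\upcite{[Fundamentals of Speech Recognition][ Spoken Language Processing: A Guide to Theory, Algorithm, and System Development]}, and on the latest deep learning based methods\upcite{[ Automatic Speech Recognition: A Deep Learning Approach, 俞栋、邓力]}. In addition, an important application of speech translation is machine simultaneous interpretation.
+\parinterval This chapter gives only a brief introduction to audio processing and speech recognition. For details, readers can refer to some classic books, for example on the fundamentals of signal processing\upcite{Oppenheim2001DiscretetimeSP,Quatieri2001DiscreteTimeSS}, on traditional methods for speech recognition\upcite{DBLP:books/daglib/0071550,Huang2001SpokenLP}, and on the latest deep learning based methods\upcite{benesty2008automatic}. In addition, an important application of speech translation is machine simultaneous interpretation.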

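 \parinterval For document-level translation, some studies have examined the context modeling ability of such models\upcite{DBLP:conf/discomt/KimTN19,DBLP:conf/acl/LiLWJXZLL20} and found that the BLEU gains on small datasets do not come entirely from the use of contextual information. Meanwhile, limited by data size, document-level translation models are relatively hard to train. Some researchers adjust the training strategy to help the model capture contextual information more easily\upcite{DBLP:journals/corr/abs-1903-04715,DBLP:conf/acl/SaundersSB20,DBLP:conf/mtsummit/StojanovskiF19}. Besides adjusting the training strategy, data augmentation\upcite{DBLP:conf/discomt/SugiyamaY19} and pre-training\upcite{DBLP:journals/corr/abs-1911-03110,DBLP:journals/tacl/LiuGGLEGLZ20} can also be used to alleviate data scarcity. Moreover, unlike traditional document-level translation, some dialogue translation also needs long-distance contextual information\upcite{DBLP:conf/wmt/MarufMH18}.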

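-\parinterval Recently, multimodal tasks such as multimodal machine translation, image captioning, and visual question answering[42] have attracted wide attention in the artificial intelligence community. How to fully fuse information from multiple modalities is a key question in research on multimodal tasks. Since the Transformer[43] framework was proposed in natural language processing, it has been applied to computer vision[44] and multimodal tasks[45,46,47], with notable improvements. In addition, data scarcity limits multimodal tasks, and it can be alleviated through data augmentation[48,49]. However, one question still needs to be answered: when the model is not fully trained, how much do modalities such as images actually contribute to translation? A similar issue exists in document-level machine translation, where context models contribute very little to translation when the training data is very small (cite Li Bei's ACL paper). It is therefore necessary to explore how contextual information such as images can be used more effectively. Furthermore, inspired by pre-trained models, work on joint image-text pre-training[50,51,52] has also emerged in the multimodal field. It uses the Transformer framework and self-attention to capture the hidden alignment between images and text, improving model performance while alleviating data scarcity.
+\parinterval Recently, multimodal tasks such as multimodal machine translation, image captioning, and visual question answering\upcite{DBLP:conf/iccv/AntolALMBZP15} have attracted wide attention in the artificial intelligence community. How to fully fuse information from multiple modalities is a key question in research on multimodal tasks. Since the Transformer\upcite{vaswani2017attention} framework was proposed in natural language processing, it has been applied to computer vision\upcite{DBLP:conf/eccv/CarionMSUKZ20} and multimodal tasks\upcite{DBLP:conf/acl/YaoW20,DBLP:journals/tcsv/YuLYH20,Huasong2020SelfAdaptiveNM}, with notable improvements. In addition, data scarcity limits multimodal tasks, and it can be alleviated through data augmentation\upcite{DBLP:conf/emnlp/GokhaleBBY20,DBLP:conf/eccv/Tang0ZWY20}. However, one question still needs to be answered: when the model is not fully trained, how much do modalities such as images actually contribute to translation? A similar issue exists in document-level machine translation, where context models contribute very little to translation when the training data is very small (cite Li Bei's ACL paper). It is therefore necessary to explore how contextual information such as images can be used more effectively. Furthermore, inspired by pre-trained models, work on joint image-text pre-training\upcite{DBLP:conf/eccv/Li0LZHZWH0WCG20,DBLP:conf/aaai/ZhouPZHCG20,DBLP:conf/iclr/SuZCLLWD20} has also emerged in the multimodal field. It uses the Transformer framework and self-attention to capture the hidden alignment between images and text, improving model performance while alleviating data scarcity.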




--- a/bibliography.bib
+++ b/bibliography.bib
@@ -9824,21 +9824,21 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs200111327,
+@article{DBLP:journals/corr/abs200111327,
  author    = {Idris Abdulmumin and
               Bashir Shehu Galadanci and
               Abubakar Isa},
  title     = {Iterative Batch Back-Translation for Neural Machine Translation: {A}
               Conceptual Model},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  year      = {2020}
 }
-@inproceedings{DBLP:journals/corr/abs200403672,
+@article{DBLP:journals/corr/abs200403672,
  author    = {Zi-Yi Dou and
               Antonios Anastasopoulos and
               Graham Neubig},
  title     = {Dynamic Data Selection and Weighting for Iterative Back-Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  year      = {2020}
 }
 @inproceedings{DBLP:conf/emnlp/WuZHGQLL19,
@@ -9854,15 +9854,14 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-1901-09069,
+@article{DBLP:journals/corr/abs-1901-09069,
  author    = {Felipe Almeida and
               Geraldo Xex{\'{e}}o},
  title     = {Word Embeddings: {A} Survey},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  year      = {2019}
 }
-
-@inproceedings{DBLP:journals/corr/abs-2002-06823,
+@article{DBLP:journals/corr/abs-2002-06823,
  author    = {Jinhua Zhu and
               Yingce Xia and
               Lijun Wu and
@@ -9872,7 +9871,7 @@ author    = {Zhuang Liu and
               Houqiang Li and
               Tie-Yan Liu},
  title     = {Incorporating {BERT} into Neural Machine Translation},
-  publisher   = {International Conference on Learning Representations},
+  journal   = {CoRR},
  year      = {2020}
 }
 @inproceedings{song2019mass,
@@ -9884,13 +9883,13 @@ author    = {Zhuang Liu and
  title     = {{MASS:} Masked Sequence to Sequence Pre-training for Language Generation},
  volume    = {97},
  pages     = {5926--5936},
-  publisher = {International Conference on Machine Learning},
+  publisher = {{PMLR}},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/Ruder17a,
+@article{DBLP:journals/corr/Ruder17a,
  author    = {Sebastian Ruder},
  title     = {An Overview of Multi-Task Learning in Deep Neural Networks},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1706.05098},
  year      = {2017}
 }
@@ -9904,18 +9903,7 @@ author    = {Zhuang Liu and
  title     = {Dual Supervised Learning},
  volume    = {70},
  pages     = {3789--3798},
-  publisher = {International Conference on Machine Learning},
-  year      = {2017}
-}
-@inproceedings{DBLP:conf/iccv/ZhuPIE17,
-  author    = {Jun-Yan Zhu and
-               Taesung Park and
-               Phillip Isola and
-               Alexei A. Efros},
-  title     = {Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial
-               Networks},
-  pages     = {2242--2251},
-  publisher = {{IEEE} Computer Society},
+  publisher = {{PMLR}},
  year      = {2017}
 }
 @inproceedings{DBLP:conf/nips/HeXQWYLM16,
@@ -9958,12 +9946,12 @@ author    = {Zhuang Liu and
  title     = {Analyzing Uncertainty in Neural Machine Translation},
  volume    = {80},
  pages     = {3953--3962},
-  publisher = {International Conference on Machine Learning},
+  publisher = {{PMLR}},
  year      = {2018}
 }
 @inproceedings{finding2006adafre,
  author    = {S. F. Adafre and Maarten de Rijke},
-  title     = {Finding Similar Sentences across Multiple Languages in Wikipedia},
+  title     = {Finding Similar Sentences across Multiple Languages in Wikipedia },
  publisher = {Annual Conference of the European Association for Machine Translation},
  year      = {2006}
 }
@@ -9973,12 +9961,12 @@ author    = {Zhuang Liu and
  publisher = {AAAI Conference on Artificial Intelligence},
  year      = {2008}
 }
-@inproceedings{DBLP:journals/coling/MunteanuM05,
+@article{DBLP:journals/coling/MunteanuM05,
  author    = {Dragos Stefan Munteanu and
               Daniel Marcu},
  title     = {Improving Machine Translation Performance by Exploiting Non-Parallel
               Corpora},
-  publisher   = {Computational Linguistics},
+  journal   = {Computational Linguistics},
  volume    = {31},
  number    = {4},
  pages     = {477--504},
@@ -10032,9 +10020,9 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{2015OnGulcehre,
+@article{2015OnGulcehre,
  title = {On Using Monolingual Corpora in Neural Machine Translation},
-  author = {Gulcehre Caglar  and  
+  author = { Gulcehre Caglar  and  
           Firat Orhan  and  
           Xu Kelvin  and  
           Cho Kyunghyun  and  
@@ -10042,8 +10030,8 @@ author    = {Zhuang Liu and
           Lin Huei Chi  and  
           Bougares Fethi  and  
           Schwenk Holger  and  
-           Bengio  Yoshua},
-  publisher = {Computer Science},
+           Bengio  Yoshua },
+  journal = {Computer Science},
  year = {2015},
 }
 @phdthesis{黄书剑0统计机器翻译中的词对齐研究,
@@ -10052,12 +10040,12 @@ author    = {Zhuang Liu and
  publisher={南京大学},
  year={2012}
 }
-@inproceedings{DBLP:journals/corr/MikolovLS13,
+@article{DBLP:journals/corr/MikolovLS13,
  author    = {Tomas Mikolov and
               Quoc V. Le and
               Ilya Sutskever},
  title     = {Exploiting Similarities among Languages for Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1309.4168},
  year      = {2013}
 }
@@ -10077,10 +10065,10 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{1966ASchnemann,
+@article{1966ASchnemann,
  title={A generalized solution of the orthogonal procrustes problem},
-  author={Schnemann and Peter},
-  publisher={Psychometrika},
+  author={Schnemann, Peter H. },
+  journal={Psychometrika},
  volume={31},
  number={1},
  pages={1-10},
@@ -10158,12 +10146,12 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/talip/MarieF20,
+@article{DBLP:journals/talip/MarieF20,
  author    = {Benjamin Marie and
               Atsushi Fujita},
  title     = {Iterative Training of Unsupervised Neural and Statistical Machine
               Translation Systems},
-  publisher   = {ACM Transactions on Asian and Low-Resource Language Information Processing},
+  journal   = {{ACM} Trans. Asian Low Resour. Lang. Inf. Process.},
  volume    = {19},
  number    = {5},
  pages     = {68:1--68:21},
@@ -10197,7 +10185,7 @@ author    = {Zhuang Liu and
  pages     = {7057--7067},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/ipm/FarhanTAJATT20,
+@article{DBLP:journals/ipm/FarhanTAJATT20,
  author    = {Wael Farhan and
               Bashar Talafha and
               Analle Abuammar and
@@ -10206,13 +10194,13 @@ author    = {Zhuang Liu and
               Ahmad Bisher Tarakji and
               Anas Toma},
  title     = {Unsupervised dialectal neural machine translation},
-  publisher   = {Information Processing \& Management},
+  journal   = {Information Processing \& Management},
  volume    = {57},
  number    = {3},
  pages     = {102181},
  year      = {2020}
 }
-@inproceedings{A2020Li,
+@article{A2020Li,
  title={A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction},
  author={Yanyang Li and Yingfeng Luo and Ye Lin and Quan Du and Huizhen Wang and Shujian Huang and Tong Xiao and Jingbo Zhu},
  publisher={International Conference on Computational Linguistics},
@@ -10257,8 +10245,7 @@ author    = {Zhuang Liu and
  publisher = {AAAI Conference on Artificial Intelligence},
  year      = {2020}
 }
-
-@inproceedings{DBLP:journals/corr/abs-2001-08210,
+@article{DBLP:journals/corr/abs-2001-08210,
  author    = {Yinhan Liu and
               Jiatao Gu and
               Naman Goyal and
@@ -10268,13 +10255,10 @@ author    = {Zhuang Liu and
               Mike Lewis and
               Luke Zettlemoyer},
  title     = {Multilingual Denoising Pre-training for Neural Machine Translation},
-  publisher   = {Transactions of the Association for Computational Linguistics},
-  volume    = {8},
-  pages     = {726--742},
+  journal   = {CoRR},
+  volume    = {abs/2001.08210},
  year      = {2020}
 }
-
-
 @inproceedings{DBLP:conf/aaai/JiZDZCL20,
  author    = {Baijun Ji and
               Zhirui Zhang and
@@ -10303,25 +10287,25 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2020}
 }
-@inproceedings{DBLP:journals/corr/abs-2009-08088,
+@article{DBLP:journals/corr/abs-2009-08088,
  author    = {Zhen Yang and
               Bojie Hu and
               Ambyera Han and
               Shen Huang and
               Qi Ju},
-  title     = {{CSP:} Code-Switching Pre-training for Neural Machine Translation},
-  pages     = {2624--2636},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
+  title     = {Code-switching pre-training for neural machine translation},
+  journal   = {CoRR},
+  volume    = {abs/2009.08088},
  year      = {2020}
 }
-@inproceedings{DBLP:journals/corr/abs-2010-09403,
+@article{DBLP:journals/corr/abs-2010-09403,
  author    = {Dusan Varis and
               Ondrej Bojar},
  title     = {Unsupervised Pretraining for Neural Machine Translation Using Elastic
               Weight Consolidation},
-  pages     = {130--135},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
+  journal   = {CoRR},
+  volume    = {abs/2010.09403},
+  year      = {2020}
 }
 @inproceedings{DBLP:conf/emnlp/LampleOCDR18,
  author    = {Guillaume Lample and
@@ -10334,11 +10318,11 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2018}
 }
-@inproceedings{DBLP:journals/jbd/ShortenK19,
+@article{DBLP:journals/jbd/ShortenK19,
  author    = {Connor Shorten and
               Taghi M. Khoshgoftaar},
  title     = {A survey on Image Data Augmentation for Deep Learning},
-  publisher   = {Journal of Big Data},
+  journal   = {J. Big Data},
  volume    = {6},
  pages     = {60},
  year      = {2019}
@@ -10361,13 +10345,13 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-1811-01124,
+@article{DBLP:journals/corr/abs-1811-01124,
  author    = {Jean Alaux and
               Edouard Grave and
               Marco Cuturi and
               Armand Joulin},
  title     = {Unsupervised Hyperalignment for Multilingual Word Embeddings},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1811.01124},
  year      = {2018}
 }
@@ -10466,10 +10450,9 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2020}
 }
-@inproceedings{hartmann2018empirical,
+@article{hartmann2018empirical,
  title={Empirical observations on the instability of aligning word vector spaces with GANs},
  author={Hartmann, Mareike and Kementchedjhieva, Yova and S{\o}gaard, Anders},
-  publisher = {openreview.net},
  year={2018}
 }
 @inproceedings{DBLP:conf/emnlp/Kementchedjhieva19,
@@ -10532,10 +10515,9 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{2019ADabre,
+@article{2019ADabre,
  title={A Survey of Multilingual Neural Machine Translation},
  author={Dabre, Raj  and  Chu, Chenhui  and  Kunchukuttan, Anoop },
-  publisher={ACM Computing Surveys},
  year={2019},
 }
 @inproceedings{DBLP:conf/naacl/ZophK16,
@@ -10568,17 +10550,17 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:journals/mt/WuW07,
+@article{DBLP:journals/mt/WuW07,
  author    = {Hua Wu and
               Haifeng Wang},
  title     = {Pivot language approach for phrase-based statistical machine translation},
-  publisher   = {Machine Translation},
+  journal   = {Mach. Transl.},
  volume    = {21},
  number    = {3},
  pages     = {165--181},
  year      = {2007}
 }
-@inproceedings{Farsi2010somayeh,
+@article{Farsi2010somayeh,
  author    = {Somayeh Bakhshaei and Shahram Khadivi and Noushin Riahi },
  title     = {Farsi-german statistical machine translation through bridge language},
  publisher   = {International Telecommunications Symposium},
@@ -10605,7 +10587,7 @@ author    = {Zhuang Liu and
  title     = {Improving Pivot-Based Statistical Machine Translation by Pivoting
               the Co-occurrence Count of Phrase Pairs},
  pages     = {1665--1675},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  publisher = {{ACL}},
  year      = {2014}
 }
 @inproceedings{DBLP:conf/acl/MiuraNSTN15,
@@ -10635,14 +10617,14 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2009}
 }
-@inproceedings{DBLP:journals/corr/ChengLYSX16,
+@article{DBLP:journals/corr/ChengLYSX16,
  author    = {Yong Cheng and
               Yang Liu and
               Qian Yang and
               Maosong Sun and
               Wei Xu},
  title     = {Neural Machine Translation with Pivot Languages},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1611.04928},
  year      = {2016}
 }
@@ -10658,7 +10640,7 @@ author    = {Zhuang Liu and
 @inproceedings{de2006catalan,
  title={Catalan-English statistical machine translation without parallel corpus: bridging through Spanish},
  author={De Gispert, Adri{\`a} and Marino, Jose B},
-  publisher={International Conference on Language Resources and Evaluation},
+  booktitle={Proc. of 5th International Conference on Language Resources and Evaluation (LREC)},
  pages={65--68},
  year={2006}
 }
@@ -10680,28 +10662,21 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2011}
 }
-@inproceedings{DBLP:journals/corr/HintonVD15,
+@article{DBLP:journals/corr/HintonVD15,
  author    = {Geoffrey E. Hinton and
               Oriol Vinyals and
               Jeffrey Dean},
  title     = {Distilling the Knowledge in a Neural Network},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1503.02531},
  year      = {2015}
 }
-
-@inproceedings{gu2018meta,
-  author    = {Jiatao Gu and
-               Yong Wang and
-               Yun Chen and
-               Victor O. K. Li and
-               Kyunghyun Cho},
-  title     = {Meta-Learning for Low-Resource Neural Machine Translation},
-  pages     = {3622--3631},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2018}
+@article{gu2018meta,
+  title={Meta-learning for low-resource neural machine translation},
+  author={Gu, Jiatao and Wang, Yong and Chen, Yun and Cho, Kyunghyun and Li, Victor OK},
+  journal={arXiv preprint arXiv:1808.08437},
+  year={2018}
 }
-
 @inproceedings{DBLP:conf/naacl/GuHDL18,
  author    = {Jiatao Gu and
               Hany Hassan and
@@ -10743,11 +10718,11 @@ author    = {Zhuang Liu and
  publisher = {European Language Resources Association},
  year      = {2018}
 }
-@inproceedings{DBLP:journals/tkde/PanY10,
+@article{DBLP:journals/tkde/PanY10,
  author    = {Sinno Jialin Pan and
               Qiang Yang},
  title     = {A Survey on Transfer Learning},
-  publisher   = {IEEE Transactions on knowledge and data engineering},
+  journal   = {IEEE Transactions on knowledge and data engineering},
  volume    = {22},
  number    = {10},
  pages     = {1345--1359},
@@ -10755,14 +10730,14 @@ author    = {Zhuang Liu and
 }
 @book{2009Handbook,
  title={Handbook Of Research On Machine Learning Applications and Trends: Algorithms, Methods and Techniques - 2 Volumes},
-  author={Olivas, Emilio Soria  and  Guerrero, Jose David Martin  and  Sober, Marcelino Martinez  and  Benedito, Jose Rafael Magdalena  and  Lopez, Antonio Jose Serrano },
+  author={ Olivas, Emilio Soria  and  Guerrero, Jose David Martin  and  Sober, Marcelino Martinez  and  Benedito, Jose Rafael Magdalena  and  Lopez, Antonio Jose Serrano },
  publisher={Information Science Reference - Imprint of: IGI Publishing},
  year={2009},
 }
 @incollection{DBLP:books/crc/aggarwal14/Pan14,
  author    = {Sinno Jialin Pan},
  title     = {Transfer Learning},
-  publisher = {Data Classification: Algorithms and Applications},
+  booktitle = {Data Classification: Algorithms and Applications},
  pages     = {537--570},
  publisher = {{CRC} Press},
  year      = {2014}
@@ -10778,22 +10753,16 @@ author    = {Zhuang Liu and
  publisher = {OpenReview.net},
  year      = {2019}
 }
-
-
-@inproceedings{platanios2018contextual,
-  author    = {Emmanouil Antonios Platanios and
-               Mrinmaya Sachan and
-               Graham Neubig and
-               Tom M. Mitchell},
-  title     = {Contextual Parameter Generation for Universal Neural Machine Translation},
-  pages     = {425--435},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2018}
+@article{platanios2018contextual,
+  title={Contextual parameter generation for universal neural machine translation},
+  author={Platanios, Emmanouil Antonios and Sachan, Mrinmaya and Neubig, Graham and Mitchell, Tom},
+  journal={arXiv preprint arXiv:1808.08493},
+  year={2018}
 }
 @inproceedings{ji2020cross,
  title={Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation},
  author={Ji, Baijun and Zhang, Zhirui and Duan, Xiangyu and Zhang, Min and Chen, Boxing and Luo, Weihua},
-  publisher={Proceedings of the AAAI Conference on Artificial Intelligence},
+  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={01},
  pages={115--122},
@@ -10829,16 +10798,16 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2009}
 }
-@inproceedings{dabre2019brief,
+@article{dabre2019brief,
  title={A Brief Survey of Multilingual Neural Machine Translation},
  author={Dabre, Raj and Chu, Chenhui and Kunchukuttan, Anoop},
-  publisher={arXiv preprint arXiv:1905.05395},
+  journal={arXiv preprint arXiv:1905.05395},
  year={2019}
 }
-@inproceedings{dabre2020survey,
+@article{dabre2020survey,
  title={A survey of multilingual neural machine translation},
  author={Dabre, Raj and Chu, Chenhui and Kunchukuttan, Anoop},
-  publisher={ACM Computing Surveys},
+  journal={ACM Computing Surveys (CSUR)},
  volume={53},
  number={5},
  pages={1--38},
@@ -10874,13 +10843,13 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2018}
 }
-@inproceedings{DBLP:journals/tacl/LeeCH17,
+@article{DBLP:journals/tacl/LeeCH17,
  author    = {Jason Lee and
               Kyunghyun Cho and
               Thomas Hofmann},
  title     = {Fully Character-Level Neural Machine Translation without Explicit
               Segmentation},
-  publisher   = {Transactions of the Association for Computational Linguistics},
+  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {5},
  pages     = {365--378},
  year      = {2017}
@@ -10895,13 +10864,13 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2016}
 }
-@inproceedings{DBLP:journals/corr/HaNW16,
+@article{DBLP:journals/corr/HaNW16,
  author    = {Thanh-Le Ha and
               Jan Niehues and
               Alexander H. Waibel},
  title     = {Toward Multilingual Neural Machine Translation with Universal Encoder
               and Decoder},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1611.04798},
  year      = {2016}
 }
@@ -10968,7 +10937,7 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-1903-07091,
+@article{DBLP:journals/corr/abs-1903-07091,
  author    = {Naveen Arivazhagan and
               Ankur Bapna and
               Orhan Firat and
@@ -10976,7 +10945,7 @@ author    = {Zhuang Liu and
               Melvin Johnson and
               Wolfgang Macherey},
  title     = {The Missing Ingredient in Zero-Shot Neural Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1903.07091},
  year      = {2019}
 }
@@ -10988,27 +10957,19 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2019}
 }
-
-
-@inproceedings{firat2016zero,
-  author    = {Orhan Firat and
-               Baskaran Sankaran and
-               Yaser Al-Onaizan and
-               Fatos T. Yarman-Vural and
-               Kyunghyun Cho},
-  title     = {Zero-Resource Translation with Multi-Lingual Neural Machine Translation},
-  pages     = {268--277},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2016}
+@article{firat2016zero,
+  title={Zero-resource translation with multi-lingual neural machine translation},
+  author={Firat, Orhan and Sankaran, Baskaran and Al-Onaizan, Yaser and Vural, Fatos T Yarman and Cho, Kyunghyun},
+  journal={arXiv preprint arXiv:1606.04164},
+  year={2016}
 }
-
-@inproceedings{DBLP:journals/corr/abs-1805-10338,
+@article{DBLP:journals/corr/abs-1805-10338,
  author    = {Lierni Sestorain and
               Massimiliano Ciaramita and
               Christian Buck and
               Thomas Hofmann},
  title     = {Zero-Shot Dual Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1805.10338},
  year      = {2018}
 }
@@ -11088,7 +11049,7 @@ author    = {Zhuang Liu and
               Yoshua Bengio and
               Pierre-Antoine Manzagol},
  title     = {Extracting and composing robust features with denoising autoencoders},
-  series    = {International Conference on Learning Representations},
+  series    = {{ACM} International Conference Proceeding Series},
  volume    = {307},
  pages     = {1096--1103},
  publisher = {International Conference on Machine Learning}
@@ -11102,20 +11063,20 @@ author    = {Zhuang Liu and
  publisher = {International Conference on Learning Representations},
  year      = {2018}
 }
-@inproceedings{DBLP:journals/coling/BhagatH13,
+@article{DBLP:journals/coling/BhagatH13,
  author    = {Rahul Bhagat and
               Eduard H. Hovy},
  title     = {What Is a Paraphrase?},
-  publisher   = {Computational Linguistics},
+  journal   = {Computational Linguistics},
  volume    = {39},
  number    = {3},
  pages     = {463--472},
  year      = {2013}
 }
-@inproceedings{2010Generating,
+@article{2010Generating,
  title={Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods},
  author={Madnani, Nitin and Dorr, Bonnie J.},
-  publisher={Computational Linguistics},
+  journal={Computational Linguistics},
  volume={36},
  number={3},
  pages={341--387},
@@ -11148,10 +11109,10 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the European Association for Machine Translation},
  year      = {2017}
 }
-@inproceedings{2005Improving,
+@article{2005Improving,
  title={Improving Machine Translation Performance by Exploiting Non-Parallel Corpora},
  author={Munteanu, Dragos Stefan and Marcu, Daniel},
-  publisher={Computational Linguistics},
+  journal={Computational Linguistics},
  volume={31},
  number={4},
  pages={477--504},
@@ -11167,12 +11128,12 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2010}
 }
-@inproceedings{DBLP:journals/jair/RuderVS19,
+@article{DBLP:journals/jair/RuderVS19,
  author    = {Sebastian Ruder and
               Ivan Vulic and
               Anders S{\o}gaard},
  title     = {A Survey of Cross-lingual Word Embedding Models},
-  publisher   = {Journal of Artificial Intelligence Research},
+  journal   = {J. Artif. Intell. Res.},
  volume    = {65},
  pages     = {569--631},
  year      = {2019}
@@ -11187,14 +11148,14 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2016}
 }
-@inproceedings{DBLP:journals/tacl/TuLLLL17,
+@article{DBLP:journals/tacl/TuLLLL17,
  author    = {Zhaopeng Tu and
               Yang Liu and
               Zhengdong Lu and
               Xiaohua Liu and
               Hang Li},
  title     = {Context Gates for Neural Machine Translation},
-  publisher   = {Annual Meeting of the Association for Computational Linguistics},
+  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {5},
  pages     = {87--99},
  year      = {2017}
@@ -11214,21 +11175,12 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-
-
-@inproceedings{ng2019facebook,
-  author    = {Nathan Ng and
-               Kyra Yee and
-               Alexei Baevski and
-               Myle Ott and
-               Michael Auli and
-               Sergey Edunov},
-  title     = {Facebook FAIR's {WMT19} News Translation Task Submission},
-  pages     = {314--319},
-  publisher = {Association for Computational Linguistics},
-  year      = {2019}
+@article{ng2019facebook,
+  title={Facebook FAIR's WMT19 News Translation Task Submission},
+  author={Ng, Nathan and Yee, Kyra and Baevski, Alexei and Ott, Myle and Auli, Michael and Edunov, Sergey},
+  journal={arXiv preprint arXiv:1907.06616},
+  year={2019}
 }
-
 @inproceedings{DBLP:conf/wmt/WangLLJZLLXZ18,
  author    = {Qiang Wang and
               Bei Li and
@@ -11275,9 +11227,7 @@ author    = {Zhuang Liu and
  publisher = {Conference and Workshop on Neural Information Processing Systems},
  year      = {2015}
 }
-
-
-@inproceedings{DBLP:journals/corr/abs-1802-05365,
+@article{DBLP:journals/corr/abs-1802-05365,
  author    = {Matthew E. Peters and
               Mark Neumann and
               Mohit Iyyer and
@@ -11285,12 +11235,11 @@ author    = {Zhuang Liu and
               Christopher Clark and
               Kenton Lee and
               Luke Zettlemoyer},
-  title     = {Deep Contextualized Word Representations},
-  pages     = {2227--2237},
-  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
+  title     = {Deep contextualized word representations},
+  journal   = {CoRR},
+  volume    = {abs/1802.05365},
  year      = {2018}
 }
-
 @inproceedings{DBLP:conf/icml/CollobertW08,
  author    = {Ronan Collobert and
               Jason Weston},
@@ -11347,13 +11296,13 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-1908-06259,
+@article{DBLP:journals/corr/abs-1908-06259,
  author    = {Tianyu He and
               Xu Tan and
               Tao Qin},
  title     = {Hard but Robust, Easy but Sensitive: How Encoder and Decoder Perform
               in Neural Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1908.06259},
  year      = {2019}
 }
@@ -11378,18 +11327,12 @@ author    = {Zhuang Liu and
  publisher = {Springer},
  year      = {1998}
 }
-
-@inproceedings{liu2019multi,
-  author    = {Xiaodong Liu and
-               Pengcheng He and
-               Weizhu Chen and
-               Jianfeng Gao},
-  title     = {Multi-Task Deep Neural Networks for Natural Language Understanding},
-  pages     = {4487--4496},
-  publisher = {Annual Meeting of the Association for Computational Linguistics},
-  year      = {2019}
+@article{liu2019multi,
+  title={Multi-task deep neural networks for natural language understanding},
+  author={Liu, Xiaodong and He, Pengcheng and Chen, Weizhu and Gao, Jianfeng},
+  journal={arXiv preprint arXiv:1901.11504},
+  year={2019}
 }
-
 @inproceedings{DBLP:journals/corr/LuongLSVK15,
  author    = {Minh-Thang Luong and
               Quoc V. Le and
@@ -11408,7 +11351,7 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2016}
 }
-@inproceedings{DBLP:journals/tacl/JohnsonSLKWCTVW17,
+@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
  author    = {Melvin Johnson and
               Mike Schuster and
               Quoc V. Le and
@@ -11423,19 +11366,19 @@ author    = {Zhuang Liu and
               Jeffrey Dean},
  title     = {Google's Multilingual Neural Machine Translation System: Enabling
               Zero-Shot Translation},
-  publisher   = {Transactions of the Association for Computational Linguistics},
+  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {5},
  pages     = {339--351},
  year      = {2017}
 }
-@inproceedings{DBLP:journals/csl/GulcehreFXCB17,
+@article{DBLP:journals/csl/GulcehreFXCB17,
  author    = {{\c{C}}aglar G{\"{u}}l{\c{c}}ehre and
               Orhan Firat and
               Kelvin Xu and
               Kyunghyun Cho and
               Yoshua Bengio},
  title     = {On integrating a language model into neural machine translation},
-  publisher   = {Computational Linguistics},
+  journal   = {Computer Speech {\&} Language},
  volume    = {45},
  pages     = {137--148},
  year      = {2017}
@@ -11503,10 +11446,10 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2013}
 }
-@inproceedings{imamura2016multi,
+@article{imamura2016multi,
  title={Multi-domain adaptation for statistical machine translation based on feature augmentation},
  author={Imamura, Kenji and Sumita, Eiichiro},
-  publisher={Association for Machine Translation in the Americas},
+  journal={Association for Machine Translation in the Americas},
  pages={79},
  year={2016}
 }
@@ -11529,10 +11472,10 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2010}
 }
-@inproceedings{shah2012general,
+@article{shah2012general,
  title={A general framework to weight heterogeneous parallel data for model adaptation in statistical machine translation},
  author={Shah, Kashif and Barrault, Lo{\"{\i}}c and Schwenk, Holger},
-  publisher={Machine Translation Summit},
+  journal={Machine Translation Summit},
  year={2012}
 }
 @inproceedings{DBLP:conf/iwslt/MansourN12,
@@ -11588,17 +11531,17 @@ author    = {Zhuang Liu and
  publisher = {International Conference on Computational Linguistics},
  year      = {2014}
 }
-@inproceedings{joty2015using,
+@article{joty2015using,
  title={Using joint models for domain adaptation in statistical machine translation},
  author={Durrani, Nadir and Sajjad, Hassan and Joty, Shafiq and Abdelali, Ahmed and Vogel, Stephan},
-  publisher={Proceedings of MT Summit XV},
+  journal={Proceedings of MT Summit XV},
  pages={117},
  year={2015}
 }
 @inproceedings{chen2016bilingual,
  title={Bilingual methods for adaptive training data selection for machine translation},
  author={Chen, Boxing and Kuhn, Roland and Foster, George and Cherry, Colin and Huang, Fei},
-  publisher={Association for Machine Translation in the Americas},
+  booktitle={Association for Machine Translation in the Americas},
  pages={93--103},
  year={2016}
 }
@@ -11669,7 +11612,7 @@ author    = {Zhuang Liu and
  publisher={International Workshop on Spoken Language Translation},
  year={2011}
 }
-@inproceedings{moore2010intelligent,
+@article{moore2010intelligent,
  title = {Intelligent selection of language model training data},
  author = {Moore, Robert C and Lewis, Will},
  publisher = {Annual Meeting of the Association for Computational Linguistics},
@@ -11716,16 +11659,16 @@ author    = {Zhuang Liu and
  publisher = {International Conference on Computational Linguistics},
  year      = {2016}
 }
-@inproceedings{chu2015integrated,
+@article{chu2015integrated,
  title={Integrated parallel data extraction from comparable corpora for statistical machine translation},
  author={Chu, Chenhui},
  year={2015},
  publisher={Kyoto University}
 }
-@inproceedings{DBLP:journals/tit/Scudder65a,
+@article{DBLP:journals/tit/Scudder65a,
  author    = {H. J. Scudder III},
  title     = {Probability of error of some adaptive pattern-recognition machines},
-  publisher   = {{IEEE} Transactions on Information Theory},
+  journal   = {{IEEE} Transactions on Information Theory},
  volume    = {11},
  number    = {3},
  pages     = {363--371},
@@ -11739,14 +11682,14 @@ author    = {Zhuang Liu and
  publisher = {International Conference on Computational Linguistics},
  year      = {2018}
 }
-@inproceedings{DBLP:journals/corr/abs-1708-08712,
+@article{DBLP:journals/corr/abs-1708-08712,
  author    = {Hassan Sajjad and
               Nadir Durrani and
               Fahim Dalvi and
               Yonatan Belinkov and
               Stephan Vogel},
  title     = {Neural Machine Translation Training in a Multi-Domain Scenario},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1708.08712},
  year      = {2017}
 }
@@ -11813,7 +11756,7 @@ author    = {Zhuang Liu and
 @inproceedings{britz2017effective,
  title={Effective domain mixing for neural machine translation},
  author={Britz, Denny and Le, Quoc and Pryzant, Reid},
-  publisher={Proceedings of the Second Conference on Machine Translation},
+  booktitle={Proceedings of the Second Conference on Machine Translation},
  pages={118--126},
  year={2017}
 }
@@ -11848,21 +11791,21 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:journals/corr/abs-1906-03129,
+@article{DBLP:journals/corr/abs-1906-03129,
  author    = {Shen Yan and
               Leonard Dahlmann and
               Pavel Petrushkov and
               Sanjika Hewavitharana and
               Shahram Khadivi},
  title     = {Word-based Domain Adaptation for Neural Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1906.03129},
  year      = {2019}
 }
-@inproceedings{dakwale2017finetuning,
+@article{dakwale2017finetuning,
  title={Finetuning for neural machine translation with limited degradation across in-and out-of-domain data},
  author={Dakwale, Praveen and Monz, Christof},
-  publisher={Proceedings of the XVI Machine Translation Summit},
+  journal={Proceedings of the XVI Machine Translation Summit},
  volume={117},
  year={2017}
 }
@@ -11879,19 +11822,12 @@ author    = {Zhuang Liu and
  publisher = {Conference on Empirical Methods in Natural Language Processing},
  year      = {2019}
 }
-
-
-@inproceedings{barone2017regularization,
-  author    = {Antonio Valerio Miceli Barone and
-               Barry Haddow and
-               Ulrich Germann and
-               Rico Sennrich},
-  title     = {Regularization techniques for fine-tuning in neural machine translation},
-  pages     = {1489--1494},
-  publisher = {Conference on Empirical Methods in Natural Language Processing},
-  year      = {2017}
+@article{barone2017regularization,
+  title={Regularization techniques for fine-tuning in neural machine translation},
+  author={Barone, Antonio Valerio Miceli and Haddow, Barry and Germann, Ulrich and Sennrich, Rico},
+  journal={arXiv preprint arXiv:1707.09920},
+  year={2017}
 }
-
 @inproceedings{DBLP:conf/acl/SaundersB20,
  author    = {Danielle Saunders and
               Bill Byrne},
@@ -11904,7 +11840,7 @@ author    = {Zhuang Liu and
 @inproceedings{khayrallah2017neural,
  title={Neural lattice search for domain adaptation in machine translation},
  author={Khayrallah, Huda and Kumar, Gaurav and Duh, Kevin and Post, Matt and Koehn, Philipp},
-  publisher={Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
+  booktitle={Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  pages={20--25},
  year={2017}
 }
@@ -11918,11 +11854,11 @@ author    = {Zhuang Liu and
  publisher = {Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/FreitagA16,
+@article{DBLP:journals/corr/FreitagA16,
  author    = {Markus Freitag and
               Yaser Al-Onaizan},
  title     = {Fast Domain Adaptation for Neural Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1612.06897},
  year      = {2016}
 }
@@ -11945,10 +11881,10 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2017}
 }
-@inproceedings{DBLP:journals/ibmrd/Luhn58,
+@article{DBLP:journals/ibmrd/Luhn58,
  author    = {Hans Peter Luhn},
  title     = {The Automatic Creation of Literature Abstracts},
-  publisher   = {IBM Journal of research and development},
+  journal   = {{IBM} J. Res. Dev.},
  volume    = {2},
  number    = {2},
  pages     = {159--165},
@@ -12011,7 +11947,7 @@ author    = {Zhuang Liu and
  publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-2010-11125,
+@article{DBLP:journals/corr/abs-2010-11125,
  author    = {Angela Fan and
               Shruti Bhosale and
               Holger Schwenk and
@@ -12030,7 +11966,7 @@ author    = {Zhuang Liu and
               Michael Auli and
               Armand Joulin},
  title     = {Beyond English-Centric Multilingual Machine Translation},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/2010.11125},
  year      = {2020}
 }
@@ -12102,13 +12038,13 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/ejasmp/RadzikowskiNWY19,
+@article{DBLP:journals/ejasmp/RadzikowskiNWY19,
  author    = {Kacper Radzikowski and
               Robert Nowak and
               Le Wang and
               Osamu Yoshie},
  title     = {Dual supervised learning for non-native speech recognition},
-  publisher   = {EURASIP Journal on Audio, Speech, and Music Processing},
+  journal   = {{EURASIP} J. Audio Speech Music. Process.},
  volume    = {2019},
  pages     = {3},
  year      = {2019}
@@ -12130,13 +12066,13 @@ author    = {Zhuang Liu and
  publisher = {{IEEE} Computer Society},
  year      = {2017}
 }
-@inproceedings{DBLP:journals/access/DuRZH20,
+@article{DBLP:journals/access/DuRZH20,
  author    = {Liang Du and
               Xin Ren and
               Peng Zhou and
               Zhiguo Hu},
  title     = {Unsupervised Dual Learning for Feature and Instance Selection},
-  publisher   = {{IEEE} Access},
+  journal   = {{IEEE} Access},
  volume    = {8},
  pages     = {170248--170260},
  year      = {2020}
@@ -12150,7 +12086,6 @@ author    = {Zhuang Liu and
  publisher = {Annual Meeting of the Association for Computational Linguistics},
  year      = {2020}
 }
-
 @inproceedings{DBLP:conf/nips/YangDYCSL19,
  author    = {Zhilin Yang and
               Zihang Dai and
@@ -12159,14 +12094,13 @@ author    = {Zhuang Liu and
               Ruslan Salakhutdinov and
               Quoc V. Le},
  title     = {XLNet: Generalized Autoregressive Pretraining for Language Understanding},
-  publisher = {Annual Conference on Neural Information Processing Systems},
  pages     = {5754--5764},
  year      = {2019}
 }
-@inproceedings{lewis2019bart,
+@article{lewis2019bart,
  title={Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension},
  author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Ves and Zettlemoyer, Luke},
-  publisher={arXiv preprint arXiv:1910.13461},
+  journal={arXiv preprint arXiv:1910.13461},
  year={2019}
 }
 @inproceedings{DBLP:conf/iclr/LanCGGSS20,
@@ -12218,7 +12152,7 @@ author    = {Zhuang Liu and
  publisher = {International Conference on Computer Vision},
  year      = {2019}
 }
-@inproceedings{DBLP:journals/corr/abs-2010-12831,
+@article{DBLP:journals/corr/abs-2010-12831,
  author    = {Liunian Harold Li and
               Haoxuan You and
               Zhecan Wang and
@@ -12227,7 +12161,7 @@ author    = {Zhuang Liu and
               Kai-Wei Chang},
  title     = {Weakly-supervised VisualBERT: Pre-training without Parallel Images
               and Captions},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/2010.12831},
  year      = {2020}
 }
@@ -12277,18 +12211,18 @@ author    = {Zhuang Liu and
 @inproceedings{shen2020q,
  title={Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT.},
  author={Shen, Sheng and Dong, Zhen and Ye, Jiayu and Ma, Linjian and Yao, Zhewei and Gholami, Amir and Mahoney, Michael W and Keutzer, Kurt},
-  publisher={AAAI Conference on Artificial Intelligence},
+  booktitle={AAAI Conference on Artificial Intelligence},
  pages={8815--8821},
  year={2020}
 }
-@inproceedings{DBLP:journals/corr/abs-1910-01108,
+@article{DBLP:journals/corr/abs-1910-01108,
  author    = {Victor Sanh and
               Lysandre Debut and
               Julien Chaumond and
               Thomas Wolf},
  title     = {DistilBERT, a distilled version of {BERT:} smaller, faster, cheaper
               and lighter},
-  publisher   = {CoRR},
+  journal   = {CoRR},
  volume    = {abs/1910.01108},
  year      = {2019}
 }
@@ -13248,6 +13182,728 @@ author    = {Zhuang Liu and
  publisher={电子工业出版社},
  year={2020}
 }
+%%%%%%%%%%%%%%%%%王屹超部分，孟霞加%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+@inproceedings{DBLP:conf/mm/LinMSYYGZL20,
+  author    = {Huan Lin and
+               Fandong Meng and
+               Jinsong Su and
+               Yongjing Yin and
+               Zhengyuan Yang and
+               Yubin Ge and
+               Jie Zhou and
+               Jiebo Luo},
+  title     = {Dynamic Context-guided Capsule Network for Multimodal Machine Translation},
+  pages     = {1320--1329},
+  publisher = {ACM Multimedia},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/wmt/SpeciaFSE16,
+  author    = {Lucia Specia and
+               Stella Frank and
+               Khalil Sima'an and
+               Desmond Elliott},
+  title     = {A Shared Task on Multimodal Machine Translation and Crosslingual Image
+               Description},
+  pages     = {543--553},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/wmt/ElliottFBBS17,
+  author    = {Desmond Elliott and
+               Stella Frank and
+               Lo{\"{\i}}c Barrault and
+               Fethi Bougares and
+               Lucia Specia},
+  title     = {Findings of the Second Shared Task on Multimodal Machine Translation
+               and Multilingual Image Description},
+  pages     = {215--233},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/wmt/BarraultBSLEF18,
+  author    = {Lo{\"{\i}}c Barrault and
+               Fethi Bougares and
+               Lucia Specia and
+               Chiraag Lala and
+               Desmond Elliott and
+               Stella Frank},
+  title     = {Findings of the Third Shared Task on Multimodal Machine Translation},
+  pages     = {304--323},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/wmt/CaglayanABGBBMH17,
+  author    = {Ozan Caglayan and
+               Walid Aransa and
+               Adrien Bardet and
+               Mercedes Garc{\'{\i}}a-Mart{\'{\i}}nez and
+               Fethi Bougares and
+               Lo{\"{\i}}c Barrault and
+               Marc Masana and
+               Luis Herranz and
+               Joost van de Weijer},
+  title     = {{LIUM-CVC} Submissions for {WMT17} Multimodal Translation Task},
+  pages     = {432--439},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/wmt/LibovickyHTBP16,
+  author    = {Jindrich Libovick{\'{y}} and
+               Jindrich Helcl and
+               Marek Tlust{\'{y}} and
+               Ondrej Bojar and
+               Pavel Pecina},
+  title     = {{CUNI} System for {WMT16} Automatic Post-Editing and Multimodal Translation
+               Tasks},
+  pages     = {646--654},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/emnlp/CalixtoL17,
+  author    = {Iacer Calixto and
+               Qun Liu},
+  title     = {Incorporating Global Visual Features into Attention-based Neural Machine
+               Translation},
+  pages     = {992--1003},
+  publisher = {Conference on Empirical Methods in Natural Language Processing},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/wmt/HuangLSOD16,
+  author    = {Po-Yao Huang and
+               Frederick Liu and
+               Sz-Rung Shiang and
+               Jean Oh and
+               Chris Dyer},
+  title     = {Attention-based Multimodal Neural Machine Translation},
+  pages     = {639--645},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2016}
+}
+
+@article{Elliott2015MultilingualID,
+  title={Multilingual Image Description with Neural Sequence Models},
+  author={Desmond Elliott and
+          Stella Frank and
+          Eva Hasler},
+  journal={arXiv: Computation and Language},
+  year={2015}
+}
+
+@inproceedings{DBLP:conf/wmt/MadhyasthaWS17,
+  author    = {Pranava Swaroop Madhyastha and
+               Josiah Wang and
+               Lucia Specia},
+  title     = {Sheffield MultiMT: Using Object Posterior Predictions for Multimodal
+               Machine Translation},
+  pages     = {470--476},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/CaglayanBB16,
+  author    = {Ozan Caglayan and
+               Lo{\"{\i}}c Barrault and
+               Fethi Bougares},
+  title     = {Multimodal Attention for Neural Machine Translation},
+  journal   = {CoRR},
+  volume    = {abs/1609.03976},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/acl/CalixtoLC17,
+  author    = {Iacer Calixto and
+               Qun Liu and
+               Nick Campbell},
+  title     = {Doubly-Attentive Decoder for Multi-modal Neural Machine Translation},
+  pages     = {1913--1924},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/DelbrouckD17,
+  author    = {Jean-Benoit Delbrouck and
+               St{\'{e}}phane Dupont},
+  title     = {Multimodal Compact Bilinear Pooling for Multimodal Neural Machine
+               Translation},
+  journal   = {CoRR},
+  volume    = {abs/1703.08084},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/acl/LibovickyH17,
+  author    = {Jindrich Libovick{\'{y}} and
+               Jindrich Helcl},
+  title     = {Attention Strategies for Multi-Source Sequence-to-Sequence Learning},
+  pages     = {196--202},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/abs-1712-03449,
+  author    = {Jean-Benoit Delbrouck and
+               St{\'{e}}phane Dupont},
+  title     = {Modulating and attending the source image during encoding improves
+               Multimodal Translation},
+  journal   = {CoRR},
+  volume    = {abs/1712.03449},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/abs-1807-11605,
+  author    = {Hasan Sait Arslan and
+               Mark Fishel and
+               Gholamreza Anbarjafari},
+  title     = {Doubly Attentive Transformer Machine Translation},
+  journal   = {CoRR},
+  volume    = {abs/1807.11605},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/wmt/HelclLV18,
+  author    = {Jindrich Helcl and
+               Jindrich Libovick{\'{y}} and
+               Dusan Varis},
+  title     = {{CUNI} System for the {WMT18} Multimodal Translation Task},
+  pages     = {616--623},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/ijcnlp/ElliottK17,
+  author    = {Desmond Elliott and
+               {\'{A}}kos K{\'{a}}d{\'{a}}r},
+  title     = {Imagination Improves Multimodal Translation},
+  pages     = {130--141},
+  publisher = {International Joint Conference on Natural Language Processing},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/emnlp/ZhouCLY18,
+  author    = {Mingyang Zhou and
+               Runxiang Cheng and
+               Yong Jae Lee and
+               Zhou Yu},
+  title     = {A Visual Attention Grounding Neural Model for Multimodal Machine Translation},
+  pages     = {3643--3653},
+  publisher = {Conference on Empirical Methods in Natural Language Processing},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/acl/CalixtoRA19,
+  author    = {Iacer Calixto and
+               Miguel Rios and
+               Wilker Aziz},
+  title     = {Latent Variable Model for Multi-modal Translation},
+  pages     = {6392--6405},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2019}
+}
+
+@inproceedings{DBLP:conf/acl/YinMSZYZL20,
+  author    = {Yongjing Yin and
+               Fandong Meng and
+               Jinsong Su and
+               Chulun Zhou and
+               Zhengyuan Yang and
+               Jie Zhou and
+               Jiebo Luo},
+  title     = {A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine
+               Translation},
+  pages     = {3025--3035},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/acl/YaoW20,
+  author    = {Shaowei Yao and
+               Xiaojun Wan},
+  title     = {Multimodal Transformer for Multimodal Machine Translation},
+  pages     = {4346--4350},
+  publisher = {Annual Meeting of the Association for Computational Linguistics},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/nips/LuYBP16,
+  author    = {Jiasen Lu and
+               Jianwei Yang and
+               Dhruv Batra and
+               Devi Parikh},
+  title     = {Hierarchical Question-Image Co-Attention for Visual Question Answering},
+  booktitle = {Conference on Neural Information Processing Systems},
+  pages     = {289--297},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/cvpr/VinyalsTBE15,
+  author    = {Oriol Vinyals and
+               Alexander Toshev and
+               Samy Bengio and
+               Dumitru Erhan},
+  title     = {Show and tell: {A} neural image caption generator},
+  pages     = {3156--3164},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2015}
+}
+
+@inproceedings{DBLP:conf/icml/XuBKCCSZB15,
+  author    = {Kelvin Xu and
+               Jimmy Ba and
+               Ryan Kiros and
+               Kyunghyun Cho and
+               Aaron C. Courville and
+               Ruslan Salakhutdinov and
+               Richard S. Zemel and
+               Yoshua Bengio},
+  title     = {Show, Attend and Tell: Neural Image Caption Generation with Visual
+               Attention},
+  volume    = {37},
+  pages     = {2048--2057},
+  publisher = {International Conference on Machine Learning},
+  year      = {2015}
+}
+
+@inproceedings{DBLP:conf/cvpr/YouJWFL16,
+  author    = {Quanzeng You and
+               Hailin Jin and
+               Zhaowen Wang and
+               Chen Fang and
+               Jiebo Luo},
+  title     = {Image Captioning with Semantic Attention},
+  pages     = {4651--4659},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/cvpr/ChenZXNSLC17,
+  author    = {Long Chen and
+               Hanwang Zhang and
+               Jun Xiao and
+               Liqiang Nie and
+               Jian Shao and
+               Wei Liu and
+               Tat-Seng Chua},
+  title     = {{SCA-CNN:} Spatial and Channel-Wise Attention in Convolutional Networks
+               for Image Captioning},
+  pages     = {6298--6306},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2017}
+}
+
+@article{DBLP:journals/pami/FuJCSZ17,
+  author    = {Kun Fu and
+               Junqi Jin and
+               Runpeng Cui and
+               Fei Sha and
+               Changshui Zhang},
+  title     = {Aligning Where to See and What to Tell: Image Captioning with Region-Based
+               Attention and Scene-Specific Contexts},
+  journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
+  volume    = {39},
+  number    = {12},
+  pages     = {2321--2334},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/eccv/YaoPLM18,
+  author    = {Ting Yao and
+               Yingwei Pan and
+               Yehao Li and
+               Tao Mei},
+  title     = {Exploring Visual Relationship for Image Captioning},
+  series    = {Lecture Notes in Computer Science},
+  volume    = {11218},
+  pages     = {711--727},
+  publisher = {European Conference on Computer Vision},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/ijcai/LiuSWWY17,
+  author    = {Chang Liu and
+               Fuchun Sun and
+               Changhu Wang and
+               Feng Wang and
+               Alan L. Yuille},
+  title     = {{MAT:} {A} Multimodal Attentive Translator for Image Captioning},
+  pages     = {4033--4039},
+  publisher = {International Joint Conference on Artificial Intelligence},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/abs-1804-02767,
+  author    = {Joseph Redmon and
+               Ali Farhadi},
+  title     = {{YOLOv3:} An Incremental Improvement},
+  journal   = {CoRR},
+  volume    = {abs/1804.02767},
+  year      = {2018}
+}
+
+@article{DBLP:journals/corr/abs-2004-10934,
+  author    = {Alexey Bochkovskiy and
+               Chien-Yao Wang and
+               Hong-Yuan Mark Liao},
+  title     = {{YOLOv4:} Optimal Speed and Accuracy of Object Detection},
+  journal   = {CoRR},
+  volume    = {abs/2004.10934},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/cvpr/LuXPS17,
+  author    = {Jiasen Lu and
+               Caiming Xiong and
+               Devi Parikh and
+               Richard Socher},
+  title     = {Knowing When to Look: Adaptive Attention via a Visual Sentinel for
+               Image Captioning},
+  pages     = {3242--3250},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/cvpr/00010BT0GZ18,
+  author    = {Peter Anderson and
+               Xiaodong He and
+               Chris Buehler and
+               Damien Teney and
+               Mark Johnson and
+               Stephen Gould and
+               Lei Zhang},
+  title     = {Bottom-Up and Top-Down Attention for Image Captioning and Visual Question
+               Answering},
+  pages     = {6077--6086},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/mm/ZhouXKC17,
+  author    = {Luowei Zhou and
+               Chenliang Xu and
+               Parker A. Koch and
+               Jason J. Corso},
+  title     = {Watch What You Just Said: Image Captioning with Text-Conditional Attention},
+  pages     = {305--313},
+  publisher = {ACM Multimedia},
+  year      = {2017}
+}
+
+@article{DBLP:journals/mta/FangWCT18,
+  author    = {Fang Fang and
+               Hanli Wang and
+               Yihao Chen and
+               Pengjie Tang},
+  title     = {Looking deeper and transferring attention for image captioning},
+  journal   = {Multimedia Tools and Applications},
+  volume    = {77},
+  number    = {23},
+  pages     = {31159--31175},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/cvpr/AnejaDS18,
+  author    = {Jyoti Aneja and
+               Aditya Deshpande and
+               Alexander G. Schwing},
+  title     = {Convolutional Image Captioning},
+  pages     = {5561--5570},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2018}
+}
+
+@article{DBLP:journals/corr/abs-1805-09019,
+  author    = {Qingzhong Wang and
+               Antoni B. Chan},
+  title     = {{CNN+CNN:} Convolutional Decoders for Image Captioning},
+  journal   = {CoRR},
+  volume    = {abs/1805.09019},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/eccv/DaiYL18,
+  author    = {Bo Dai and
+               Deming Ye and
+               Dahua Lin},
+  title     = {Rethinking the Form of Latent States in Image Captioning},
+  volume    = {11209},
+  pages     = {294--310},
+  publisher = {European Conference on Computer Vision},
+  year      = {2018}
+}
+
+@inproceedings{DBLP:conf/iccv/AntolALMBZP15,
+  author    = {Stanislaw Antol and
+               Aishwarya Agrawal and
+               Jiasen Lu and
+               Margaret Mitchell and
+               Dhruv Batra and
+               C. Lawrence Zitnick and
+               Devi Parikh},
+  title     = {{VQA:} Visual Question Answering},
+  pages     = {2425--2433},
+  publisher = {International Conference on Computer Vision},
+  year      = {2015}
+}
+
+@inproceedings{DBLP:conf/eccv/CarionMSUKZ20,
+  author    = {Nicolas Carion and
+               Francisco Massa and
+               Gabriel Synnaeve and
+               Nicolas Usunier and
+               Alexander Kirillov and
+               Sergey Zagoruyko},
+  title     = {End-to-End Object Detection with Transformers},
+  volume    = {12346},
+  pages     = {213--229},
+  publisher = {European Conference on Computer Vision},
+  year      = {2020}
+}
+
+@article{DBLP:journals/tcsv/YuLYH20,
+  author    = {Jun Yu and
+               Jing Li and
+               Zhou Yu and
+               Qingming Huang},
+  title     = {Multimodal Transformer With Multi-View Visual Representation for Image
+               Captioning},
+  journal   = {IEEE Transactions on Circuits and Systems for Video Technology},
+  volume    = {30},
+  number    = {12},
+  pages     = {4467--4480},
+  year      = {2020}
+}
+
+@article{Huasong2020SelfAdaptiveNM,
+  title={Self-Adaptive Neural Module Transformer for Visual Question Answering},
+  author={Huasong Zhong and Jingyuan Chen and Chen Shen and Hanwang Zhang and Jianqiang Huang and Xian-Sheng Hua},
+  journal={IEEE Transactions on Multimedia},
+  year={2020},
+  pages={1-1}
+}
+
+@inproceedings{DBLP:conf/emnlp/GokhaleBBY20,
+  author    = {Tejas Gokhale and
+               Pratyay Banerjee and
+               Chitta Baral and
+               Yezhou Yang},
+  title     = {{MUTANT:} {A} Training Paradigm for Out-of-Distribution Generalization
+               in Visual Question Answering},
+  pages     = {878--892},
+  publisher = {Conference on Empirical Methods in Natural Language Processing},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/eccv/Tang0ZWY20,
+  author    = {Ruixue Tang and
+               Chao Ma and
+               Wei Emma Zhang and
+               Qi Wu and
+               Xiaokang Yang},
+  title     = {Semantic Equivalent Adversarial Data Augmentation for Visual Question
+               Answering},
+  volume    = {12364},
+  pages     = {437--453},
+  publisher = {European Conference on Computer Vision},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/eccv/Li0LZHZWH0WCG20,
+  author    = {Xiujun Li and
+               Xi Yin and
+               Chunyuan Li and
+               Pengchuan Zhang and
+               Xiaowei Hu and
+               Lei Zhang and
+               Lijuan Wang and
+               Houdong Hu and
+               Li Dong and
+               Furu Wei and
+               Yejin Choi and
+               Jianfeng Gao},
+  title     = {Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks},
+  volume    = {12375},
+  pages     = {121--137},
+  publisher = {European Conference on Computer Vision},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/aaai/ZhouPZHCG20,
+  author    = {Luowei Zhou and
+               Hamid Palangi and
+               Lei Zhang and
+               Houdong Hu and
+               Jason J. Corso and
+               Jianfeng Gao},
+  title     = {Unified Vision-Language Pre-Training for Image Captioning and {VQA}},
+  pages     = {13041--13049},
+  publisher = {AAAI Conference on Artificial Intelligence},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/iclr/SuZCLLWD20,
+  author    = {Weijie Su and
+               Xizhou Zhu and
+               Yue Cao and
+               Bin Li and
+               Lewei Lu and
+               Furu Wei and
+               Jifeng Dai},
+  title     = {{VL-BERT:} Pre-training of Generic Visual-Linguistic Representations},
+  publisher = {International Conference on Learning Representations},
+  year      = {2020}
+}
+
+@inproceedings{DBLP:conf/nips/GoodfellowPMXWOCB14,
+  author    = {Ian J. Goodfellow and
+               Jean Pouget-Abadie and
+               Mehdi Mirza and
+               Bing Xu and
+               David Warde-Farley and
+               Sherjil Ozair and
+               Aaron C. Courville and
+               Yoshua Bengio},
+  title     = {Generative Adversarial Nets},
+  publisher = {Conference on Neural Information Processing Systems},
+  pages     = {2672--2680},
+  year      = {2014}
+}
+
+@inproceedings{DBLP:conf/nips/ZhuZPDEWS17,
+  author    = {Jun-Yan Zhu and
+               Richard Zhang and
+               Deepak Pathak and
+               Trevor Darrell and
+               Alexei A. Efros and
+               Oliver Wang and
+               Eli Shechtman},
+  title     = {Toward Multimodal Image-to-Image Translation},
+  publisher = {Conference on Neural Information Processing Systems},
+  pages     = {465--476},
+  year      = {2017}
+}
+
+@article{DBLP:journals/corr/abs-1908-06616,
+  author    = {Hajar Emami and
+               Majid Moradi Aliabadi and
+               Ming Dong and
+               Ratna Babu Chinnam},
+  title     = {{SPA-GAN:} Spatial Attention {GAN} for Image-to-Image Translation},
+  journal   = {CoRR},
+  volume    = {abs/1908.06616},
+  year      = {2019}
+}
+
+@article{DBLP:journals/access/XiongWG19,
+  author    = {Feng Xiong and
+               Qianqian Wang and
+               Quanxue Gao},
+  title     = {Consistent Embedded {GAN} for Image-to-Image Translation},
+  journal   = {{IEEE} Access},
+  volume    = {7},
+  pages     = {126651--126661},
+  year      = {2019}
+}
+
+@inproceedings{DBLP:conf/iccv/ZhuPIE17,
+  author    = {Jun-Yan Zhu and
+               Taesung Park and
+               Phillip Isola and
+               Alexei A. Efros},
+  title     = {Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial
+               Networks},
+  pages     = {2242--2251},
+  publisher = {International Conference on Computer Vision},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/iccv/YiZTG17,
+  author    = {Zili Yi and
+               Hao (Richard) Zhang and
+               Ping Tan and
+               Minglun Gong},
+  title     = {{DualGAN:} Unsupervised Dual Learning for Image-to-Image Translation},
+  pages     = {2868--2876},
+  publisher = {International Conference on Computer Vision},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/nips/LiuBK17,
+  author    = {Ming-Yu Liu and
+               Thomas Breuel and
+               Jan Kautz},
+  title     = {Unsupervised Image-to-Image Translation Networks},
+  publisher = {Conference on Neural Information Processing Systems},
+  pages     = {700--708},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/cvpr/IsolaZZE17,
+  author    = {Phillip Isola and
+               Jun-Yan Zhu and
+               Tinghui Zhou and
+               Alexei A. Efros},
+  title     = {Image-to-Image Translation with Conditional Adversarial Networks},
+  pages     = {5967--5976},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/icml/ReedAYLSL16,
+  author    = {Scott E. Reed and
+               Zeynep Akata and
+               Xinchen Yan and
+               Lajanugen Logeswaran and
+               Bernt Schiele and
+               Honglak Lee},
+  title     = {Generative Adversarial Text to Image Synthesis},
+  volume    = {48},
+  pages     = {1060--1069},
+  publisher = {International Conference on Machine Learning},
+  year      = {2016}
+}
+
+@article{DBLP:journals/corr/DashGALA17,
+  author    = {Ayushman Dash and
+               John Cristian Borges Gamboa and
+               Sheraz Ahmed and
+               Marcus Liwicki and
+               Muhammad Zeshan Afzal},
+  title     = {{TAC-GAN} - Text Conditioned Auxiliary Classifier Generative Adversarial
+               Network},
+  journal   = {CoRR},
+  volume    = {abs/1703.06412},
+  year      = {2017}
+}
+
+@inproceedings{DBLP:conf/nips/ReedAMTSL16,
+  author    = {Scott E. Reed and
+               Zeynep Akata and
+               Santosh Mohan and
+               Samuel Tenka and
+               Bernt Schiele and
+               Honglak Lee},
+  title     = {Learning What and Where to Draw},
+  publisher = {Conference on Neural Information Processing Systems},
+  pages     = {217--225},
+  year      = {2016}
+}
+
+@inproceedings{DBLP:conf/cvpr/ZhangXY18,
+  author    = {Zizhao Zhang and
+               Yuanpu Xie and
+               Lin Yang},
+  title     = {Photographic Text-to-Image Synthesis With a Hierarchically-Nested
+               Adversarial Network},
+  pages     = {6199--6208},
+  publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
+  year      = {2018}
+}
+
 %%%%% chapter 17------------------------------------------------------
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%