Commit be0790f5 by 曹润柘

update 17.2

parent 34324868
......@@ -253,7 +253,8 @@
\subsection{基于图像增强的文本翻译}
\parinterval 在文本翻译中引入图像信息是最典型的多模态机器翻译任务。虽然多模态机器翻译还是一种从源语言文字到目标语言文字的转换,但是在转换的过程中,融入了其他模态的信息减少了歧义的产生。例如前文提到的通过与源语言相关的图像信息,将“A medium sized child jumps off of a dusty bank”中“bank”译为“河岸”而不是“银行”,通过给定一张相关的图片,机器翻译模型就可以利用视觉信息更好的理解歧义词,避免产生歧义。换句话说,对于同一图像或者视觉场景的描述,源语言和目标语言描述的本质意义是一致的,只不过,体现在语言上会有表达方法上的差异。那么,图像就会存在一些源语言和目标语言的隐含对齐“约束”,将这种“约束”融入到机器翻译系统,会让模型加深对某些歧义词语上下文的理解,从而进一步提高机器翻译质量。
\parinterval WMT机器翻译评测在2016年首次将融合图像和文本的多模态机器翻译作为机器翻译和跨语言图像描述的共享任务[2],这项任务也受到了广泛的研究[5-6]。如何融入视觉信息,更好的理解多模态上下文语义是多模态机器翻译研究的热点,大体的研究方向包括基于特征融合的方法[7,15, 17]、基于多任务学习的方法[18,21]。接下来将从这两个方向,对多模态机器翻译的研究展开介绍。
\parinterval WMT机器翻译评测在2016年首次将融合图像和文本的多模态机器翻译作为机器翻译和跨语言图像描述的共享任务\upcite{DBLP:conf/wmt/SpeciaFSE16},这项任务也受到了广泛的研究\upcite{DBLP:conf/wmt/CaglayanABGBBMH17,DBLP:conf/wmt/LibovickyHTBP16}。如何融入视觉信息,更好的理解多模态上下文语义是多模态机器翻译研究的热点,大体的研究方向包括基于特征融合的方法\upcite{DBLP:conf/emnlp/CalixtoL17,DBLP:journals/corr/abs-1712-03449,DBLP:conf/wmt/HelclLV18}、基于多任务学习的方法\upcite{DBLP:conf/ijcnlp/ElliottK17,DBLP:conf/acl/YinMSZYZL20}。接下来将从这两个方向,对多模态机器翻译的研究展开介绍。
%----------------------------------------------------------------------------------------
% NEW SUBSUB-SECTION
......@@ -261,7 +262,7 @@
\subsubsection{1. 基于特征融合的方法}
\parinterval 较为早期的研究工作通常将图像信息作为输入句子的一部分[7-8],或者用其对编码器、解码器的状态进行初始化[7, 9-10]。如图2所示,对图像特征的提取通常是基于卷积神经网络,有关卷积神经网络的内容,请参考{\chaptereleven}内容。通过卷积神经网络得到全局视觉特征,在进行维度变换后,将其作为源语言输入的一部分或者初始化状态引入到模型当中。但是,这种图像信息的引入方式有以下两个缺点:
\parinterval 较为早期的研究工作通常将图像信息作为输入句子的一部分\upcite{DBLP:conf/emnlp/CalixtoL17,DBLP:conf/wmt/HuangLSOD16},或者用其对编码器、解码器的状态进行初始化\upcite{DBLP:conf/emnlp/CalixtoL17,Elliott2015MultilingualID,DBLP:conf/wmt/MadhyasthaWS17}。如图2所示,对图像特征的提取通常是基于卷积神经网络,有关卷积神经网络的内容,请参考{\chaptereleven}内容。通过卷积神经网络得到全局视觉特征,在进行维度变换后,将其作为源语言输入的一部分或者初始化状态引入到模型当中。但是,这种图像信息的引入方式有以下两个缺点:
\begin{itemize}
\vspace{0.5em}
......@@ -305,7 +306,7 @@
\noindent 其中,${\alpha}_{i,j}$是注意力权重,它表示目标语言第j个位置与图片编码状态序列第i个位置的相关性大小,计算方式与{\chapterten}描述的注意力函数一致。
\parinterval 这里,将每个时间步编码器的输出$\mathbi{h}_{i}$看作源图像序列位置$i$的表示结果。图3说明了模型在生成目标词“man”时,图像经过注意力机制对图像区域关注度的可视化效果,可以看到,经过注意力机制后,模型更注重的是与目标词相关的图像部分。当然,多模态机器翻译的输入还包括源语言文字序列。通常,源语言文字对于翻译的作用比图像更大[23]。从这个角度说,图像信息更多的是作为文字信息的补充,而不是替代。除此之外,注意力机制在多模态机器翻译中也有很多研究,不仅仅在解码器端将经过注意力机制的文本特征和视觉特征作为解码输入的一部分,还有的工作在编码端将源语言与图像信息进行注意力建模[22,23],得到更好的源语言特征表示。
\parinterval 这里,将每个时间步编码器的输出$\mathbi{h}_{i}$看作源图像序列位置$i$的表示结果。图3说明了模型在生成目标词“man”时,图像经过注意力机制对图像区域关注度的可视化效果,可以看到,经过注意力机制后,模型更注重的是与目标词相关的图像部分。当然,多模态机器翻译的输入还包括源语言文字序列。通常,源语言文字对于翻译的作用比图像更大\upcite{DBLP:conf/acl/YaoW20}。从这个角度说,图像信息更多的是作为文字信息的补充,而不是替代。除此之外,注意力机制在多模态机器翻译中也有很多研究,不仅仅在解码器端将经过注意力机制的文本特征和视觉特征作为解码输入的一部分,还有的工作在编码端将源语言与图像信息进行注意力建模\upcite{DBLP:journals/corr/abs-1712-03449,DBLP:conf/acl/YaoW20},得到更好的源语言特征表示。
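\parinterval As a minimal sketch of how these attention weights can be obtained, assuming the standard additive attention form, the weight over image positions can be written as below. The score $\beta_{i,j}$, the decoder state $\mathbi{s}_{j-1}$, and the scoring function $a(\cdot)$ are notation assumed here for illustration and are not taken from the text above:
\begin{eqnarray}
\beta_{i,j} &=& a(\mathbi{s}_{j-1},\mathbi{h}_{i}) \nonumber \\
{\alpha}_{i,j} &=& \frac{\exp(\beta_{i,j})}{\sum_{i'}\exp(\beta_{i',j})} \nonumber
\end{eqnarray}
\noindent For each target position $j$, the weights ${\alpha}_{i,j}$ over all image positions $i$ are non-negative and sum to one, so image regions with a higher score $\beta_{i,j}$ receive more attention.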
%----------------------------------------------------------------------------------------
% NEW SUBSUB-SECTION
......@@ -315,7 +316,7 @@
\parinterval 基于多任务学习的方法通常是把翻译任务与其他视觉任务结合,进行联合训练。在{\chapterfifteen}{\chaptersixteen}已经提到过多任务学习。一种常见的多任务学习框架是针对多个相关的任务,共享模型的部分参数来学习不同任务之间相似的部分,并通过特定的模块来学习每个任务特有的部分。在多模态机器翻译中,应用多任务学习的主要策略就是将翻译作为主任务,同时设置一些与其他模态相关的子任务,通过这些子任务来辅助源语言理解自身的语言知识。
\parinterval 如图4所示,可以将多模态机器翻译任务分解为两个子任务:机器翻译和图片生成[18]。其中机器翻译作为主任务,图片生成作为子任务,图片生成这里指的是从一个图片描述生成对应图片,对于图片生成任务在后面叙述。通过单个编码器对源语言数据进行建模,然后通过两个解码器(翻译解码器和图像解码器)来学习翻译任务和图像生成任务。顶层任务学习每个任务的独立特征,底层共享参数层能够学习到更丰富的文本特征表示。另外在视觉问答领域有研究表明[24],在多模态任务中,不宜引入多层的注意力,因为多层注意力会导致模型严重的过拟合,从另一角度来说,利用多任务学习的方式,提高模型的泛化能力,也是一种有效防止过拟合现象的方式。类似的思想,也大量使用在多模态自然语言处理中,例如图像描述生成、视觉问答[42]等。
\parinterval 如图4所示,可以将多模态机器翻译任务分解为两个子任务:机器翻译和图片生成\upcite{DBLP:conf/ijcnlp/ElliottK17}。其中机器翻译作为主任务,图片生成作为子任务,图片生成这里指的是从一个图片描述生成对应图片,对于图片生成任务在后面叙述。通过单个编码器对源语言数据进行建模,然后通过两个解码器(翻译解码器和图像解码器)来学习翻译任务和图像生成任务。顶层任务学习每个任务的独立特征,底层共享参数层能够学习到更丰富的文本特征表示。另外在视觉问答领域有研究表明\upcite{DBLP:conf/nips/LuYBP16},在多模态任务中,不宜引入多层的注意力,因为多层注意力会导致模型严重的过拟合,从另一角度来说,利用多任务学习的方式,提高模型的泛化能力,也是一种有效防止过拟合现象的方式。类似的思想,也大量使用在多模态自然语言处理中,例如图像描述生成、视觉问答\upcite{DBLP:conf/iccv/AntolALMBZP15}等。
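\parinterval One common way to train such a shared-encoder model is to interpolate the losses of the two tasks. The following is only a hedged sketch, not the exact objective of the cited work; $L_{\textrm{mt}}$, $L_{\textrm{img}}$, and $\lambda$ are symbols assumed here for illustration:
\begin{eqnarray}
L &=& \lambda L_{\textrm{mt}} + (1-\lambda) L_{\textrm{img}}, \quad \lambda \in [0,1] \nonumber
\end{eqnarray}
\noindent Here $L_{\textrm{mt}}$ is the loss of the translation decoder and $L_{\textrm{img}}$ is the loss of the image decoder. Both terms propagate gradients into the shared encoder, while each decoder is updated only by its own term; a larger $\lambda$ gives the main translation task more weight.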
%----------------------------------------------------------------------------------------------------
\begin{table}[htp]
......@@ -341,7 +342,7 @@
\end{table}
%----------------------------------------------------------------------------------------------------
\parinterval 传统图像描述生成有两种范式:基于检索的方法和基于模板的方法。其中基于检索的方法(图5左)是指在指定的图像描述候选句子中选择其中的句子作为图像的描述,这种方法的弊端是所选择的句子可能会和图像很大程度上不相符。而基于模板的方法(图5右)是指在图像上检测视觉特征,然后把内容填在实现设计好的模板当中,这种方法的缺点是生成的图像描述过于呆板,‘像是在一个模子中刻出来的’说的就是这个意思。近几年来 ,由于卷积神经网络在计算机视觉领域效果显著,而循环神经网络在自然语言处理领域卓有成效,受到机器翻译领域编码器-解码器框架的启发,逐渐的,这种基于卷积神经网络作为编码器编码图像,循环神经网络作为解码器解码描述的编码器-解码器框架成了图像描述任务的基础范式。本章节,从基础的图像描述范式编码器-解码器框架展开[25,26],从编码器的改进、解码器的改进展开介绍。
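\parinterval Traditional image captioning follows two paradigms: retrieval-based methods and template-based methods. Retrieval-based methods (Figure 5, left) select a sentence from a given set of candidate captions as the description of the image; their drawback is that the selected sentence may match the image poorly. Template-based methods (Figure 5, right) detect visual features in the image and fill the detected content into a template designed in advance; their drawback is that the generated captions are rigid, as if “cast from the same mold”. In recent years, convolutional neural networks have proven highly effective in computer vision, and recurrent neural networks in natural language processing. Inspired by the encoder-decoder framework of machine translation, a framework that uses a convolutional neural network as the encoder to encode the image and a recurrent neural network as the decoder to generate the caption has gradually become the basic paradigm for image captioning. This section starts from this basic encoder-decoder paradigm\upcite{DBLP:conf/cvpr/VinyalsTBE15,DBLP:conf/icml/XuBKCCSZB15} and then introduces improvements to the encoder and to the decoder.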
%----------------------------------------------------------------------------------------
% NEW SUBSUB-SECTION
......@@ -349,7 +350,7 @@
\subsubsection{1. 基础框架}
\parinterval 受到神经机器翻译的启发,编码器-解码器框架也应用到图像描述任务当中。其中,编码器将输入的图像转换为一种新的“表示”形式,这种表示包含了输入图像的所有信息。之后解码器把这种“表示”重新转换为输出的描述。图XX中(上)是编码器-解码器框架在图像描述生成的应用[25]。首先,通过卷积神经网络提取图像特征到一个合适的长度向量表示。然后,利用长短时记忆网络(LSTM)解码生成文字描述,这个过程中与机器翻译解码过程类似。这种建模方式存在一定的短板:生成的描述单词不一定需要所有的图像信息,将全局的图像信息送入模型中,可能会引入噪音,使这种“表示”形式不准确。针对这个问题,图XX(下)[26]为了弥补这种建模的局限性,引入了注意力机制。利用注意力机制在生成不同单词时,使模型不再只关注图像的全局特征,而是关注“应该”关注的图像特征。
\parinterval 受到神经机器翻译的启发,编码器-解码器框架也应用到图像描述任务当中。其中,编码器将输入的图像转换为一种新的“表示”形式,这种表示包含了输入图像的所有信息。之后解码器把这种“表示”重新转换为输出的描述。图XX中(上)是编码器-解码器框架在图像描述生成的应用\upcite{DBLP:conf/cvpr/VinyalsTBE15}。首先,通过卷积神经网络提取图像特征到一个合适的长度向量表示。然后,利用长短时记忆网络(LSTM)解码生成文字描述,这个过程中与机器翻译解码过程类似。这种建模方式存在一定的短板:生成的描述单词不一定需要所有的图像信息,将全局的图像信息送入模型中,可能会引入噪音,使这种“表示”形式不准确。针对这个问题,图XX(下)\upcite{DBLP:conf/icml/XuBKCCSZB15}为了弥补这种建模的局限性,引入了注意力机制。利用注意力机制在生成不同单词时,使模型不再只关注图像的全局特征,而是关注“应该”关注的图像特征。
%----------------------------------------------------------------------------------------------------
\begin{table}[htp]
......@@ -375,9 +376,9 @@
\subsubsection{2. 编码器的改进}
\parinterval 要想使编码器-解码器框架在图像描述中充分发挥作用,编码器也要更好的表示图像信息。对于编码器的改进,大多也是从这个方向出发。通常,体现在向编码器中添加图像的语义信息[27,28,29]和位置信息[28,31]
\parinterval 要想使编码器-解码器框架在图像描述中充分发挥作用,编码器也要更好的表示图像信息。对于编码器的改进,大多也是从这个方向出发。通常,体现在向编码器中添加图像的语义信息\upcite{DBLP:conf/cvpr/YouJWFL16,DBLP:conf/cvpr/ChenZXNSLC17,DBLP:journals/pami/FuJCSZ17}和位置信息\upcite{DBLP:conf/cvpr/ChenZXNSLC17,DBLP:conf/ijcai/LiuSWWY17}
\parinterval 图像的语义信息一般是指图像中存在的实体、属性、场景等等。如图XX所示,从图像中利用属性或实体检测器提取出“child”、“river”、“bank”等等的属性词和实体词作为图像的语义信息,提取全局的图像特征初始化循环神经网络,再利用注意力机制计算目标词与属性词或实体词之间的注意力权重,根据该权重计算上下文向量,从而将编码语义信息送入解码端[27],在解码‘bank’单词时,会更关注图像语义信息中的‘bank’。当然,除了图像中的实体和属性作为语义信息外,也可以将图片的场景信息也加入到编码器当中[29]。有关如何做属性、实体和场景的检测,涉及到目标检测任务的工作,例如Faster-RCNN[32]、YOLO[33,34]等等,这里不过多赘述。
\parinterval 图像的语义信息一般是指图像中存在的实体、属性、场景等等。如图XX所示,从图像中利用属性或实体检测器提取出“child”、“river”、“bank”等等的属性词和实体词作为图像的语义信息,提取全局的图像特征初始化循环神经网络,再利用注意力机制计算目标词与属性词或实体词之间的注意力权重,根据该权重计算上下文向量,从而将编码语义信息送入解码端\upcite{DBLP:conf/cvpr/YouJWFL16},在解码‘bank’单词时,会更关注图像语义信息中的‘bank’。当然,除了图像中的实体和属性作为语义信息外,也可以将图片的场景信息也加入到编码器当中\upcite{DBLP:journals/pami/FuJCSZ17}。有关如何做属性、实体和场景的检测,涉及到目标检测任务的工作,例如Faster-RCNN\upcite{DBLP:journals/pami/RenHG017}、YOLO\upcite{DBLP:journals/corr/abs-1804-02767,DBLP:journals/corr/abs-2004-10934}等等,这里不过多赘述。
%----------------------------------------------------------------------------------------------------
\begin{table}[htp]
......@@ -387,7 +388,7 @@
\end{table}
%----------------------------------------------------------------------------------------------------
\parinterval 以上的方法大都是将图像中的实体、属性、场景等映射到文字上,并把这些信息显式地添加到编码器端。令一种方式,把图像中的语义特征隐式地作用到编码器端[28]。例如,可以图像数据可以分解为三个通道(红、绿、蓝),简单来说,就是将图像的每一个像素点按照红色、绿色、蓝色分成三个部分,这样就将图像分成了三个通道。在很多图像中,不同通道随伴随的特征是不一样的,可以将其作用于编码器端。另一种方法是基于位置信息的编码器增强。位置信息指的是图像中对象(物体)的位置。利用目标检测技术检测系统获得图中的对象和对应的特征,这样就确定了图中的对象位置。显然,这些信息也可以加入到编码端,以加强编码器的表示能力[30]
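\parinterval Most of the methods above map the entities, attributes, and scenes in an image to words and add this information to the encoder explicitly. Another approach lets the semantic features of the image act on the encoder implicitly\upcite{DBLP:conf/cvpr/ChenZXNSLC17}. For example, image data can be decomposed into three channels (red, green, blue): put simply, each pixel is split into its red, green, and blue components, which yields three channels. In many images, different channels carry different features, and these can be applied at the encoder. A further approach enhances the encoder with position information, that is, the positions of the objects in the image. An object detection system is used to obtain the objects in the image and their corresponding features, which determines where the objects are located. This information can also be added to the encoder to strengthen its representation ability\upcite{DBLP:conf/eccv/YaoPLM18}.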
%----------------------------------------------------------------------------------------
% NEW SUBSUB-SECTION
......@@ -395,8 +396,8 @@
\subsubsection{3. 解码器的改进}
\parinterval 由于解码器输出的是语言文字序列,因此需要考虑语言的特点对其进行改进。 例如,解码过程中, “the”,“on”,“at”这种介词或者冠词与图像的相关性较低,这时图像信息的引入就会产生负面影响[35]。因此,可以通过门等结构,控制视觉信号作用于文字生成的程度。另外,在解码过程中,生成的每个单词对应着图像的区域可能是不同的。因此也可以设计更为有效的注意力机制来捕捉解码端对不同图像局部信息的关注程度[36]
\parinterval 除了在解码端更好的使生成文本与图像特征相互作用以外,还有一些其他的解码器端改进的方向。例如:用其它结构(如卷积神经网络或者Transformer)代替解码器端循环神经网络[39]。或者使用更深层的神经网络学习动词或者名词等视觉中不易表现出来的单词[38](这个参考文献层次有些低,我怕引用了有些问题。不过这个观点还是很有意思的,可以先确定文献的正规性,或者有没有顶会做类似事情的),其思想与深层神经机器翻译模型有相通之处({\chapterfifteen})。
\parinterval 由于解码器输出的是语言文字序列,因此需要考虑语言的特点对其进行改进。 例如,解码过程中, “the”,“on”,“at”这种介词或者冠词与图像的相关性较低,这时图像信息的引入就会产生负面影响\upcite{DBLP:conf/cvpr/LuXPS17}。因此,可以通过门等结构,控制视觉信号作用于文字生成的程度。另外,在解码过程中,生成的每个单词对应着图像的区域可能是不同的。因此也可以设计更为有效的注意力机制来捕捉解码端对不同图像局部信息的关注程度\upcite{DBLP:conf/cvpr/00010BT0GZ18}
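\parinterval A minimal sketch of such a gate is given below, assuming a sigmoid gate computed from the decoder state. The symbols $\mathbi{g}_{j}$, $\mathbi{s}_{j}$, $\mathbi{c}_{j}$, $\mathbi{W}$, and $\mathbi{b}$ are hypothetical and do not reproduce any particular cited model:
\begin{eqnarray}
\mathbi{g}_{j} &=& \sigma(\mathbi{W}\mathbi{s}_{j}+\mathbi{b}) \nonumber \\
\tilde{\mathbi{c}}_{j} &=& \mathbi{g}_{j} \odot \mathbi{c}_{j} \nonumber
\end{eqnarray}
\noindent Here $\mathbi{c}_{j}$ is the visual context vector at decoding step $j$, $\sigma(\cdot)$ is the element-wise sigmoid function, and $\odot$ is element-wise multiplication. When the entries of $\mathbi{g}_{j}$ are close to 0, for instance while generating a function word such as “the”, the visual signal is largely suppressed; when they are close to 1, it passes through almost unchanged.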
\parinterval 除了在解码端更好的使生成文本与图像特征相互作用以外,还有一些其他的解码器端改进的方向。例如:用其它结构(如卷积神经网络或者Transformer)代替解码器端循环神经网络\upcite{DBLP:conf/cvpr/AnejaDS18}。或者使用更深层的神经网络学习动词或者名词等视觉中不易表现出来的单词\upcite{DBLP:journals/mta/FangWCT18},其思想与深层神经机器翻译模型有相通之处({\chapterfifteen})。
%----------------------------------------------------------------------------------------
% NEW SUB-SECTION
......@@ -408,9 +409,9 @@
\parinterval 计算机视觉领域,图像风格转移、图像语义分割、图像超分辨率等任务,都可以被视为{\small\bfnew{图像到图像的翻译}}\index{图像到图像的翻译}(Image-to-Image Translation)\index{Image-to-Image Translation}问题。与机器翻译类似,这些问题的共同目标是学习从一个对象到另一个对象的映射,只不过这里的对象是指图像,而非机器翻译中的文字。例如,给定物体的轮廓生成真实物体照片或者给定白天照片生成夜晚的照片等。图像到图像的翻译有广阔的应用场景,如图片补全、风格迁移等。
\parinterval 对抗神经网络被广泛地应用再图像到图像的翻译任务当中[53,54,55]。实际上,这类方法非常适合图像生成类的任务。简单来说,对抗生成网络包括两个部分分别是:生成器和判别器。基于输入生成器生成一个结果,而判别器要判别生成的结果和真实结果是否是相同的,对抗的思想是,通过强化生成器的生成能力和判别器的判别能力,当生成器生成的结果可以“骗”过判别器时,即判别器无法分清真实结果和生成结果,认为模型学到了这种映射关系。在图像到图像的翻译中,根据输入图像,生成器生成预测图像,判别器判别是否为目标图像,多次迭代后,生成图像被判别为目标图像时,则模型学习到了“翻译能力”。以上的工作都是有监督的,即基于对齐的图像对数据集,但是,这种数据的标注是极为费时费力的,所以有很多的工作也基于无监督的方法展开[57,58,59],这里不过多赘述。
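\parinterval Generative adversarial networks are widely used in image-to-image translation\upcite{DBLP:conf/nips/GoodfellowPMXWOCB14,DBLP:conf/nips/ZhuZPDEWS17,DBLP:journals/corr/abs-1908-06616}. In fact, this class of methods is very well suited to image generation tasks. Put simply, a generative adversarial network consists of two parts: a generator and a discriminator. The generator produces a result from its input, and the discriminator judges whether a given result is real or generated. The adversarial idea is to strengthen both the generator's ability to generate and the discriminator's ability to discriminate. Once the generator's outputs can “fool” the discriminator, that is, the discriminator can no longer tell real results from generated ones, the model is considered to have learned the mapping. In image-to-image translation, the generator produces a predicted image from the input image and the discriminator judges whether it is the target image; after many iterations, when generated images are judged to be target images, the model has learned the “translation ability”. The work above is supervised, that is, based on datasets of aligned image pairs. Annotating such data is extremely time-consuming and labor-intensive, so many studies adopt unsupervised methods instead\upcite{DBLP:conf/iccv/ZhuPIE17,DBLP:conf/iccv/YiZTG17,DBLP:conf/nips/LiuBK17}; these are not discussed further here.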
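\parinterval For reference, the adversarial game between the two parts can be written as a min-max objective. The conditional form below, with input image $x$ and target image $y$, is a sketch under the paired (supervised) setting; the symbols $G$, $D$, $x$, and $y$ are notation assumed here:
\begin{eqnarray}
\min_{G}\max_{D} & & \mathbb{E}_{x,y}[\log D(x,y)] + \mathbb{E}_{x}[\log(1-D(x,G(x)))] \nonumber
\end{eqnarray}
\noindent The discriminator $D$ is trained to assign a high score to real pairs $(x,y)$ and a low score to generated pairs $(x,G(x))$, while the generator $G$ is trained to make $D(x,G(x))$ as large as possible, that is, to produce images the discriminator cannot distinguish from target images.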
\parinterval {\small\bfnew{文本到图像的翻译}}\index{文本到图像的翻译}(Text-to-Image Translation)\index{Text-to-Image Translation}是指给定描述物体颜色和形状等细节的一自然语言文字,生成对应的图像。该任务也可以看作是图像描述任务的逆任务。目前方法上大部分基于对抗神经网络[61,62,63]。基本流程为:首先利用自然语言处理技术提取出文本信息,然后再用文本特征作为后面生成图像的约束,在对抗神经网络中生成器(Generator)中根据文本特征生成图像的约束,从而别鉴别器(Discriminator)鉴定其生成效果。
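\parinterval {\small\bfnew{Text-to-image translation}}\index{文本到图像的翻译}(Text-to-Image Translation)\index{Text-to-Image Translation} refers to generating an image from a piece of natural language text that describes details such as the color and shape of an object. The task can also be seen as the inverse of image captioning. Most current methods are based on generative adversarial networks\upcite{DBLP:conf/icml/ReedAYLSL16,DBLP:journals/corr/DashGALA17,DBLP:conf/nips/ReedAMTSL16}. The basic pipeline is as follows: natural language processing techniques first extract features from the text; these text features then serve as the constraint for image generation, with the generator (Generator) producing an image conditioned on them; finally, the discriminator (Discriminator) assesses the quality of the generated image.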
%----------------------------------------------------------------------------------------
% NEW SECTION
......@@ -638,11 +639,11 @@ D_i&\subseteq&\{X_{-i},Y_{-i}\} \label{eq:17-3-2}
\section{小结及扩展阅读}
\parinterval 本章仅对音频处理和语音识别进行了简单的介绍,具体内容可以参考一些经典书籍,比如关于信号处理的基础知识\upcite{[Discrete-Time Signal Processing (3rd version)][ Discrete-Time Speech Signal Processing: Principles and Practice]},以及语音识别的传统方法\upcite{[Fundamentals of Speech Recognition][ Spoken Language Processing: A Guide to Theory, Algorithm, and System Development]}和基于深度学习的最新方法\upcite{[ Automatic Speech Recognition: A Deep Learning Approach, 俞栋、邓力]}。此外,语音翻译的一个重要应用是机器同声传译。
\parinterval 本章仅对音频处理和语音识别进行了简单的介绍,具体内容可以参考一些经典书籍,比如关于信号处理的基础知识\upcite{Oppenheim2001DiscretetimeSP,Quatieri2001DiscreteTimeSS},以及语音识别的传统方法\upcite{DBLP:books/daglib/0071550,Huang2001SpokenLP}和基于深度学习的最新方法\upcite{benesty2008automatic}。此外,语音翻译的一个重要应用是机器同声传译。
\parinterval 在篇章级翻译方面,一些研究工作对这类模型的上下文建模能力进行了探索\upcite{DBLP:conf/discomt/KimTN19,DBLP:conf/acl/LiLWJXZLL20},发现模型性能在小数据集上的BLEU提升并不完全来自于上下文信息的利用。同时,受限于数据规模,篇章级翻译模型相对难以训练。一些研究人员通过调整训练策略来帮助模型更容易捕获上下文信息\upcite{DBLP:journals/corr/abs-1903-04715,DBLP:conf/acl/SaundersSB20,DBLP:conf/mtsummit/StojanovskiF19}。除了训练策略的调整,也可以使用数据增强\upcite{DBLP:conf/discomt/SugiyamaY19}和预训练\upcite{DBLP:journals/corr/abs-1911-03110,DBLP:journals/tacl/LiuGGLEGLZ20}的手段来缓解数据稀缺的问题。此外,区别于传统的篇章级翻译,一些对话翻译也需要使用长距离上下文信息\upcite{DBLP:conf/wmt/MarufMH18}
\parinterval 最近,多模态机器翻译、图像描述、视觉问答[42](Visual Question Answering)等多模态任务受到人工智能领域的广泛关注。如何将多个模态的信息充分融合,是研究多模态任务的重要问题。在自然语言处理领域transformer[43]框架的提出后,被应用到计算机视觉[44]、多模态任务[45,46,47]效果也有显著的提升。另外,数据稀缺是多模态任务受限之处,可以采取数据增强[48,49]的方式缓解。但是,这时仍需要回答在:模型没有充分训练时,图像等模态信息究竟在翻译里发挥了多少作用?类似的问题在篇章级机器翻译中也存在,上下文模型在训练数据量很小的时候对翻译的作用十分微弱(引用李北ACL)。因此,也有必要探究究竟图像等上下文信息如何可以更有效地发挥作用。此外,受到预训练模型的启发,在多模态领域,图像和文本联合预训练[50,51,52]的工作也相继开展,利用transformer框架,通过自注意力机制捕捉图像和文本的隐藏对齐,提升模型性能,同时缓解数据稀缺问题。
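\parinterval Recently, multimodal tasks such as multimodal machine translation, image captioning, and visual question answering\upcite{DBLP:conf/iccv/AntolALMBZP15} (Visual Question Answering) have attracted wide attention in the artificial intelligence community. How to fully fuse information from multiple modalities is a central question for these tasks. Since the Transformer\upcite{vaswani2017attention} framework was proposed in natural language processing, it has been applied to computer vision\upcite{DBLP:conf/eccv/CarionMSUKZ20} and to multimodal tasks\upcite{DBLP:conf/acl/YaoW20,DBLP:journals/tcsv/YuLYH20,Huasong2020SelfAdaptiveNM}, with notable improvements. In addition, data scarcity limits multimodal tasks and can be alleviated by data augmentation\upcite{DBLP:conf/emnlp/GokhaleBBY20,DBLP:conf/eccv/Tang0ZWY20}. One question still has to be answered, however: when the model is not sufficiently trained, how much do modalities such as images actually contribute to translation? A similar question arises in document-level machine translation, where context models contribute very little to translation when the training data is small\upcite{DBLP:conf/acl/LiLWJXZLL20}. It is therefore also necessary to investigate how contextual information such as images can be used more effectively. Furthermore, inspired by pre-trained models, joint image-text pre-training\upcite{DBLP:conf/eccv/Li0LZHZWH0WCG20,DBLP:conf/aaai/ZhouPZHCG20,DBLP:conf/iclr/SuZCLLWD20} has also been explored in the multimodal field; it uses the Transformer framework and captures the latent alignment between images and text through self-attention, which improves model performance while alleviating data scarcity.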
......
......@@ -9824,21 +9824,21 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs200111327,
@article{DBLP:journals/corr/abs200111327,
author = {Idris Abdulmumin and
Bashir Shehu Galadanci and
Abubakar Isa},
title = {Iterative Batch Back-Translation for Neural Machine Translation: {A}
Conceptual Model},
publisher = {CoRR},
journal = {CoRR},
year = {2020}
}
@inproceedings{DBLP:journals/corr/abs200403672,
@article{DBLP:journals/corr/abs200403672,
author = {Zi-Yi Dou and
Antonios Anastasopoulos and
Graham Neubig},
title = {Dynamic Data Selection and Weighting for Iterative Back-Translation},
publisher = {CoRR},
journal = {CoRR},
year = {2020}
}
@inproceedings{DBLP:conf/emnlp/WuZHGQLL19,
......@@ -9854,15 +9854,14 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-1901-09069,
@article{DBLP:journals/corr/abs-1901-09069,
author = {Felipe Almeida and
Geraldo Xex{\'{e}}o},
title = {Word Embeddings: {A} Survey},
publisher = {CoRR},
journal = {CoRR},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-2002-06823,
@article{DBLP:journals/corr/abs-2002-06823,
author = {Jinhua Zhu and
Yingce Xia and
Lijun Wu and
......@@ -9872,7 +9871,7 @@ author = {Zhuang Liu and
Houqiang Li and
Tie-Yan Liu},
title = {Incorporating {BERT} into Neural Machine Translation},
publisher = {International Conference on Learning Representations},
journal = {CoRR},
year = {2020}
}
@inproceedings{song2019mass,
......@@ -9884,13 +9883,13 @@ author = {Zhuang Liu and
title = {{MASS:} Masked Sequence to Sequence Pre-training for Language Generation},
volume = {97},
pages = {5926--5936},
publisher = {International Conference on Machine Learning},
publisher = {{PMLR}},
year = {2019}
}
@inproceedings{DBLP:journals/corr/Ruder17a,
@article{DBLP:journals/corr/Ruder17a,
author = {Sebastian Ruder},
title = {An Overview of Multi-Task Learning in Deep Neural Networks},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1706.05098},
year = {2017}
}
......@@ -9904,18 +9903,7 @@ author = {Zhuang Liu and
title = {Dual Supervised Learning},
volume = {70},
pages = {3789--3798},
publisher = {International Conference on Machine Learning},
year = {2017}
}
@inproceedings{DBLP:conf/iccv/ZhuPIE17,
author = {Jun-Yan Zhu and
Taesung Park and
Phillip Isola and
Alexei A. Efros},
title = {Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial
Networks},
pages = {2242--2251},
publisher = {{IEEE} Computer Society},
publisher = {{PMLR}},
year = {2017}
}
@inproceedings{DBLP:conf/nips/HeXQWYLM16,
......@@ -9958,12 +9946,12 @@ author = {Zhuang Liu and
title = {Analyzing Uncertainty in Neural Machine Translation},
volume = {80},
pages = {3953--3962},
publisher = {International Conference on Machine Learning},
publisher = {{PMLR}},
year = {2018}
}
@inproceedings{finding2006adafre,
author = {S. F. Adafre and Maarten de Rijke},
title = {Finding Similar Sentences across Multiple Languages in Wikipedia},
title = {Finding Similar Sentences across Multiple Languages in Wikipedia },
publisher = {Annual Conference of the European Association for Machine Translation},
year = {2006}
}
......@@ -9973,12 +9961,12 @@ author = {Zhuang Liu and
publisher = {AAAI Conference on Artificial Intelligence},
year = {2008}
}
@inproceedings{DBLP:journals/coling/MunteanuM05,
@article{DBLP:journals/coling/MunteanuM05,
author = {Dragos Stefan Munteanu and
Daniel Marcu},
title = {Improving Machine Translation Performance by Exploiting Non-Parallel
Corpora},
publisher = {Computational Linguistics},
journal = {Computational Linguistics},
volume = {31},
number = {4},
pages = {477--504},
......@@ -10032,9 +10020,9 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{2015OnGulcehre,
@article{2015OnGulcehre,
title = {On Using Monolingual Corpora in Neural Machine Translation},
author = {Gulcehre Caglar and
author = { Gulcehre Caglar and
Firat Orhan and
Xu Kelvin and
Cho Kyunghyun and
......@@ -10042,8 +10030,8 @@ author = {Zhuang Liu and
Lin Huei Chi and
Bougares Fethi and
Schwenk Holger and
Bengio Yoshua},
publisher = {Computer Science},
Bengio Yoshua },
journal = {Computer Science},
year = {2015},
}
@phdthesis{黄书剑0统计机器翻译中的词对齐研究,
......@@ -10052,12 +10040,12 @@ author = {Zhuang Liu and
publisher={南京大学},
year={2012}
}
@inproceedings{DBLP:journals/corr/MikolovLS13,
@article{DBLP:journals/corr/MikolovLS13,
author = {Tomas Mikolov and
Quoc V. Le and
Ilya Sutskever},
title = {Exploiting Similarities among Languages for Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1309.4168},
year = {2013}
}
......@@ -10077,10 +10065,10 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{1966ASchnemann,
@article{1966ASchnemann,
title={A generalized solution of the orthogonal procrustes problem},
author={Schnemann and Peter},
publisher={Psychometrika},
author={Sch{\"o}nemann, Peter H.},
journal={Psychometrika},
volume={31},
number={1},
pages={1-10},
......@@ -10158,12 +10146,12 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/talip/MarieF20,
@article{DBLP:journals/talip/MarieF20,
author = {Benjamin Marie and
Atsushi Fujita},
title = {Iterative Training of Unsupervised Neural and Statistical Machine
Translation Systems},
publisher = {ACM Transactions on Asian and Low-Resource Language Information Processing},
journal = {{ACM} Trans. Asian Low Resour. Lang. Inf. Process.},
volume = {19},
number = {5},
pages = {68:1--68:21},
......@@ -10197,7 +10185,7 @@ author = {Zhuang Liu and
pages = {7057--7067},
year = {2019}
}
@inproceedings{DBLP:journals/ipm/FarhanTAJATT20,
@article{DBLP:journals/ipm/FarhanTAJATT20,
author = {Wael Farhan and
Bashar Talafha and
Analle Abuammar and
......@@ -10206,13 +10194,13 @@ author = {Zhuang Liu and
Ahmad Bisher Tarakji and
Anas Toma},
title = {Unsupervised dialectal neural machine translation},
publisher = {Information Processing \& Management},
journal = {Information Processing \& Management},
volume = {57},
number = {3},
pages = {102181},
year = {2020}
}
@inproceedings{A2020Li,
@article{A2020Li,
title={A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction},
author={Yanyang Li and Yingfeng Luo and Ye Lin and Quan Du and Huizhen Wang and Shujian Huang and Tong Xiao and Jingbo Zhu},
publisher={International Conference on Computational Linguistics},
......@@ -10257,8 +10245,7 @@ author = {Zhuang Liu and
publisher = {AAAI Conference on Artificial Intelligence},
year = {2020}
}
@inproceedings{DBLP:journals/corr/abs-2001-08210,
@article{DBLP:journals/corr/abs-2001-08210,
author = {Yinhan Liu and
Jiatao Gu and
Naman Goyal and
......@@ -10268,13 +10255,10 @@ author = {Zhuang Liu and
Mike Lewis and
Luke Zettlemoyer},
title = {Multilingual Denoising Pre-training for Neural Machine Translation},
publisher = {Transactions of the Association for Computational Linguistics},
volume = {8},
pages = {726--742},
journal = {CoRR},
volume = {abs/2001.08210},
year = {2020}
}
@inproceedings{DBLP:conf/aaai/JiZDZCL20,
author = {Baijun Ji and
Zhirui Zhang and
......@@ -10303,25 +10287,25 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:journals/corr/abs-2009-08088,
@article{DBLP:journals/corr/abs-2009-08088,
author = {Zhen Yang and
Bojie Hu and
Ambyera Han and
Shen Huang and
Qi Ju},
title = {{CSP:} Code-Switching Pre-training for Neural Machine Translation},
pages = {2624--2636},
publisher = {Conference on Empirical Methods in Natural Language Processing},
title = {Code-switching pre-training for neural machine translation},
journal = {CoRR},
volume = {abs/2009.08088},
year = {2020}
}
@inproceedings{DBLP:journals/corr/abs-2010-09403,
@article{DBLP:journals/corr/abs-2010-09403,
author = {Dusan Varis and
Ondrej Bojar},
title = {Unsupervised Pretraining for Neural Machine Translation Using Elastic
Weight Consolidation},
pages = {130--135},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
journal = {CoRR},
volume = {abs/2010.09403},
year = {2020}
}
@inproceedings{DBLP:conf/emnlp/LampleOCDR18,
author = {Guillaume Lample and
......@@ -10334,11 +10318,11 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:journals/jbd/ShortenK19,
@article{DBLP:journals/jbd/ShortenK19,
author = {Connor Shorten and
Taghi M. Khoshgoftaar},
title = {A survey on Image Data Augmentation for Deep Learning},
publisher = {Journal of Big Data},
journal = {J. Big Data},
volume = {6},
pages = {60},
year = {2019}
......@@ -10361,13 +10345,13 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-1811-01124,
@article{DBLP:journals/corr/abs-1811-01124,
author = {Jean Alaux and
Edouard Grave and
Marco Cuturi and
Armand Joulin},
title = {Unsupervised Hyperalignment for Multilingual Word Embeddings},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1811.01124},
year = {2018}
}
......@@ -10466,10 +10450,9 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{hartmann2018empirical,
@article{hartmann2018empirical,
title={Empirical observations on the instability of aligning word vector spaces with GANs},
author={Hartmann, Mareike and Kementchedjhieva, Yova and S{\o}gaard, Anders},
publisher = {openreview.net},
year={2018}
}
@inproceedings{DBLP:conf/emnlp/Kementchedjhieva19,
......@@ -10532,10 +10515,9 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{2019ADabre,
@article{2019ADabre,
title={A Survey of Multilingual Neural Machine Translation},
author={Dabre, Raj and Chu, Chenhui and Kunchukuttan, Anoop },
publisher={ACM Computing Surveys},
year={2019},
}
@inproceedings{DBLP:conf/naacl/ZophK16,
......@@ -10568,17 +10550,17 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:journals/mt/WuW07,
@article{DBLP:journals/mt/WuW07,
author = {Hua Wu and
Haifeng Wang},
title = {Pivot language approach for phrase-based statistical machine translation},
publisher = {Machine Translation},
journal = {Mach. Transl.},
volume = {21},
number = {3},
pages = {165--181},
year = {2007}
}
@inproceedings{Farsi2010somayeh,
@article{Farsi2010somayeh,
author = {Somayeh Bakhshaei and Shahram Khadivi and Noushin Riahi },
title = {Farsi-german statistical machine translation through bridge language},
publisher = {International Telecommunications Symposium},
......@@ -10605,7 +10587,7 @@ author = {Zhuang Liu and
title = {Improving Pivot-Based Statistical Machine Translation by Pivoting
the Co-occurrence Count of Phrase Pairs},
pages = {1665--1675},
publisher = {Annual Meeting of the Association for Computational Linguistics},
publisher = {{ACL}},
year = {2014}
}
@inproceedings{DBLP:conf/acl/MiuraNSTN15,
......@@ -10635,14 +10617,14 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2009}
}
@inproceedings{DBLP:journals/corr/ChengLYSX16,
@article{DBLP:journals/corr/ChengLYSX16,
author = {Yong Cheng and
Yang Liu and
Qian Yang and
Maosong Sun and
Wei Xu},
title = {Neural Machine Translation with Pivot Languages},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1611.04928},
year = {2016}
}
......@@ -10658,7 +10640,7 @@ author = {Zhuang Liu and
@inproceedings{de2006catalan,
title={Catalan-English statistical machine translation without parallel corpus: bridging through Spanish},
author={De Gispert, Adri{\`a} and Marino, Jose B},
publisher={International Conference on Language Resources and Evaluation},
booktitle={Proc. of 5th International Conference on Language Resources and Evaluation (LREC)},
pages={65--68},
year={2006}
}
......@@ -10680,28 +10662,21 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2011}
}
@inproceedings{DBLP:journals/corr/HintonVD15,
@article{DBLP:journals/corr/HintonVD15,
author = {Geoffrey E. Hinton and
Oriol Vinyals and
Jeffrey Dean},
title = {Distilling the Knowledge in a Neural Network},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1503.02531},
year = {2015}
}
@inproceedings{gu2018meta,
author = {Jiatao Gu and
Yong Wang and
Yun Chen and
Victor O. K. Li and
Kyunghyun Cho},
title = {Meta-Learning for Low-Resource Neural Machine Translation},
pages = {3622--3631},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
@article{gu2018meta,
title={Meta-learning for low-resource neural machine translation},
author={Gu, Jiatao and Wang, Yong and Chen, Yun and Cho, Kyunghyun and Li, Victor OK},
journal={arXiv preprint arXiv:1808.08437},
year={2018}
}
@inproceedings{DBLP:conf/naacl/GuHDL18,
author = {Jiatao Gu and
Hany Hassan and
......@@ -10743,11 +10718,11 @@ author = {Zhuang Liu and
publisher = {European Language Resources Association},
year = {2018}
}
@inproceedings{DBLP:journals/tkde/PanY10,
@article{DBLP:journals/tkde/PanY10,
author = {Sinno Jialin Pan and
Qiang Yang},
title = {A Survey on Transfer Learning},
publisher = {IEEE Transactions on knowledge and data engineering},
journal = {IEEE Transactions on knowledge and data engineering},
volume = {22},
number = {10},
pages = {1345--1359},
......@@ -10755,14 +10730,14 @@ author = {Zhuang Liu and
}
@book{2009Handbook,
title={Handbook Of Research On Machine Learning Applications and Trends: Algorithms, Methods and Techniques - 2 Volumes},
author={Olivas, Emilio Soria and Guerrero, Jose David Martin and Sober, Marcelino Martinez and Benedito, Jose Rafael Magdalena and Lopez, Antonio Jose Serrano },
author={ Olivas, Emilio Soria and Guerrero, Jose David Martin and Sober, Marcelino Martinez and Benedito, Jose Rafael Magdalena and Lopez, Antonio Jose Serrano },
publisher={Information Science Reference - Imprint of: IGI Publishing},
year={2009},
}
@incollection{DBLP:books/crc/aggarwal14/Pan14,
author = {Sinno Jialin Pan},
title = {Transfer Learning},
publisher = {Data Classification: Algorithms and Applications},
booktitle = {Data Classification: Algorithms and Applications},
pages = {537--570},
publisher = {{CRC} Press},
year = {2014}
......@@ -10778,22 +10753,16 @@ author = {Zhuang Liu and
publisher = {OpenReview.net},
year = {2019}
}
@inproceedings{platanios2018contextual,
author = {Emmanouil Antonios Platanios and
Mrinmaya Sachan and
Graham Neubig and
Tom M. Mitchell},
title = {Contextual Parameter Generation for Universal Neural Machine Translation},
pages = {425--435},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
@article{platanios2018contextual,
title={Contextual parameter generation for universal neural machine translation},
author={Platanios, Emmanouil Antonios and Sachan, Mrinmaya and Neubig, Graham and Mitchell, Tom},
journal={arXiv preprint arXiv:1808.08493},
year={2018}
}
@inproceedings{ji2020cross,
title={Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation},
author={Ji, Baijun and Zhang, Zhirui and Duan, Xiangyu and Zhang, Min and Chen, Boxing and Luo, Weihua},
publisher={Proceedings of the AAAI Conference on Artificial Intelligence},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={34},
number={01},
pages={115--122},
......@@ -10829,16 +10798,16 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2009}
}
@inproceedings{dabre2019brief,
@article{dabre2019brief,
title={A Brief Survey of Multilingual Neural Machine Translation},
author={Dabre, Raj and Chu, Chenhui and Kunchukuttan, Anoop},
publisher={arXiv preprint arXiv:1905.05395},
journal={arXiv preprint arXiv:1905.05395},
year={2019}
}
@inproceedings{dabre2020survey,
@article{dabre2020survey,
title={A survey of multilingual neural machine translation},
author={Dabre, Raj and Chu, Chenhui and Kunchukuttan, Anoop},
publisher={ACM Computing Surveys},
journal={ACM Computing Surveys (CSUR)},
volume={53},
number={5},
pages={1--38},
......@@ -10874,13 +10843,13 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
}
@inproceedings{DBLP:journals/tacl/LeeCH17,
@article{DBLP:journals/tacl/LeeCH17,
author = {Jason Lee and
Kyunghyun Cho and
Thomas Hofmann},
title = {Fully Character-Level Neural Machine Translation without Explicit
Segmentation},
publisher = {Transactions of the Association for Computational Linguistics},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
pages = {365--378},
year = {2017}
......@@ -10895,13 +10864,13 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2016}
}
@inproceedings{DBLP:journals/corr/HaNW16,
@article{DBLP:journals/corr/HaNW16,
author = {Thanh-Le Ha and
Jan Niehues and
Alexander H. Waibel},
title = {Toward Multilingual Neural Machine Translation with Universal Encoder
and Decoder},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1611.04798},
year = {2016}
}
......@@ -10968,7 +10937,7 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-1903-07091,
@article{DBLP:journals/corr/abs-1903-07091,
author = {Naveen Arivazhagan and
Ankur Bapna and
Orhan Firat and
......@@ -10976,7 +10945,7 @@ author = {Zhuang Liu and
Melvin Johnson and
Wolfgang Macherey},
title = {The Missing Ingredient in Zero-Shot Neural Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1903.07091},
year = {2019}
}
......@@ -10988,27 +10957,19 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{firat2016zero,
author = {Orhan Firat and
Baskaran Sankaran and
Yaser Al-Onaizan and
Fatos T. Yarman-Vural and
Kyunghyun Cho},
title = {Zero-Resource Translation with Multi-Lingual Neural Machine Translation},
pages = {268--277},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
@article{firat2016zero,
title={Zero-resource translation with multi-lingual neural machine translation},
author={Firat, Orhan and Sankaran, Baskaran and Al-Onaizan, Yaser and Vural, Fatos T Yarman and Cho, Kyunghyun},
journal={arXiv preprint arXiv:1606.04164},
year={2016}
}
@inproceedings{DBLP:journals/corr/abs-1805-10338,
@article{DBLP:journals/corr/abs-1805-10338,
author = {Lierni Sestorain and
Massimiliano Ciaramita and
Christian Buck and
Thomas Hofmann},
title = {Zero-Shot Dual Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1805.10338},
year = {2018}
}
......@@ -11088,7 +11049,7 @@ author = {Zhuang Liu and
Yoshua Bengio and
Pierre-Antoine Manzagol},
title = {Extracting and composing robust features with denoising autoencoders},
series = {International Conference on Learning Representations},
series = {{ACM} International Conference Proceeding Series},
volume = {307},
pages = {1096--1103},
publisher = {International Conference on Machine Learning}
......@@ -11102,20 +11063,20 @@ author = {Zhuang Liu and
publisher = {International Conference on Learning Representations},
year = {2018}
}
@inproceedings{DBLP:journals/coling/BhagatH13,
@article{DBLP:journals/coling/BhagatH13,
author = {Rahul Bhagat and
Eduard H. Hovy},
title = {What Is a Paraphrase?},
publisher = {Computational Linguistics},
journal = {Computational Linguistics},
volume = {39},
number = {3},
pages = {463--472},
year = {2013}
}
@inproceedings{2010Generating,
@article{2010Generating,
title={Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods},
author={Madnani, Nitin and Dorr, Bonnie J.},
publisher={Computational Linguistics},
journal={Computational Linguistics},
volume={36},
number={3},
pages={341--387},
......@@ -11148,10 +11109,10 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the European Association for Machine Translation},
year = {2017}
}
@inproceedings{2005Improving,
@article{2005Improving,
title={Improving Machine Translation Performance by Exploiting Non-Parallel Corpora},
author={Munteanu, Dragos Stefan and Marcu, Daniel},
publisher={Computational Linguistics},
journal={Computational Linguistics},
volume={31},
number={4},
pages={477--504},
......@@ -11167,12 +11128,12 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2010}
}
@inproceedings{DBLP:journals/jair/RuderVS19,
@article{DBLP:journals/jair/RuderVS19,
author = {Sebastian Ruder and
Ivan Vulic and
Anders S{\o}gaard},
title = {A Survey of Cross-lingual Word Embedding Models},
publisher = {Journal of Artificial Intelligence Research},
journal = {J. Artif. Intell. Res.},
volume = {65},
pages = {569--631},
year = {2019}
......@@ -11187,14 +11148,14 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2016}
}
@inproceedings{DBLP:journals/tacl/TuLLLL17,
@article{DBLP:journals/tacl/TuLLLL17,
author = {Zhaopeng Tu and
Yang Liu and
Zhengdong Lu and
Xiaohua Liu and
Hang Li},
title = {Context Gates for Neural Machine Translation},
publisher = {Annual Meeting of the Association for Computational Linguistics},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
pages = {87--99},
year = {2017}
......@@ -11214,21 +11175,12 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{ng2019facebook,
author = {Nathan Ng and
Kyra Yee and
Alexei Baevski and
Myle Ott and
Michael Auli and
Sergey Edunov},
title = {Facebook FAIR's {WMT19} News Translation Task Submission},
pages = {314--319},
publisher = {Association for Computational Linguistics},
year = {2019}
@article{ng2019facebook,
title={Facebook FAIR's WMT19 News Translation Task Submission},
author={Ng, Nathan and Yee, Kyra and Baevski, Alexei and Ott, Myle and Auli, Michael and Edunov, Sergey},
journal={arXiv preprint arXiv:1907.06616},
year={2019}
}
@inproceedings{DBLP:conf/wmt/WangLLJZLLXZ18,
author = {Qiang Wang and
Bei Li and
......@@ -11275,9 +11227,7 @@ author = {Zhuang Liu and
publisher = {Conference and Workshop on Neural Information Processing Systems},
year = {2015}
}
@inproceedings{DBLP:journals/corr/abs-1802-05365,
@article{DBLP:journals/corr/abs-1802-05365,
author = {Matthew E. Peters and
Mark Neumann and
Mohit Iyyer and
......@@ -11285,12 +11235,11 @@ author = {Zhuang Liu and
Christopher Clark and
Kenton Lee and
Luke Zettlemoyer},
title = {Deep Contextualized Word Representations},
pages = {2227--2237},
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
title = {Deep contextualized word representations},
journal = {CoRR},
volume = {abs/1802.05365},
year = {2018}
}
@inproceedings{DBLP:conf/icml/CollobertW08,
author = {Ronan Collobert and
Jason Weston},
......@@ -11347,13 +11296,13 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-1908-06259,
@article{DBLP:journals/corr/abs-1908-06259,
author = {Tianyu He and
Xu Tan and
Tao Qin},
title = {Hard but Robust, Easy but Sensitive: How Encoder and Decoder Perform
in Neural Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1908.06259},
year = {2019}
}
......@@ -11378,18 +11327,12 @@ author = {Zhuang Liu and
publisher = {Springer},
year = {1998}
}
@inproceedings{liu2019multi,
author = {Xiaodong Liu and
Pengcheng He and
Weizhu Chen and
Jianfeng Gao},
title = {Multi-Task Deep Neural Networks for Natural Language Understanding},
pages = {4487--4496},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
@article{liu2019multi,
title={Multi-task deep neural networks for natural language understanding},
author={Liu, Xiaodong and He, Pengcheng and Chen, Weizhu and Gao, Jianfeng},
journal={arXiv preprint arXiv:1901.11504},
year={2019}
}
@inproceedings{DBLP:journals/corr/LuongLSVK15,
author = {Minh-Thang Luong and
Quoc V. Le and
......@@ -11408,7 +11351,7 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2016}
}
@inproceedings{DBLP:journals/tacl/JohnsonSLKWCTVW17,
@article{DBLP:journals/tacl/JohnsonSLKWCTVW17,
author = {Melvin Johnson and
Mike Schuster and
Quoc V. Le and
......@@ -11423,19 +11366,19 @@ author = {Zhuang Liu and
Jeffrey Dean},
title = {Google's Multilingual Neural Machine Translation System: Enabling
Zero-Shot Translation},
publisher = {Transactions of the Association for Computational Linguistics},
journal = {Transactions of the Association for Computational Linguistics},
volume = {5},
pages = {339--351},
year = {2017}
}
@inproceedings{DBLP:journals/csl/GulcehreFXCB17,
@article{DBLP:journals/csl/GulcehreFXCB17,
author = {{\c{C}}aglar G{\"{u}}l{\c{c}}ehre and
Orhan Firat and
Kelvin Xu and
Kyunghyun Cho and
Yoshua Bengio},
title = {On integrating a language model into neural machine translation},
publisher = {Computational Linguistics},
journal = {Computer Speech \& Language},
volume = {45},
pages = {137--148},
year = {2017}
......@@ -11503,10 +11446,10 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2013}
}
@inproceedings{imamura2016multi,
@article{imamura2016multi,
title={Multi-domain adaptation for statistical machine translation based on feature augmentation},
author={Imamura, Kenji and Sumita, Eiichiro},
publisher={Association for Machine Translation in the Americas},
journal={Association for Machine Translation in the Americas},
pages={79},
year={2016}
}
......@@ -11529,10 +11472,10 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2010}
}
@inproceedings{shah2012general,
@article{shah2012general,
title={A general framework to weight heterogeneous parallel data for model adaptation in statistical machine translation},
author={Shah, Kashif and Barrault, Lo{\"{\i}}c and Schwenk, Holger},
publisher={Machine Translation Summit},
journal={MT Summit},
year={2012}
}
@inproceedings{DBLP:conf/iwslt/MansourN12,
......@@ -11588,17 +11531,17 @@ author = {Zhuang Liu and
publisher = {International Conference on Computational Linguistics},
year = {2014}
}
@inproceedings{joty2015using,
@article{joty2015using,
title={Using joint models for domain adaptation in statistical machine translation},
author={Durrani, Nadir and Sajjad, Hassan and Joty, Shafiq and Abdelali, Ahmed and Vogel, Stephan},
publisher={Proceedings of MT Summit XV},
journal={Proceedings of MT Summit XV},
pages={117},
year={2015}
}
@inproceedings{chen2016bilingual,
title={Bilingual methods for adaptive training data selection for machine translation},
author={Chen, Boxing and Kuhn, Roland and Foster, George and Cherry, Colin and Huang, Fei},
publisher={Association for Machine Translation in the Americas},
booktitle={Association for Machine Translation in the Americas},
pages={93--103},
year={2016}
}
......@@ -11669,7 +11612,7 @@ author = {Zhuang Liu and
publisher={International Workshop on Spoken Language Translation},
year={2011}
}
@inproceedings{moore2010intelligent,
@article{moore2010intelligent,
title = {Intelligent selection of language model training data},
author = {Moore, Robert C and Lewis, Will},
publisher = {Annual Meeting of the Association for Computational Linguistics},
......@@ -11716,16 +11659,16 @@ author = {Zhuang Liu and
publisher = {International Conference on Computational Linguistics},
year = {2016}
}
@inproceedings{chu2015integrated,
@article{chu2015integrated,
title={Integrated parallel data extraction from comparable corpora for statistical machine translation},
author={Chu, Chenhui},
year={2015},
publisher={Kyoto University}
}
@inproceedings{DBLP:journals/tit/Scudder65a,
@article{DBLP:journals/tit/Scudder65a,
author = {H. J. Scudder III},
title = {Probability of error of some adaptive pattern-recognition machines},
publisher = {{IEEE} Transactions on Information Theory},
journal = {{IEEE} Transactions on Information Theory},
volume = {11},
number = {3},
pages = {363--371},
......@@ -11739,14 +11682,14 @@ author = {Zhuang Liu and
publisher = {International Conference on Computational Linguistics},
year = {2018}
}
@inproceedings{DBLP:journals/corr/abs-1708-08712,
@article{DBLP:journals/corr/abs-1708-08712,
author = {Hassan Sajjad and
Nadir Durrani and
Fahim Dalvi and
Yonatan Belinkov and
Stephan Vogel},
title = {Neural Machine Translation Training in a Multi-Domain Scenario},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1708.08712},
year = {2017}
}
......@@ -11813,7 +11756,7 @@ author = {Zhuang Liu and
@inproceedings{britz2017effective,
title={Effective domain mixing for neural machine translation},
author={Britz, Denny and Le, Quoc and Pryzant, Reid},
publisher={Proceedings of the Second Conference on Machine Translation},
booktitle={Proceedings of the Second Conference on Machine Translation},
pages={118--126},
year={2017}
}
......@@ -11848,21 +11791,21 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:journals/corr/abs-1906-03129,
@article{DBLP:journals/corr/abs-1906-03129,
author = {Shen Yan and
Leonard Dahlmann and
Pavel Petrushkov and
Sanjika Hewavitharana and
Shahram Khadivi},
title = {Word-based Domain Adaptation for Neural Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1906.03129},
year = {2019}
}
@inproceedings{dakwale2017finetuning,
@article{dakwale2017finetuning,
title={Finetuning for neural machine translation with limited degradation across in-and out-of-domain data},
author={Dakwale, Praveen and Monz, Christof},
publisher={Proceedings of the XVI Machine Translation Summit},
journal={Proceedings of the XVI Machine Translation Summit},
volume={117},
year={2017}
}
......@@ -11879,19 +11822,12 @@ author = {Zhuang Liu and
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2019}
}
@inproceedings{barone2017regularization,
author = {Antonio Valerio Miceli Barone and
Barry Haddow and
Ulrich Germann and
Rico Sennrich},
title = {Regularization techniques for fine-tuning in neural machine translation},
pages = {1489--1494},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2017}
@article{barone2017regularization,
title={Regularization techniques for fine-tuning in neural machine translation},
author={Barone, Antonio Valerio Miceli and Haddow, Barry and Germann, Ulrich and Sennrich, Rico},
journal={arXiv preprint arXiv:1707.09920},
year={2017}
}
@inproceedings{DBLP:conf/acl/SaundersB20,
author = {Danielle Saunders and
Bill Byrne},
......@@ -11904,7 +11840,7 @@ author = {Zhuang Liu and
@inproceedings{khayrallah2017neural,
title={Neural lattice search for domain adaptation in machine translation},
author={Khayrallah, Huda and Kumar, Gaurav and Duh, Kevin and Post, Matt and Koehn, Philipp},
publisher={Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
booktitle={Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
pages={20--25},
year={2017}
}
......@@ -11918,11 +11854,11 @@ author = {Zhuang Liu and
publisher = {Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/FreitagA16,
@article{DBLP:journals/corr/FreitagA16,
author = {Markus Freitag and
Yaser Al-Onaizan},
title = {Fast Domain Adaptation for Neural Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1612.06897},
year = {2016}
}
......@@ -11945,10 +11881,10 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@inproceedings{DBLP:journals/ibmrd/Luhn58,
@article{DBLP:journals/ibmrd/Luhn58,
author = {Hans Peter Luhn},
title = {The Automatic Creation of Literature Abstracts},
publisher = {IBM Journal of research and development},
journal = {{IBM} J. Res. Dev.},
volume = {2},
number = {2},
pages = {159--165},
......@@ -12011,7 +11947,7 @@ author = {Zhuang Liu and
publisher = {Annual Conference of the North American Chapter of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-2010-11125,
@article{DBLP:journals/corr/abs-2010-11125,
author = {Angela Fan and
Shruti Bhosale and
Holger Schwenk and
......@@ -12030,7 +11966,7 @@ author = {Zhuang Liu and
Michael Auli and
Armand Joulin},
title = {Beyond English-Centric Multilingual Machine Translation},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/2010.11125},
year = {2020}
}
......@@ -12102,13 +12038,13 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:journals/ejasmp/RadzikowskiNWY19,
@article{DBLP:journals/ejasmp/RadzikowskiNWY19,
author = {Kacper Radzikowski and
Robert Nowak and
Le Wang and
Osamu Yoshie},
title = {Dual supervised learning for non-native speech recognition},
publisher = {EURASIP Journal on Audio, Speech, and Music Processing},
journal = {{EURASIP} J. Audio Speech Music. Process.},
volume = {2019},
pages = {3},
year = {2019}
......@@ -12130,13 +12066,13 @@ author = {Zhuang Liu and
publisher = {{IEEE} Computer Society},
year = {2017}
}
@inproceedings{DBLP:journals/access/DuRZH20,
@article{DBLP:journals/access/DuRZH20,
author = {Liang Du and
Xin Ren and
Peng Zhou and
Zhiguo Hu},
title = {Unsupervised Dual Learning for Feature and Instance Selection},
publisher = {{IEEE} Access},
journal = {{IEEE} Access},
volume = {8},
pages = {170248--170260},
year = {2020}
......@@ -12150,7 +12086,6 @@ author = {Zhuang Liu and
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:conf/nips/YangDYCSL19,
author = {Zhilin Yang and
Zihang Dai and
......@@ -12159,14 +12094,13 @@ author = {Zhuang Liu and
Ruslan Salakhutdinov and
Quoc V. Le},
title = {XLNet: Generalized Autoregressive Pretraining for Language Understanding},
publisher = {Annual Conference on Neural Information Processing Systems},
pages = {5754--5764},
year = {2019}
}
@inproceedings{lewis2019bart,
@article{lewis2019bart,
title={{BART}: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension},
author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Ves and Zettlemoyer, Luke},
publisher={arXiv preprint arXiv:1910.13461},
journal={arXiv preprint arXiv:1910.13461},
year={2019}
}
@inproceedings{DBLP:conf/iclr/LanCGGSS20,
......@@ -12218,7 +12152,7 @@ author = {Zhuang Liu and
publisher = {International Conference on Computer Vision},
year = {2019}
}
@inproceedings{DBLP:journals/corr/abs-2010-12831,
@article{DBLP:journals/corr/abs-2010-12831,
author = {Liunian Harold Li and
Haoxuan You and
Zhecan Wang and
......@@ -12227,7 +12161,7 @@ author = {Zhuang Liu and
Kai-Wei Chang},
title = {Weakly-supervised VisualBERT: Pre-training without Parallel Images
and Captions},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/2010.12831},
year = {2020}
}
......@@ -12277,18 +12211,18 @@ author = {Zhuang Liu and
@inproceedings{shen2020q,
title={Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT},
author={Shen, Sheng and Dong, Zhen and Ye, Jiayu and Ma, Linjian and Yao, Zhewei and Gholami, Amir and Mahoney, Michael W and Keutzer, Kurt},
publisher={AAAI Conference on Artificial Intelligence},
booktitle={AAAI Conference on Artificial Intelligence},
pages={8815--8821},
year={2020}
}
@inproceedings{DBLP:journals/corr/abs-1910-01108,
@article{DBLP:journals/corr/abs-1910-01108,
author = {Victor Sanh and
Lysandre Debut and
Julien Chaumond and
Thomas Wolf},
title = {DistilBERT, a distilled version of {BERT:} smaller, faster, cheaper
and lighter},
publisher = {CoRR},
journal = {CoRR},
volume = {abs/1910.01108},
year = {2019}
}
......@@ -13248,6 +13182,728 @@ author = {Zhuang Liu and
publisher={电子工业出版社},
year={2020}
}
%%%%%%%%%%%%%%%%% Wang Yichao's section, added by Meng Xia %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@inproceedings{DBLP:conf/mm/LinMSYYGZL20,
author = {Huan Lin and
Fandong Meng and
Jinsong Su and
Yongjing Yin and
Zhengyuan Yang and
Yubin Ge and
Jie Zhou and
Jiebo Luo},
title = {Dynamic Context-guided Capsule Network for Multimodal Machine Translation},
pages = {1320--1329},
publisher = {ACM Multimedia},
year = {2020}
}
@inproceedings{DBLP:conf/wmt/SpeciaFSE16,
author = {Lucia Specia and
Stella Frank and
Khalil Sima'an and
Desmond Elliott},
title = {A Shared Task on Multimodal Machine Translation and Crosslingual Image
Description},
pages = {543--553},
publisher = {Conference on Machine Translation},
year = {2016}
}
@inproceedings{DBLP:conf/wmt/ElliottFBBS17,
author = {Desmond Elliott and
Stella Frank and
Lo{\"{\i}}c Barrault and
Fethi Bougares and
Lucia Specia},
title = {Findings of the Second Shared Task on Multimodal Machine Translation
and Multilingual Image Description},
pages = {215--233},
publisher = {Conference on Machine Translation},
year = {2017}
}
@inproceedings{DBLP:conf/wmt/BarraultBSLEF18,
author = {Lo{\"{\i}}c Barrault and
Fethi Bougares and
Lucia Specia and
Chiraag Lala and
Desmond Elliott and
Stella Frank},
title = {Findings of the Third Shared Task on Multimodal Machine Translation},
pages = {304--323},
publisher = {Conference on Machine Translation},
year = {2018}
}
@inproceedings{DBLP:conf/wmt/CaglayanABGBBMH17,
author = {Ozan Caglayan and
Walid Aransa and
Adrien Bardet and
Mercedes Garc{\'{\i}}a-Mart{\'{\i}}nez and
Fethi Bougares and
Lo{\"{\i}}c Barrault and
Marc Masana and
Luis Herranz and
Joost van de Weijer},
title = {{LIUM-CVC} Submissions for {WMT17} Multimodal Translation Task},
pages = {432--439},
publisher = {Conference on Machine Translation},
year = {2017}
}
@inproceedings{DBLP:conf/wmt/LibovickyHTBP16,
author = {Jindrich Libovick{\'{y}} and
Jindrich Helcl and
Marek Tlust{\'{y}} and
Ondrej Bojar and
Pavel Pecina},
title = {{CUNI} System for {WMT16} Automatic Post-Editing and Multimodal Translation
Tasks},
pages = {646--654},
publisher = {Conference on Machine Translation},
year = {2016}
}
@inproceedings{DBLP:conf/emnlp/CalixtoL17,
author = {Iacer Calixto and
Qun Liu},
title = {Incorporating Global Visual Features into Attention-based Neural Machine
Translation},
pages = {992--1003},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2017}
}
@inproceedings{DBLP:conf/wmt/HuangLSOD16,
author = {Po-Yao Huang and
Frederick Liu and
Sz-Rung Shiang and
Jean Oh and
Chris Dyer},
title = {Attention-based Multimodal Neural Machine Translation},
pages = {639--645},
publisher = {Conference on Machine Translation},
year = {2016}
}
@article{Elliott2015MultilingualID,
title={Multilingual Image Description with Neural Sequence Models},
author={Desmond Elliott and
Stella Frank and
Eva Hasler},
journal={arXiv: Computation and Language},
year={2015}
}
@inproceedings{DBLP:conf/wmt/MadhyasthaWS17,
author = {Pranava Swaroop Madhyastha and
Josiah Wang and
Lucia Specia},
title = {Sheffield MultiMT: Using Object Posterior Predictions for Multimodal
Machine Translation},
pages = {470--476},
publisher = {Conference on Machine Translation},
year = {2017}
}
@article{DBLP:journals/corr/CaglayanBB16,
author = {Ozan Caglayan and
Lo{\"{\i}}c Barrault and
Fethi Bougares},
title = {Multimodal Attention for Neural Machine Translation},
journal = {CoRR},
volume = {abs/1609.03976},
year = {2016}
}
@inproceedings{DBLP:conf/acl/CalixtoLC17,
author = {Iacer Calixto and
Qun Liu and
Nick Campbell},
title = {Doubly-Attentive Decoder for Multi-modal Neural Machine Translation},
pages = {1913--1924},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@article{DBLP:journals/corr/DelbrouckD17,
author = {Jean-Benoit Delbrouck and
St{\'{e}}phane Dupont},
title = {Multimodal Compact Bilinear Pooling for Multimodal Neural Machine
Translation},
journal = {CoRR},
volume = {abs/1703.08084},
year = {2017}
}
@inproceedings{DBLP:conf/acl/LibovickyH17,
author = {Jindrich Libovick{\'{y}} and
Jindrich Helcl},
title = {Attention Strategies for Multi-Source Sequence-to-Sequence Learning},
pages = {196--202},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2017}
}
@article{DBLP:journals/corr/abs-1712-03449,
author = {Jean-Benoit Delbrouck and
St{\'{e}}phane Dupont},
title = {Modulating and attending the source image during encoding improves
Multimodal Translation},
journal = {CoRR},
volume = {abs/1712.03449},
year = {2017}
}
@article{DBLP:journals/corr/abs-1807-11605,
author = {Hasan Sait Arslan and
Mark Fishel and
Gholamreza Anbarjafari},
title = {Doubly Attentive Transformer Machine Translation},
journal = {CoRR},
volume = {abs/1807.11605},
year = {2018}
}
@inproceedings{DBLP:conf/wmt/HelclLV18,
author = {Jindrich Helcl and
Jindrich Libovick{\'{y}} and
Dusan Varis},
title = {{CUNI} System for the {WMT18} Multimodal Translation Task},
pages = {616--623},
publisher = {Conference on Machine Translation},
year = {2018}
}
@inproceedings{DBLP:conf/ijcnlp/ElliottK17,
author = {Desmond Elliott and
{\'{A}}kos K{\'{a}}d{\'{a}}r},
title = {Imagination Improves Multimodal Translation},
pages = {130--141},
publisher = {International Joint Conference on Natural Language Processing},
year = {2017}
}
@inproceedings{DBLP:conf/emnlp/ZhouCLY18,
author = {Mingyang Zhou and
Runxiang Cheng and
Yong Jae Lee and
Zhou Yu},
title = {A Visual Attention Grounding Neural Model for Multimodal Machine Translation},
pages = {3643--3653},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2018}
}
@inproceedings{DBLP:conf/acl/CalixtoRA19,
author = {Iacer Calixto and
Miguel Rios and
Wilker Aziz},
title = {Latent Variable Model for Multi-modal Translation},
pages = {6392--6405},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2019}
}
@inproceedings{DBLP:conf/acl/YinMSZYZL20,
author = {Yongjing Yin and
Fandong Meng and
Jinsong Su and
Chulun Zhou and
Zhengyuan Yang and
Jie Zhou and
Jiebo Luo},
title = {A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine
Translation},
pages = {3025--3035},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:conf/acl/YaoW20,
author = {Shaowei Yao and
Xiaojun Wan},
title = {Multimodal Transformer for Multimodal Machine Translation},
pages = {4346--4350},
publisher = {Annual Meeting of the Association for Computational Linguistics},
year = {2020}
}
@inproceedings{DBLP:conf/nips/LuYBP16,
author = {Jiasen Lu and
Jianwei Yang and
Dhruv Batra and
Devi Parikh},
title = {Hierarchical Question-Image Co-Attention for Visual Question Answering},
booktitle = {Conference on Neural Information Processing Systems},
pages = {289--297},
year = {2016}
}
@inproceedings{DBLP:conf/cvpr/VinyalsTBE15,
author = {Oriol Vinyals and
Alexander Toshev and
Samy Bengio and
Dumitru Erhan},
title = {Show and tell: {A} neural image caption generator},
pages = {3156--3164},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2015}
}
@inproceedings{DBLP:conf/icml/XuBKCCSZB15,
author = {Kelvin Xu and
Jimmy Ba and
Ryan Kiros and
Kyunghyun Cho and
Aaron C. Courville and
Ruslan Salakhutdinov and
Richard S. Zemel and
Yoshua Bengio},
title = {Show, Attend and Tell: Neural Image Caption Generation with Visual
Attention},
volume = {37},
pages = {2048--2057},
publisher = {International Conference on Machine Learning},
year = {2015}
}
@inproceedings{DBLP:conf/cvpr/YouJWFL16,
author = {Quanzeng You and
Hailin Jin and
Zhaowen Wang and
Chen Fang and
Jiebo Luo},
title = {Image Captioning with Semantic Attention},
pages = {4651--4659},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2016}
}
@inproceedings{DBLP:conf/cvpr/ChenZXNSLC17,
author = {Long Chen and
Hanwang Zhang and
Jun Xiao and
Liqiang Nie and
Jian Shao and
Wei Liu and
Tat-Seng Chua},
title = {{SCA-CNN:} Spatial and Channel-Wise Attention in Convolutional Networks
for Image Captioning},
pages = {6298--6306},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2017}
}
@article{DBLP:journals/pami/FuJCSZ17,
author = {Kun Fu and
Junqi Jin and
Runpeng Cui and
Fei Sha and
Changshui Zhang},
title = {Aligning Where to See and What to Tell: Image Captioning with Region-Based
Attention and Scene-Specific Contexts},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume = {39},
number = {12},
pages = {2321--2334},
year = {2017}
}
@inproceedings{DBLP:conf/eccv/YaoPLM18,
author = {Ting Yao and
Yingwei Pan and
Yehao Li and
Tao Mei},
title = {Exploring Visual Relationship for Image Captioning},
series = {Lecture Notes in Computer Science},
volume = {11218},
pages = {711--727},
publisher = {European Conference on Computer Vision},
year = {2018}
}
@inproceedings{DBLP:conf/ijcai/LiuSWWY17,
author = {Chang Liu and
Fuchun Sun and
Changhu Wang and
Feng Wang and
Alan L. Yuille},
title = {{MAT:} {A} Multimodal Attentive Translator for Image Captioning},
pages = {4033--4039},
publisher = {International Joint Conference on Artificial Intelligence},
year = {2017}
}
@article{DBLP:journals/corr/abs-1804-02767,
author = {Joseph Redmon and
Ali Farhadi},
title = {YOLOv3: An Incremental Improvement},
journal = {CoRR},
volume = {abs/1804.02767},
year = {2018}
}
@article{DBLP:journals/corr/abs-2004-10934,
author = {Alexey Bochkovskiy and
Chien-Yao Wang and
Hong-Yuan Mark Liao},
title = {YOLOv4: Optimal Speed and Accuracy of Object Detection},
journal = {CoRR},
volume = {abs/2004.10934},
year = {2020}
}
@inproceedings{DBLP:conf/cvpr/LuXPS17,
author = {Jiasen Lu and
Caiming Xiong and
Devi Parikh and
Richard Socher},
title = {Knowing When to Look: Adaptive Attention via a Visual Sentinel for
Image Captioning},
pages = {3242--3250},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2017}
}
@inproceedings{DBLP:conf/cvpr/00010BT0GZ18,
author = {Peter Anderson and
Xiaodong He and
Chris Buehler and
Damien Teney and
Mark Johnson and
Stephen Gould and
Lei Zhang},
title = {Bottom-Up and Top-Down Attention for Image Captioning and Visual Question
Answering},
pages = {6077--6086},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2018}
}
@inproceedings{DBLP:conf/mm/ZhouXKC17,
author = {Luowei Zhou and
Chenliang Xu and
Parker A. Koch and
Jason J. Corso},
title = {Watch What You Just Said: Image Captioning with Text-Conditional Attention},
pages = {305--313},
publisher = {ACM Multimedia},
year = {2017}
}
@article{DBLP:journals/mta/FangWCT18,
author = {Fang Fang and
Hanli Wang and
Yihao Chen and
Pengjie Tang},
title = {Looking deeper and transferring attention for image captioning},
journal = {Multimedia Tools and Applications},
volume = {77},
number = {23},
pages = {31159--31175},
year = {2018}
}
@inproceedings{DBLP:conf/cvpr/AnejaDS18,
author = {Jyoti Aneja and
Aditya Deshpande and
Alexander G. Schwing},
title = {Convolutional Image Captioning},
pages = {5561--5570},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2018}
}
@article{DBLP:journals/corr/abs-1805-09019,
author = {Qingzhong Wang and
Antoni B. Chan},
title = {{CNN+CNN:} Convolutional Decoders for Image Captioning},
journal = {CoRR},
volume = {abs/1805.09019},
year = {2018}
}
@inproceedings{DBLP:conf/eccv/DaiYL18,
author = {Bo Dai and
Deming Ye and
Dahua Lin},
title = {Rethinking the Form of Latent States in Image Captioning},
volume = {11209},
pages = {294--310},
publisher = {European Conference on Computer Vision},
year = {2018}
}
@inproceedings{DBLP:conf/iccv/AntolALMBZP15,
author = {Stanislaw Antol and
Aishwarya Agrawal and
Jiasen Lu and
Margaret Mitchell and
Dhruv Batra and
C. Lawrence Zitnick and
Devi Parikh},
title = {{VQA:} Visual Question Answering},
pages = {2425--2433},
publisher = {International Conference on Computer Vision},
year = {2015}
}
@inproceedings{DBLP:conf/eccv/CarionMSUKZ20,
author = {Nicolas Carion and
Francisco Massa and
Gabriel Synnaeve and
Nicolas Usunier and
Alexander Kirillov and
Sergey Zagoruyko},
title = {End-to-End Object Detection with Transformers},
volume = {12346},
pages = {213--229},
publisher = {European Conference on Computer Vision},
year = {2020}
}
@article{DBLP:journals/tcsv/YuLYH20,
author = {Jun Yu and
Jing Li and
Zhou Yu and
Qingming Huang},
title = {Multimodal Transformer With Multi-View Visual Representation for Image
Captioning},
journal = {IEEE Transactions on Circuits and Systems for Video Technology},
volume = {30},
number = {12},
pages = {4467--4480},
year = {2020}
}
@article{Huasong2020SelfAdaptiveNM,
title={Self-Adaptive Neural Module Transformer for Visual Question Answering},
author={Huasong Zhong and Jingyuan Chen and Chen Shen and Hanwang Zhang and Jianqiang Huang and Xian-Sheng Hua},
journal={IEEE Transactions on Multimedia},
year={2020},
pages={1-1}
}
@inproceedings{DBLP:conf/emnlp/GokhaleBBY20,
author = {Tejas Gokhale and
Pratyay Banerjee and
Chitta Baral and
Yezhou Yang},
title = {{MUTANT:} {A} Training Paradigm for Out-of-Distribution Generalization
in Visual Question Answering},
pages = {878--892},
publisher = {Conference on Empirical Methods in Natural Language Processing},
year = {2020}
}
@inproceedings{DBLP:conf/eccv/Tang0ZWY20,
author = {Ruixue Tang and
Chao Ma and
Wei Emma Zhang and
Qi Wu and
Xiaokang Yang},
title = {Semantic Equivalent Adversarial Data Augmentation for Visual Question
Answering},
volume = {12364},
pages = {437--453},
publisher = {European Conference on Computer Vision},
year = {2020}
}
@inproceedings{DBLP:conf/eccv/Li0LZHZWH0WCG20,
author = {Xiujun Li and
Xi Yin and
Chunyuan Li and
Pengchuan Zhang and
Xiaowei Hu and
Lei Zhang and
Lijuan Wang and
Houdong Hu and
Li Dong and
Furu Wei and
Yejin Choi and
Jianfeng Gao},
title = {Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks},
volume = {12375},
pages = {121--137},
publisher = {European Conference on Computer Vision},
year = {2020}
}
@inproceedings{DBLP:conf/aaai/ZhouPZHCG20,
author = {Luowei Zhou and
Hamid Palangi and
Lei Zhang and
Houdong Hu and
Jason J. Corso and
Jianfeng Gao},
title = {Unified Vision-Language Pre-Training for Image Captioning and {VQA}},
pages = {13041--13049},
publisher = {AAAI Conference on Artificial Intelligence},
year = {2020}
}
@inproceedings{DBLP:conf/iclr/SuZCLLWD20,
author = {Weijie Su and
Xizhou Zhu and
Yue Cao and
Bin Li and
Lewei Lu and
Furu Wei and
Jifeng Dai},
title = {{VL-BERT:} Pre-training of Generic Visual-Linguistic Representations},
publisher = {International Conference on Learning Representations},
year = {2020}
}
@inproceedings{DBLP:conf/nips/GoodfellowPMXWOCB14,
author = {Ian J. Goodfellow and
Jean Pouget-Abadie and
Mehdi Mirza and
Bing Xu and
David Warde-Farley and
Sherjil Ozair and
Aaron C. Courville and
Yoshua Bengio},
title = {Generative Adversarial Nets},
publisher = {Conference on Neural Information Processing Systems},
pages = {2672--2680},
year = {2014}
}
@inproceedings{DBLP:conf/nips/ZhuZPDEWS17,
author = {Jun-Yan Zhu and
Richard Zhang and
Deepak Pathak and
Trevor Darrell and
Alexei A. Efros and
Oliver Wang and
Eli Shechtman},
title = {Toward Multimodal Image-to-Image Translation},
publisher = {Conference on Neural Information Processing Systems},
pages = {465--476},
year = {2017}
}
@article{DBLP:journals/corr/abs-1908-06616,
author = {Hajar Emami and
Majid Moradi Aliabadi and
Ming Dong and
Ratna Babu Chinnam},
title = {{SPA-GAN:} Spatial Attention {GAN} for Image-to-Image Translation},
journal = {CoRR},
volume = {abs/1908.06616},
year = {2019}
}
@article{DBLP:journals/access/XiongWG19,
author = {Feng Xiong and
Qianqian Wang and
Quanxue Gao},
title = {Consistent Embedded {GAN} for Image-to-Image Translation},
journal = {IEEE Access},
volume = {7},
pages = {126651--126661},
year = {2019}
}
@inproceedings{DBLP:conf/iccv/ZhuPIE17,
author = {Jun-Yan Zhu and
Taesung Park and
Phillip Isola and
Alexei A. Efros},
title = {Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial
Networks},
pages = {2242--2251},
publisher = {International Conference on Computer Vision},
year = {2017}
}
@inproceedings{DBLP:conf/iccv/YiZTG17,
author = {Zili Yi and
Hao (Richard) Zhang and
Ping Tan and
Minglun Gong},
title = {{DualGAN:} Unsupervised Dual Learning for Image-to-Image Translation},
pages = {2868--2876},
publisher = {International Conference on Computer Vision},
year = {2017}
}
@inproceedings{DBLP:conf/nips/LiuBK17,
author = {Ming-Yu Liu and
Thomas Breuel and
Jan Kautz},
title = {Unsupervised Image-to-Image Translation Networks},
publisher = {Conference on Neural Information Processing Systems},
pages = {700--708},
year = {2017}
}
@inproceedings{DBLP:conf/cvpr/IsolaZZE17,
author = {Phillip Isola and
Jun-Yan Zhu and
Tinghui Zhou and
Alexei A. Efros},
title = {Image-to-Image Translation with Conditional Adversarial Networks},
pages = {5967--5976},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2017}
}
@inproceedings{DBLP:conf/icml/ReedAYLSL16,
author = {Scott E. Reed and
Zeynep Akata and
Xinchen Yan and
Lajanugen Logeswaran and
Bernt Schiele and
Honglak Lee},
title = {Generative Adversarial Text to Image Synthesis},
volume = {48},
pages = {1060--1069},
publisher = {International Conference on Machine Learning},
year = {2016}
}
@article{DBLP:journals/corr/DashGALA17,
author = {Ayushman Dash and
John Cristian Borges Gamboa and
Sheraz Ahmed and
Marcus Liwicki and
Muhammad Zeshan Afzal},
title = {{TAC-GAN} - Text Conditioned Auxiliary Classifier Generative Adversarial
Network},
journal = {CoRR},
volume = {abs/1703.06412},
year = {2017}
}
@inproceedings{DBLP:conf/nips/ReedAMTSL16,
author = {Scott E. Reed and
Zeynep Akata and
Santosh Mohan and
Samuel Tenka and
Bernt Schiele and
Honglak Lee},
title = {Learning What and Where to Draw},
publisher = {Conference on Neural Information Processing Systems},
pages = {217--225},
year = {2016}
}
@inproceedings{DBLP:conf/cvpr/ZhangXY18,
author = {Zizhao Zhang and
Yuanpu Xie and
Lin Yang},
title = {Photographic Text-to-Image Synthesis With a Hierarchically-Nested
Adversarial Network},
pages = {6199--6208},
publisher = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2018}
}
%%%%% chapter 17------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%