Journal of Chinese Information Processing ›› 2019, Vol. 33 ›› Issue (6): 1-11.

Review

Research Progress of Attention Mechanism in Deep Learning

ZHU Zhangli1, RAO Yuan1, WU Yuan1, QI Jiangnan1, ZHANG Yu2

Abstract

The attention mechanism has gradually become one of the mainstream methods and research hotspots in deep learning. By improving the way the source language is represented and dynamically selecting the relevant source-language information during decoding, it greatly alleviates the deficiencies of the classic Encoder-Decoder framework. Starting from the problems of the traditional Encoder-Decoder framework, such as its limited long-range memory, the interrelationships involved in sequence transformation, and the output quality of dynamically structured models, this paper describes the definition and principle of the attention mechanism, introduces a variety of classification schemes, analyzes the current state of research, and surveys the applications of attention in important fields such as image recognition, speech recognition, and natural language processing. It then discusses multi-modal attention, evaluation mechanisms for attention, model interpretability, and the integration of attention with new models, providing new research clues and directions for the application of the attention mechanism in deep learning.
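To make the mechanism summarized above concrete, the following Python sketch (not taken from the paper) illustrates the basic idea of additive, Bahdanau-style attention in an Encoder-Decoder model: at each decoding step, the decoder state is scored against every encoder state, and the softmax-normalized scores weight a context vector drawn from the source. All function names, dimensions, and the random parameters are illustrative assumptions.

# Minimal illustrative sketch of additive attention for one decoding step.
# Not the paper's implementation; shapes and parameters are placeholders.
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    # decoder_state:  (d_dec,)        current decoder hidden state
    # encoder_states: (T_src, d_enc)  one vector per source position
    # W_dec, W_enc, v: learned projections (random placeholders here)
    # score_t = v^T tanh(W_dec s + W_enc h_t) for every source position t
    scores = np.tanh(decoder_state @ W_dec + encoder_states @ W_enc) @ v
    weights = softmax(scores)            # (T_src,), sums to 1
    context = weights @ encoder_states   # (d_enc,), weighted sum of source states
    return context, weights

# Toy usage with random "learned" parameters, purely for illustration.
rng = np.random.default_rng(0)
T_src, d_enc, d_dec, d_att = 5, 8, 8, 16
encoder_states = rng.normal(size=(T_src, d_enc))
decoder_state = rng.normal(size=(d_dec,))
W_dec = rng.normal(size=(d_dec, d_att))
W_enc = rng.normal(size=(d_enc, d_att))
v = rng.normal(size=(d_att,))

context, weights = additive_attention(decoder_state, encoder_states, W_dec, W_enc, v)
print("attention weights:", np.round(weights, 3))  # which source positions are attended to
print("context vector shape:", context.shape)

The weights show how the decoder dynamically redistributes its focus over the source at each step, which is exactly the property that distinguishes attention-based models from the fixed-length context vector of the original Encoder-Decoder framework.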

Keywords

deep learning / attention mechanism / Encoder-Decoder

Cite this article

ZHU Zhangli, RAO Yuan, WU Yuan, QI Jiangnan, ZHANG Yu. Research Progress of Attention Mechanism in Deep Learning. Journal of Chinese Information Processing. 2019, 33(6): 1-11


Funding

National Natural Science Foundation of China (61741208); Ministry of Education Fund for "Cloud and Data Fusion" (2017B00030); Fundamental Research Funds for the Central Universities (zdyf2017006); Shaanxi Provincial Collaborative Innovation Program (2015XT-21); Science and Technology Innovation Program of Beilin District, Xi'an (GX1803); Science and Technology Research Project of Shaanxi Tobacco Company (ST2017-R011); Special Guiding Fund for Building World-Class Universities (Disciplines) and Characteristic Development of the Central Universities (PY3A022); Shenzhen Science and Technology Project (JCYJ20180306170836595)