Abstract
Dialogue is an interactive process, and response selection aims to choose an appropriate response given the preceding dialogue context; it is a popular research topic in natural language processing. Existing work has achieved some success but still suffers from two salient problems: 1) the correlation between the dialogue history and the candidate responses is not fully exploited, and 2) the latent semantic information in the dialogue history is insufficiently mined. For the first problem, this paper considers the dialogue history and the candidate responses jointly, using a cross-attention mechanism to softly align the two and thereby effectively capture the correlation between them. For the second problem, it employs a multi-head self-attention mechanism to capture the latent semantics of the dialogue history from multiple perspectives, and a highway network to bridge the various information streams, mining semantics deeply while preserving information integrity. Comparative experiments on the Ubuntu Corpus V1 dataset demonstrate the effectiveness of the method: the model achieves an 88.66% R10@1 score, a 90.06% R10@2 score, and a 95.15% R10@5 score.
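The two core components named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a generic NumPy illustration of (a) cross-attention as soft alignment between history and response token vectors, and (b) a highway layer whose transform gate blends a nonlinear transform with the untouched input, which is what allows information to pass through intact. All shapes, weight names, and the tanh/sigmoid choices are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(history, response):
    """Softly align two token-vector sequences.

    history:  (m, d) array, one row per history token
    response: (n, d) array, one row per candidate-response token
    Returns each sequence re-expressed as attention-weighted mixtures
    of the other, which exposes their correlation.
    """
    d = history.shape[-1]
    scores = history @ response.T / np.sqrt(d)          # (m, n) similarity
    aligned_history = softmax(scores, axis=1) @ response   # (m, d)
    aligned_response = softmax(scores.T, axis=1) @ history  # (n, d)
    return aligned_history, aligned_response

def highway(x, W_h, b_h, W_t, b_t):
    """One highway layer: y = T * H(x) + (1 - T) * x.

    The carry term (1 - T) * x is what keeps the original
    information intact when the gate T stays near zero.
    """
    H = np.tanh(x @ W_h + b_h)                      # transform
    T = 1.0 / (1.0 + np.exp(-(x @ W_t + b_t)))      # gate in (0, 1)
    return T * H + (1.0 - T) * x
```

Initializing the gate bias `b_t` to a large negative value is the standard trick from the highway-network paper: the layer then starts out close to the identity, so stacking several layers does not destroy the input signal early in training.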
Keywords
response selection /
cross-attention mechanism /
self-attention mechanism /
highway network
Funding
National Natural Science Foundation of China (61876118, 61751206)