LI Shaobo, SUN Chengjie, XU Zhen, LIU Bingquan, JI Zhenzhou, WANG Mingjiang. Knowledge Copying Mechanism for Dialog Generation[J]. Journal of Chinese Information Processing, 2021, 35(2): 107-115.
Knowledge Copying Mechanism for Dialog Generation
LI Shaobo, SUN Chengjie, XU Zhen, LIU Bingquan, JI Zhenzhou, WANG Mingjiang
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Abstract: Generative end-to-end dialog models tend to produce monotonous and uninformative responses, which remains a challenge in dialog technology. As a highly structured knowledge source, a knowledge graph can provide relevant information and topic-transfer relationships that are essential to sustaining a dialog. This paper proposes a generative dialog model based on knowledge graphs. A knowledge-graph-based mapping mechanism first processes the dialog; a copying mechanism then introduces words contained in the knowledge graph directly into the generated responses, while an attention mechanism uses the information in the knowledge graph to guide response generation. On the "Knowledge-Driven Dialogue" dataset released for the 2019 Language and Intelligence Competition, the proposed model outperforms the baseline provided by the competition organizer by 10.47% in character-level F1 and 4.6% in DISTINCT-1.
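For illustration only, the sketch below shows one way a copy-augmented decoding step of the kind described in the abstract can combine a standard vocabulary distribution with a distribution over knowledge-graph tokens. It is a minimal NumPy sketch, not the authors' implementation; the function and parameter names (copy_augmented_step, W_vocab, W_copy, w_gate) are hypothetical and are not taken from the paper.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def copy_augmented_step(hidden, W_vocab, knowledge_states, knowledge_token_ids,
                        W_copy, w_gate, vocab_size):
    # Illustrative sketch of a copy-augmented decoding step (not the paper's code).
    # hidden: decoder hidden state at the current step, shape (d,)
    # W_vocab: projection to vocabulary logits, shape (vocab_size, d)
    # knowledge_states: encoded knowledge-graph tokens, shape (k, d)
    # knowledge_token_ids: vocabulary ids of those knowledge tokens, shape (k,)
    # W_copy: bilinear map scoring knowledge tokens against the hidden state, shape (d, d)
    # w_gate: parameters of the generate-vs-copy gate, shape (d,)

    # Ordinary generation distribution over the whole vocabulary.
    p_vocab = softmax(W_vocab @ hidden)

    # Attention-style copy scores over the knowledge tokens.
    copy_scores = knowledge_states @ (W_copy @ hidden)
    p_knowledge = softmax(copy_scores)

    # Scatter the copy distribution back into vocabulary space
    # (repeated token ids accumulate their probability mass).
    p_copy = np.zeros(vocab_size)
    np.add.at(p_copy, knowledge_token_ids, p_knowledge)

    # Soft gate choosing between generating and copying.
    p_gen = 1.0 / (1.0 + np.exp(-(w_gate @ hidden)))

    # Final next-token distribution is a mixture of the two modes.
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy

At each step the gate p_gen decides how much probability mass goes to ordinary generation versus copying a knowledge token, so entities from the knowledge graph can appear verbatim in the response even when they are rare in the training vocabulary.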