Abstract
Knowledge-grounded dialogue aims to generate informative responses with the help of external knowledge, and mainly involves two sub-tasks: query generation for knowledge retrieval and knowledge-grounded response generation. How to effectively generate retrieval queries and efficiently exploit the retrieved knowledge for response generation remains challenging. To address these problems, this paper presents a knowledge-grounded dialogue system based on a cascading architecture and multi-model fusion. For query generation, a cascade decoupling strategy is proposed to retrieve knowledge with high precision: the query generation task is divided into a knowledge-retrieval discrimination task and a retrieval query generation task. For knowledge-grounded response generation, to improve the consistency and diversity of the dialogue, the system is first pre-trained on dialogue data and then fine-tuned with a variety of dialogue training strategies, yielding several high-quality dialogue generation models. Finally, a mutual-voting re-ranking strategy is proposed to select the best response from those produced by the different models. The system ranked 1st in the automatic evaluation and 3rd in the human evaluation of the "2022 Language and Intelligence Technology Competition: Knowledge-grounded Dialogue Track".
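To make the mutual-voting re-ranking strategy concrete, the sketch below shows one plausible reading of the idea: every candidate response produced by the different dialogue models "votes" for the others, and the response with the highest average peer support is returned. The abstract does not specify the voting function, so the character-overlap F1 metric and the names char_f1 and mutual_vote_rerank used here are illustrative assumptions, not the authors' implementation.

# Minimal sketch of mutual-voting re-ranking over candidate responses.
# Assumption: peer support is measured with character-overlap F1, a common
# surface metric for Chinese text; the paper may use a different scorer.
from collections import Counter
from typing import List


def char_f1(a: str, b: str) -> float:
    """Character-overlap F1 between two strings."""
    if not a or not b:
        return 0.0
    common = sum((Counter(a) & Counter(b)).values())
    if common == 0:
        return 0.0
    precision = common / len(a)
    recall = common / len(b)
    return 2 * precision * recall / (precision + recall)


def mutual_vote_rerank(candidates: List[str]) -> str:
    """Return the candidate most supported (on average) by its peers."""
    assert candidates, "need at least one candidate response"
    scores = []
    for i, cand in enumerate(candidates):
        votes = [char_f1(cand, other) for j, other in enumerate(candidates) if j != i]
        scores.append(sum(votes) / max(len(votes), 1))
    best_idx = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_idx]


if __name__ == "__main__":
    # Hypothetical responses from different dialogue models for the same context:
    # the two mutually consistent, knowledge-bearing replies support each other,
    # so the off-topic reply receives little support and is not selected.
    responses = [
        "周杰伦1979年出生于台湾新北市。",
        "他1979年1月18日出生在台湾新北市。",
        "我也很喜欢他的歌。",
    ]
    print(mutual_vote_rerank(responses))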
Key words
knowledge retrieval /
knowledge-grounded dialogue /
re-ranking
Funding
National Natural Science Foundation of China (61976016, 61976015, 61876198); National Key Research and Development Program of China (2020AAA0108001)