A Survey of Natural Language Generation in Task-Oriented Dialogue System

QIN Libo, LI Zhouyang, LOU Jieming, YU Qiying, CHE Wanxiang

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 1: 1-11, 20.

Survey

Abstract

Natural language generation in task-oriented dialogue systems (ToDNLG) aims to convert the system's dialogue acts into natural language responses, and it has attracted increasing research interest. With the development of deep neural networks and the rise of pre-trained language models, ToDNLG research has achieved major breakthroughs. However, a comprehensive review of existing methods and recent trends is still missing. To fill this gap, this paper surveys the latest advances and frontiers of ToDNLG, including: (1) a systematic review: we summarize the development of ToDNLG over the past decade, covering both methods from the pre-neural era and deep learning-based approaches; (2) frontiers and challenges: we summarize emerging areas, such as complex ToDNLG, and their corresponding challenges; (3) rich open-source resources: we collect and organize the related papers, baseline codes, and leaderboards on a public website, so that ToDNLG researchers can directly keep track of the latest progress. We hope this survey can promote future research in ToDNLG.
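To make the task concrete, below is a minimal illustrative sketch of the template-based generation that the survey covers among traditional (pre-neural) methods: a dialogue act produced by the dialogue policy, i.e., an act type plus slot-value pairs, is rendered into a response by filling a hand-written template. The act, slot names, and templates are hypothetical examples, not drawn from the paper or any particular dataset.

```python
# Illustrative sketch of template-based ToDNLG: mapping a dialogue act
# (act type + slot-value pairs from the dialogue policy) to a natural
# language response. All acts, slots, and templates here are hypothetical.

# Hand-written templates keyed by (act_type, sorted slot names);
# placeholders are filled in (lexicalized) at generation time.
TEMPLATES = {
    ("inform", ("food", "name")): "{name} is a nice restaurant serving {food} food.",
    ("request", ("area",)): "Which area of town are you looking for?",
}

def generate(act_type: str, slots: dict) -> str:
    """Render a dialogue act as a natural language response via a template."""
    key = (act_type, tuple(sorted(slots)))
    if key not in TEMPLATES:
        raise KeyError(f"no template for dialogue act {act_type}{tuple(slots)}")
    return TEMPLATES[key].format(**slots)

print(generate("inform", {"name": "Seven Days", "food": "Chinese"}))
# -> Seven Days is a nice restaurant serving Chinese food.
```

Neural approaches surveyed in the paper, such as semantically conditioned LSTM generators, replace this hand-written template table with a learned decoder conditioned on a dialogue-act representation, trading some controllability for fluency and coverage.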

Key words

task-oriented dialogue system / natural language generation module / pre-trained model

Cite this article

QIN Libo, LI Zhouyang, LOU Jieming, YU Qiying, CHE Wanxiang. A Survey of Natural Language Generation in Task-Oriented Dialogue System. Journal of Chinese Information Processing, 2022, 36(1): 1-11, 20.


Funding

National Key R&D Program of China (2020AAA0106501); National Natural Science Foundation of China (61976072, 61772153)