1.MOE Key Laboratory of Computational Linguistics, Peking University, Beijing 100871, China; 2.Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 100101, China
Question generation (QG) aims to automatically generate fluent, semantically relevant questions for a given text. QG can be applied to generate questions for reading comprehension tests in education, and to enhance question answering and dialog systems. This paper presents a comprehensive survey of research on QG. We first describe the significance of QG and its applications, especially in education. We then outline traditional rule-based methods for QG and describe neural network based models in detail from several perspectives. We also introduce evaluation metrics for generated questions. Finally, we discuss the limitations of previous studies and suggest directions for future work.
WU Yunfang, ZHANG Yangsen. A Survey of Question Generation. Journal of Chinese Information Processing, 2021, 35(7): 1-9.