Abstract
Automatic question generation (QG) is the task of producing, for a given target answer within a given context, the corresponding interrogative sentence. It is one of the more challenging tasks in natural language processing, as it places high demands on reliable semantic encoding and decoding. Pre-trained language models have been widely adopted across NLP tasks with strong results; following this trend, this paper applies the pre-trained language model UNILM within the encoder-decoder framework for question generation and focuses on its adaptation to the task. To address the "exposure bias" and "mask heterogeneity" problems that frequently arise in the model's decoding phase, we study noise-aware training and transfer learning, respectively, to improve UNILM's adaptability to question generation. Experiments on SQuAD show that both noise-aware training and transfer learning improve UNILM's decoding performance: the best model reaches 20.31% and 21.95% BLEU-4 on the answer-aware splits split1 and split2, respectively, and 17.90% BLEU-4 on split1 in the answer-agnostic setting.
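The abstract attributes part of the improvement to noise-aware training, which counters exposure bias by letting the decoder condition on corrupted gold prefixes during fine-tuning instead of only on clean ones. The paper's exact procedure is not reproduced here; the following is a minimal, hypothetical Python sketch of one common form of token-level noise injection (the function name, the noise_prob value, and the use of uniformly sampled replacement tokens are illustrative assumptions, not the authors' implementation).

import random

def inject_token_noise(target_ids, vocab_size, noise_prob=0.15, protected_ids=frozenset()):
    """Randomly replace a fraction of gold target token ids with random vocabulary ids.

    During fine-tuning, the corrupted sequence is fed as the decoder-side context
    while the loss is still computed against the original gold tokens, so the model
    learns to keep generating from imperfect prefixes, which is closer to what it
    faces at inference time (one way to mitigate exposure bias).
    """
    noisy = list(target_ids)
    for i, tok in enumerate(noisy):
        if tok in protected_ids:          # keep special tokens such as [CLS]/[SEP]/[MASK]
            continue
        if random.random() < noise_prob:  # corrupt this position with a random token id
            noisy[i] = random.randrange(vocab_size)
    return noisy

# Usage sketch: corrupt a toy target sequence, keeping ids 101 and 102
# ([CLS]/[SEP] in standard BERT vocabularies) intact.
# gold  = [101, 2054, 2003, 1996, 3007, 1029, 102]
# noisy = inject_token_noise(gold, vocab_size=30522, noise_prob=0.2, protected_ids={101, 102})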
Key words
question generation /
exposure bias /
question-answering dataset /
transfer learning
Funding
National Natural Science Foundation of China (62076174); Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX20_1064)