Generative Reading Comprehension via Multi-task Learning
QIAN Jin1, HUANG Rongtao1, ZOU Bowei1,2, HONG Yu1
1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China;
2. Institute for Infocomm Research, 138632, Singapore
Abstract Generative reading comprehension is a novel and challenging task in machine reading comprehension. In contrast to mainstream extractive reading comprehension, a generative reading comprehension model aims to combine the question and the paragraph to generate a natural and complete statement as the answer. To capture both the boundary information of answers in paragraphs and question type information, this paper proposes a generative reading comprehension model based on multi-task learning. In the training phase, the model takes answer generation as the main task and treats answer extraction and question classification as auxiliary tasks, jointly learning and optimizing the parameters of the shared encoding layer. In the test phase, the trained encoding layer is loaded to decode and generate answers. Experimental results show that the answer extraction and question classification tasks effectively improve the performance of the generative reading comprehension model.
Received: 21 February 2021
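
To make the multi-task training setup concrete, the following is a minimal PyTorch sketch, not the paper's actual implementation: a shared encoding layer feeds three heads, with answer generation as the main task and answer extraction (span boundaries) and question classification as auxiliary tasks whose losses are summed during training. The layer sizes, loss weight, head designs, and the use of a plain Transformer in place of a pretrained unified language model are all simplifying assumptions made for illustration.

import torch
import torch.nn as nn

class MultiTaskMRC(nn.Module):
    def __init__(self, vocab_size=30522, hidden=768, num_question_types=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # Shared encoding layer, jointly optimized by all three tasks.
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=6)
        # Main task: answer generation (decoder predicts answer tokens).
        dec_layer = nn.TransformerDecoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=6)
        self.gen_head = nn.Linear(hidden, vocab_size)
        # Auxiliary task 1: answer extraction (start/end boundary scores per token).
        self.span_head = nn.Linear(hidden, 2)
        # Auxiliary task 2: question type classification.
        self.cls_head = nn.Linear(hidden, num_question_types)

    def forward(self, src_ids, tgt_ids):
        memory = self.encoder(self.embed(src_ids))                 # (B, L, H)
        gen_logits = self.gen_head(self.decoder(self.embed(tgt_ids), memory))  # (B, T, V)
        span_logits = self.span_head(memory)                       # (B, L, 2)
        cls_logits = self.cls_head(memory[:, 0])                   # first-token pooling
        return gen_logits, span_logits, cls_logits

def multitask_loss(model, batch, aux_weight=0.5):
    # aux_weight is an assumed hyperparameter balancing auxiliary tasks.
    gen_logits, span_logits, cls_logits = model(batch["src_ids"], batch["tgt_in"])
    ce = nn.CrossEntropyLoss()
    loss_gen = ce(gen_logits.transpose(1, 2), batch["tgt_out"])    # main: generation
    loss_start = ce(span_logits[..., 0], batch["start_pos"])       # auxiliary: extraction
    loss_end = ce(span_logits[..., 1], batch["end_pos"])
    loss_cls = ce(cls_logits, batch["q_type"])                     # auxiliary: classification
    return loss_gen + aux_weight * (loss_start + loss_end + loss_cls)

At test time, only the shared encoder and the generation decoder would be kept to produce answers, mirroring the paper's description of loading the trained encoding layer for decoding.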