杜家驹,叶德铭,孙茂松. 中文开放域问答系统数据增广研究[J]. 中文信息学报, 2022, 36(11): 121-130.
DU Jiaju, YE Deming, SUN Maosong. Data Augmentation in Chinese Open-domain Question Answering[J]. Journal of Chinese Information Processing, 2022, 36(11): 121-130.
Abstract: Open-domain Question Answering (OpenQA) is an important task in natural language processing. However, OpenQA models tend to rely on superficial lexical matching between questions and documents and often make obvious errors on easy questions. Part of the reason for these errors is that reading comprehension datasets lack certain patterns that are common in real-world scenarios. To mitigate the effect of this mismatch, we propose several data augmentation methods to improve the robustness of OpenQA models. In addition, we build a new dataset to evaluate how models perform in real-world settings. The experimental results show that the proposed methods improve the performance of OpenQA models on this dataset.
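The abstract does not spell out the augmentation methods themselves. As a purely illustrative sketch (not the authors' actual procedure), one common way to discourage superficial question-document matching is to augment training data with distractor passages that share surface words with a question but do not contain its answer, labelled as unanswerable. The `QAExample` structure, the overlap threshold, and the toy corpus below are all assumptions introduced for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class QAExample:
    question: str
    passage: str
    answer: str          # empty string means "no answer in this passage"
    is_answerable: bool


def lexical_overlap(a: str, b: str) -> float:
    """Crude word-overlap score between two texts (lowercased, punctuation stripped)."""
    ta = set(a.lower().replace("?", " ").replace(".", " ").split())
    tb = set(b.lower().replace("?", " ").replace(".", " ").split())
    return len(ta & tb) / max(1, len(ta | tb))


def make_negative_examples(examples: List[QAExample],
                           corpus: List[str],
                           per_question: int = 1,
                           seed: int = 0) -> List[QAExample]:
    """For each answerable example, attach passages that share surface words with
    the question but do not contain the gold answer, labelled unanswerable.
    Such negatives penalise a reader that relies only on lexical matching."""
    rng = random.Random(seed)
    negatives = []
    for ex in examples:
        # Candidate distractors: lexically similar passages without the answer span.
        candidates = [p for p in corpus
                      if ex.answer not in p
                      and lexical_overlap(ex.question, p) > 0.1]
        rng.shuffle(candidates)
        for passage in candidates[:per_question]:
            negatives.append(QAExample(ex.question, passage, "", False))
    return negatives


if __name__ == "__main__":
    corpus = [
        "The Yangtze River is the longest river in China.",
        "The Yellow River flows through nine provinces of China.",
    ]
    train = [QAExample("Which is the longest river in China?",
                       corpus[0], "Yangtze River", True)]
    augmented = train + make_negative_examples(train, corpus)
    for ex in augmented:
        print(ex.is_answerable, "|", ex.passage)
```

In practice the distractors would come from the retriever's own top-ranked but answer-free passages rather than from a word-overlap heuristic; the heuristic here only keeps the sketch self-contained.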