Question Expansion for Machine Reading Comprehension of Opinion

ZHANG Zhaobin1, WANG Suge1,2, CHEN Xin1, ZHAO Linling1, WANG Dian1

Journal of Chinese Information Processing, 2020, Vol. 34, Issue (6): 89-96, 105.

Abstract

In the Chinese reading comprehension section of the college entrance examination, opinion questions contain abstract viewpoint expressions. To obtain answer information related to a question from the reading material, the abstract words in the question need to be expanded, thereby expanding the opinion question itself. This paper proposes a question expansion modeling method based on a multi-task hierarchical Long Short-Term Memory network (Multi-HLSTM). First, the reading material and the question interact through an attention mechanism; the two tasks of question prediction and answer prediction are then modeled jointly, allowing the model to further expand the question. Finally, the expanded question and the original question are both applied to extract candidate answer sentences. Experiments on opinion questions from real and simulated Chinese college entrance examinations, as well as on the description-opinion subset of DuReader, show that the proposed question expansion model improves the extraction of candidate answer sentences.
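The final step described above — using both the original and the expanded question to pull candidate answer sentences from the reading material — can be illustrated with a minimal sketch. The paper does not specify the scoring function in this abstract, and the Multi-HLSTM model itself is not reproduced here; this toy ranker assumes a simple term-frequency cosine similarity, and all names (`rank_candidates`, `cosine`) and sample data are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_candidates(original_q, expanded_q, sentences, top_k=2):
    """Score each material sentence (a list of tokens) against the
    combined terms of the original and expanded questions, and return
    the top_k highest-scoring sentences as answer candidates."""
    query = Counter(original_q) + Counter(expanded_q)
    scored = [(cosine(query, Counter(s)), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]
```

In this sketch, the expanded question contributes extra terms to the query vector, so material sentences that match the expansion (but not the abstract wording of the original question) can still be retrieved — which is the motivation for question expansion stated in the abstract.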


Key words

question expansion / Chinese college entrance examination / reading comprehension / opinion questions

Cite this article

ZHANG Zhaobin, WANG Suge, CHEN Xin, ZHAO Linling, WANG Dian. Question Expansion for Machine Reading Comprehension of Opinion. Journal of Chinese Information Processing, 2020, 34(6): 89-96, 105.


Funding

This work was supported by the National Key R&D Program of China (2018YFB1005103), the National Natural Science Foundation of China (61573231), and the Key R&D Program of Shanxi Province (201803D421024).