Abstract
Answer selection is a key technique in question answering, and long answer selection is particularly important in non-factoid settings such as community question answering and open-domain question answering systems. We propose a model that combines coarse-grained (sentence-level) and fine-grained (word- or n-gram-level) information, alleviating two shortcomings of existing approaches: ① a single sentence vector cannot capture all of the important information in a long answer, and ② the compare-aggregate framework fails to exploit the global information of a sequence. The proposed model exploits fine-grained information without introducing extra training parameters, effectively improving the accuracy of long answer selection. Experiments on the InsuranceQA answer selection dataset show that the model outperforms the state-of-the-art sentence-modeling approach by 3.30% in accuracy. The proposed method may also serve as a reference for other long-text matching research.
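To make the fused-granularity idea concrete, the following minimal PyTorch sketch combines a sentence-level (coarse) similarity with a parameter-free word-level (fine) interaction. This is an illustrative reconstruction, not the authors' model: mean pooling stands in for a learned sentence encoder, the cosine interaction matrix is one way to add fine-grained signal without trainable weights, and `alpha` is a hypothetical mixing coefficient.

```python
# Illustrative sketch (assumed design, not the paper's released code) of fusing
# coarse-grained (sentence-level) and fine-grained (word-level) matching signals.
# The word-level interaction uses only cosine similarities between token
# embeddings, so it adds no trainable parameters.
import torch
import torch.nn.functional as F


def coarse_score(q_vec: torch.Tensor, a_vec: torch.Tensor) -> torch.Tensor:
    """Sentence-level similarity between pooled question/answer vectors."""
    return F.cosine_similarity(q_vec, a_vec, dim=-1)


def fine_score(q_tokens: torch.Tensor, a_tokens: torch.Tensor) -> torch.Tensor:
    """Word-level similarity: cosine interaction matrix, max-pooled over the
    answer axis and averaged over question tokens. No trainable parameters."""
    q = F.normalize(q_tokens, dim=-1)        # (Lq, d)
    a = F.normalize(a_tokens, dim=-1)        # (La, d)
    sim = q @ a.t()                          # (Lq, La) token-token cosines
    return sim.max(dim=1).values.mean()      # best answer match per question token


def match_score(q_tokens: torch.Tensor, a_tokens: torch.Tensor,
                alpha: float = 0.5) -> torch.Tensor:
    """Fuse both granularities; alpha is a hypothetical mixing weight."""
    q_vec = q_tokens.mean(dim=0)             # mean pooling as a stand-in for a
    a_vec = a_tokens.mean(dim=0)             # learned sentence encoder
    return alpha * coarse_score(q_vec, a_vec) + (1 - alpha) * fine_score(q_tokens, a_tokens)


if __name__ == "__main__":
    d = 64
    q = torch.randn(12, d)                   # question token embeddings
    a = torch.randn(80, d)                   # long-answer token embeddings
    print(match_score(q, a).item())          # higher score = better candidate
```

In a ranking setup, this score would be computed for each candidate answer and the candidates ordered by it; only the sentence encoder (replaced above by mean pooling) carries trainable parameters.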
Key words
long answer selection /
multi-granularity /
deep neural networks
Funding
National Natural Science Foundation of China (62076046, 61632011, 62072070)