Abstract
Machine reading comprehension aims to teach machines to understand an article and answer questions related to it. To address the low performance of machine reading comprehension models in low-resource languages, this paper proposes Ti-Reader, an attention-based end-to-end model for Tibetan machine reading comprehension. First, to encode finer-grained Tibetan text information, syllable and word embeddings are combined to represent words. Word-level attention is then applied to focus on keywords in the text, a re-read mechanism captures the semantic interaction between the article and the question, and self-attention matches the hidden representations of the question and the answer against themselves, providing additional clues for answer prediction. Experimental results show that Ti-Reader improves the performance of Tibetan machine reading comprehension, and it also performs well on the English dataset SQuAD.
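The sketch below is a minimal, illustrative PyTorch encoder in the spirit of the syllable-plus-word representation and word-level attention described in the abstract. All module names, dimensions, and the specific pooling and scoring choices are assumptions made for illustration; this is not the authors' released Ti-Reader implementation, and the re-read and answer-prediction components are omitted.

# Illustrative sketch only: combine syllable-level and word-level embeddings,
# then reweight words with a word-level attention layer (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyllableWordEncoder(nn.Module):
    def __init__(self, word_vocab, syl_vocab, word_dim=100, syl_dim=50, hidden=75):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.syl_emb = nn.Embedding(syl_vocab, syl_dim, padding_idx=0)
        # BiGRU over the syllables of each word yields a fine-grained word vector.
        self.syl_rnn = nn.GRU(syl_dim, syl_dim, batch_first=True, bidirectional=True)
        # Scores each word; a softmax over the scores is the word-level attention.
        self.word_score = nn.Linear(word_dim + 2 * syl_dim, 1)
        self.ctx_rnn = nn.GRU(word_dim + 2 * syl_dim, hidden,
                              batch_first=True, bidirectional=True)

    def forward(self, word_ids, syl_ids):
        # word_ids: (batch, seq_len); syl_ids: (batch, seq_len, max_syllables)
        b, t, s = syl_ids.shape
        w = self.word_emb(word_ids)                               # (b, t, word_dim)
        syl = self.syl_emb(syl_ids.view(b * t, s))                # (b*t, s, syl_dim)
        _, h = self.syl_rnn(syl)                                  # h: (2, b*t, syl_dim)
        syl_vec = torch.cat([h[0], h[1]], dim=-1).view(b, t, -1)  # syllable features
        x = torch.cat([w, syl_vec], dim=-1)                       # word + syllable features
        attn = F.softmax(self.word_score(x).squeeze(-1), dim=-1)  # word-level attention
        x = x * attn.unsqueeze(-1)                                # emphasize keywords
        enc, _ = self.ctx_rnn(x)                                  # contextual encoding
        return enc, attn

# Toy usage with random ids, just to show the expected tensor shapes.
enc = SyllableWordEncoder(word_vocab=1000, syl_vocab=300)
words = torch.randint(1, 1000, (2, 8))
sylls = torch.randint(1, 300, (2, 8, 4))
out, attn = enc(words, sylls)
print(out.shape, attn.shape)  # torch.Size([2, 8, 150]) torch.Size([2, 8])

In a full model of this kind, the encoded article and question would then be matched (e.g., by the re-read and self-attention mechanisms mentioned above) before span prediction; those stages are not shown here.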
Key words
machine reading comprehension /
attention /
end-to-end network /
Tibetan
Funding
National Natural Science Foundation of China (61972436); Minzu University of China Projects (GRSCP202316, 2023QNYL22)