Solving Math Word Problems by Multi-grained Graph Neural Networks

HUANG Linjia, XIAO Jing, CAO Yang

Journal of Chinese Information Processing ›› 2023, Vol. 37 ›› Issue (2): 148-157.
Natural Language Processing Applications

Abstract

In recent years, the task of automatically solving Math Word Problems (MWP) has received increasing attention, with most research focused on improving the encoder. Current encoders suffer from two issues: (1) the input granularity is usually character level, which limits generalization ability; (2) only the text sequence is modeled, without capturing entities, parts of speech, and other textual information. To alleviate these issues, this paper proposes MGNet, a novel encoder structure based on multi-grained word segmentation and graph convolution built on a bidirectional GRU (Gated Recurrent Unit). Multi-grained word segmentation enlarges the sample set by segmenting the text at different granularities and improves the generalization ability of the model by introducing some noise samples. The graph convolutional networks learn the implicit relationships among named entities, numbers, and dates by constructing different attribute graphs over them. Experiments on two public benchmark datasets, Math23K and Ape210K, show that the proposed MGNet achieves accuracies of 77.73% and 80.8%, respectively.
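The multi-grained segmentation idea described above can be illustrated with a minimal sketch: the same problem text is tokenized once at character level and once at word level, yielding multiple training views of a single sample. The greedy longest-match segmenter, the toy lexicon, and the example sentence below are illustrative assumptions, not the paper's actual segmentation pipeline.

```python
def word_segment(text, lexicon, max_len=4):
    """Greedy longest-match word segmentation (an illustrative
    stand-in for the paper's segmenter, not its actual method)."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in lexicon:
                tokens.append(piece)
                i += length
                break
    return tokens

# Toy lexicon and a toy MWP-style sentence (illustrative only).
lexicon = {"小明", "买了", "3个", "苹果"}
text = "小明买了3个苹果"

char_view = list(text)                   # character-level granularity
word_view = word_segment(text, lexicon)  # word-level granularity

print(char_view)  # ['小', '明', '买', '了', '3', '个', '苹', '果']
print(word_view)  # ['小明', '买了', '3个', '苹果']
```

In training, both views of each problem would be fed to the encoder, enlarging the sample set; imperfect segmentations naturally act as the noise samples the abstract mentions.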

Key words

multi-granularity / graph neural network / math word problems / artificial intelligence

Cite This Article

HUANG Linjia, XIAO Jing, CAO Yang. Solving Math Word Problems by Multi-grained Graph Neural Networks. Journal of Chinese Information Processing, 2023, 37(2): 148-157.


Funding

National Natural Science Foundation of China (62177015); Stable Support Project of the National Defense Science and Technology Key Laboratory (WDZC20205250410)