Abstract
Relation extraction is a challenging task in information extraction that converts unstructured text into structured data. In recent years, deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been widely applied to relation extraction and have achieved good results. The two architectures have complementary strengths: CNNs excel at extracting local features, while RNNs capture sequence-level information. To combine the advantages of CNNs in local feature extraction and of RNNs in modeling temporal dependencies, this paper proposes a convolutional recurrent neural network (CRNN). The model consists of three layers: it first extracts multi-granularity local features from a relation instance, then fuses the features of different granularities through an aggregation layer, and finally uses an RNN to extract the overall information of the feature sequence. The paper also investigates the gains of several aggregation strategies for information fusion and finds that the attention mechanism is the most effective at fusing multi-granularity features. Experimental results show that CRNN outperforms mainstream CNN and RNN models, achieving an F1 score of 86.52% on the SemEval 2010 Task 8 dataset.
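To make the three-layer design concrete, below is a minimal PyTorch sketch of a CRNN-style model. It is an illustrative reconstruction from the abstract alone: the word-embedding-only input (no position features), kernel sizes, channel and hidden dimensions, the per-position attention over granularity branches, the max-pooling readout, and the 19-class SemEval 2010 Task 8 label set are all assumptions; the paper's exact architecture and hyperparameters may differ.

```python
# Illustrative sketch only -- module names and hyperparameters are assumed,
# not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, conv_channels=100,
                 kernel_sizes=(2, 3, 4, 5), rnn_hidden=100, num_classes=19):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Layer 1: multi-granularity local features -- one 1-D convolution
        # per kernel size, padded so every branch keeps the sequence length.
        self.convs = nn.ModuleList([
            nn.Conv1d(embed_dim, conv_channels, k, padding=k // 2)
            for k in kernel_sizes
        ])
        # Layer 2: attention-based aggregation over the granularity branches
        # (the strategy the abstract reports as most effective).
        self.att = nn.Linear(conv_channels, 1)
        # Layer 3: a bidirectional RNN captures sequence-level information.
        self.rnn = nn.LSTM(conv_channels, rnn_hidden,
                           batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * rnn_hidden, num_classes)

    def forward(self, tokens):                      # tokens: (B, T)
        x = self.embed(tokens).transpose(1, 2)      # (B, E, T)
        T = tokens.size(1)
        # Each branch yields (B, C, ~T); trim to T since even kernel sizes
        # with k // 2 padding overshoot the length by one.
        branches = [torch.tanh(conv(x))[:, :, :T] for conv in self.convs]
        feats = torch.stack(branches, dim=1)        # (B, G, C, T)
        feats = feats.permute(0, 3, 1, 2)           # (B, T, G, C)
        # Attention weights over the G granularities at every time step.
        scores = self.att(feats).squeeze(-1)        # (B, T, G)
        alpha = F.softmax(scores, dim=-1).unsqueeze(-1)
        fused = (alpha * feats).sum(dim=2)          # (B, T, C)
        out, _ = self.rnn(fused)                    # (B, T, 2H)
        pooled, _ = out.max(dim=1)                  # max-pool over time
        return self.classifier(pooled)              # (B, num_classes)
```

Swapping the attention weights for mean- or max-aggregation over the granularity axis gives the alternative fusion strategies the paper compares against.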
Key words
relation extraction / convolutional neural network / recurrent neural network / aggregation strategy / attention mechanism
Funding
National Natural Science Foundation of China (61672367, 61672368); National Key R&D Program of China (2017YFB1002104)