Multi-level Attention Mechanism Based Distant Supervision for Relation Extraction
CAI Qiang 1,2, HAO Jiayun 1,2, CAO Jian 1,2, LI Haisheng 1,2
1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China; 2. Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing Technology and Business University, Beijing 100048, China
Abstract: To best exploit both local and global features, we propose a distantly supervised relation extraction model based on a multi-level attention mechanism. We employ an attention matrix in the pooling layer to capture word-level semantic features that indicate the relevance between input words and relations. Moreover, we adopt a sentence-level attention mechanism to model the relationship between sentences and the predicted relations. Experimental results show that the mean accuracy of the proposed model reaches 78% on the NYT data set, indicating an effective use of multi-level features and better performance on the distantly supervised relation extraction task.
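Since the abstract only sketches the architecture at a high level, the following is a minimal, hypothetical PyTorch sketch of the two attention levels it describes: an attention-matrix pooling step that scores word positions against a relation query, and a sentence-level attention step that weights the sentences of an entity-pair bag by how well they match a candidate relation. All dimensions, layer names, and the bilinear scoring forms are illustrative assumptions, not the paper's published equations.

```python
# Hypothetical sketch of word-level attentive pooling + sentence-level
# (selective) attention for distantly supervised relation extraction.
# Shapes, parameter names, and scoring functions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WordLevelAttentivePooling(nn.Module):
    """Pools encoder outputs with an attention matrix that scores each
    word position against a relation (query) embedding."""
    def __init__(self, hidden_dim, rel_dim):
        super().__init__()
        # Bilinear attention matrix U (assumed form)
        self.U = nn.Parameter(torch.randn(hidden_dim, rel_dim) * 0.01)

    def forward(self, conv_out, rel_emb):
        # conv_out: (batch, seq_len, hidden_dim); rel_emb: (batch, rel_dim)
        scores = torch.einsum('bsh,hr,br->bs', conv_out, self.U, rel_emb)
        alpha = F.softmax(scores, dim=1)                    # word-level weights
        return torch.einsum('bs,bsh->bh', alpha, conv_out)  # weighted pooling


class SentenceLevelAttention(nn.Module):
    """Weights the sentences of one entity-pair bag by how well they
    match the candidate relation (selective-attention style)."""
    def __init__(self, hidden_dim, num_relations):
        super().__init__()
        self.rel_mat = nn.Parameter(torch.randn(num_relations, hidden_dim) * 0.01)
        self.A = nn.Parameter(torch.eye(hidden_dim))  # diagonal init (assumed)

    def forward(self, sent_reprs, rel_id):
        # sent_reprs: (num_sents, hidden_dim) for one bag; rel_id: int
        query = self.rel_mat[rel_id]                        # (hidden_dim,)
        scores = sent_reprs @ self.A @ query                # (num_sents,)
        alpha = F.softmax(scores, dim=0)                    # sentence weights
        bag = alpha @ sent_reprs                            # bag representation
        logits = self.rel_mat @ bag                         # score every relation
        return logits


if __name__ == "__main__":
    torch.manual_seed(0)
    word_attn = WordLevelAttentivePooling(hidden_dim=230, rel_dim=50)
    sent_attn = SentenceLevelAttention(hidden_dim=230, num_relations=53)
    conv_out = torch.randn(4, 60, 230)       # a bag of 4 sentences, 60 positions each
    rel_emb = torch.randn(4, 50)             # relation query per sentence
    sents = word_attn(conv_out, rel_emb)     # (4, 230) sentence vectors
    logits = sent_attn(sents, rel_id=7)      # bag-level relation scores
    print(logits.shape)                      # torch.Size([53])
```

In this reading, the word-level step replaces plain max pooling with relation-aware weighted pooling, and the sentence-level step down-weights noisy instances produced by distant supervision before the bag-level prediction is made.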