Automatically extracting useful information from large-scale unstructured text has been a long-standing goal of NLP and AI, and open information extraction is now widely pursued as an effective means of acquiring information from the web. Open information extraction can be divided into binary and n-ary entity relation extraction according to the number of arguments involved. Following this division, this paper analyses several typical methods for open relation extraction together with their shortcomings. Most current methods still perform only shallow semantic processing and rarely consider implicit relations. We therefore believe that joint inference strategies, such as Markov logic and inference based on ontology structure, can take advantage of multiple features. Combining these approaches may open up a promising prospect for inferring fine-grained and complete information in open information extraction.
YANG Bo, CAI Dongfeng, YANG Hua.
Progress in Open Information Extraction. Journal of Chinese Information Processing. 2014, 28(4): 1-11