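Domain Concepts Entity Attribute Relation Extraction Based on LM Algorithm

LIU Lijia, GUO Jianyi, ZHOU Lanjiang, YU Zhengtao, SHAO Fa, ZHANG Jinpeng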

Journal of Chinese Information Processing ›› 2014, Vol. 28 ›› Issue (6): 216-222.
Information Extraction and Text Mining

Abstract

To address the complex relation patterns and low relation extraction performance found in unstructured free text, this paper proposes extracting domain concept entity attribute relations from unstructured free text using the Levenberg-Marquardt (LM) algorithm, an optimization algorithm for BP neural networks. The corpus is first preprocessed. A CRFs model then recognizes the instances, attributes, and attribute values of domain concepts as entities. Next, features are extracted separately according to the characteristics of each type of relation in the domain. Finally, a BP neural network model is constructed and the LM algorithm is applied to extract the corresponding relations. Compared with SVM, which is suited to binary classification, the artificial neural network optimization algorithm has strong self-learning ability and high recognition accuracy, and is better suited to multi-class problems. Several groups of experiments show that the method performs well on domain concept entity attribute relation extraction, improving the F-score by 12.8%.
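The final step of the procedure, training a BP network with the LM algorithm for multi-class relation classification, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes "LM" denotes Levenberg-Marquardt training, uses one hidden tanh layer with linear outputs, approximates the Jacobian by finite differences, and runs on hypothetical two-dimensional feature vectors with three made-up relation classes. All function names, the network shape, and the damping schedule (divide or multiply mu by 10) are illustrative assumptions.

```python
import math
import random

def forward(w, x, n_in, n_hid, n_out):
    """One hidden tanh layer, linear outputs; w is a flat weight list."""
    k, hidden, out = 0, [], []
    for _ in range(n_hid):
        s = w[k + n_in] + sum(w[k + i] * x[i] for i in range(n_in))
        hidden.append(math.tanh(s))
        k += n_in + 1
    for _ in range(n_out):
        s = w[k + n_hid] + sum(w[k + j] * hidden[j] for j in range(n_hid))
        out.append(s)
        k += n_hid + 1
    return out

def residuals(w, X, T, shape):
    """Flattened (output - target) errors over all samples."""
    return [y - t for x, ts in zip(X, T)
            for y, t in zip(forward(w, x, *shape), ts)]

def jacobian(w, X, T, shape, eps=1e-6):
    """Forward-difference Jacobian; rows[k][m] = d r_m / d w_k."""
    base = residuals(w, X, T, shape)
    rows = []
    for k in range(len(w)):
        wp = list(w)
        wp[k] += eps
        shifted = residuals(wp, X, T, shape)
        rows.append([(a - b) / eps for a, b in zip(shifted, base)])
    return rows, base

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def sse(r):
    """Sum of squared errors."""
    return sum(v * v for v in r)

def train_lm(X, T, shape, iters=50, mu=0.01, seed=0):
    """LM update: solve (J^T J + mu I) delta = -J^T r, accept if error drops."""
    n_in, n_hid, n_out = shape
    n_w = n_hid * (n_in + 1) + n_out * (n_hid + 1)
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_w)]
    history = [sse(residuals(w, X, T, shape))]
    for _ in range(iters):
        rows, r = jacobian(w, X, T, shape)
        g = [sum(a * b for a, b in zip(row, r)) for row in rows]
        A = [[sum(a * b for a, b in zip(ri, rj)) + (mu if i == j else 0.0)
              for j, rj in enumerate(rows)] for i, ri in enumerate(rows)]
        delta = solve(A, [-v for v in g])
        w_new = [wi + di for wi, di in zip(w, delta)]
        err = sse(residuals(w_new, X, T, shape))
        if err < history[-1]:      # accept step, move toward Gauss-Newton
            w, mu = w_new, max(mu / 10, 1e-6)
            history.append(err)
        else:                      # reject step, move toward gradient descent
            mu = min(mu * 10, 1e10)
    return w, history

def predict(w, x, shape):
    """Index of the largest network output, i.e. the predicted class."""
    y = forward(w, x, *shape)
    return max(range(len(y)), key=lambda k: y[k])

# Hypothetical toy data: 2 features per entity pair, 3 relation classes.
X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.45, 0.55]]
LABELS = [0, 0, 1, 1, 2, 2]
T = [[1.0 if c == k else 0.0 for k in range(3)] for c in LABELS]
SHAPE = (2, 3, 3)  # inputs, hidden units, output classes

w, history = train_lm(X, T, SHAPE)
print("initial SSE:", history[0], "final SSE:", history[-1])
print("predicted classes:", [predict(w, x, SHAPE) for x in X])
```

Because a step is kept only when it lowers the squared error, the recorded error never increases. The damping term mu blends Gauss-Newton behaviour (small mu) with gradient descent (large mu), which is the property that typically lets LM converge in fewer iterations than plain back-propagation on small networks.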

Key words

BP neural network / LM algorithm / attribute relation extraction

Cite this article

LIU Lijia, GUO Jianyi, ZHOU Lanjiang, YU Zhengtao, SHAO Fa, ZHANG Jinpeng. Domain Concepts Entity Attribute Relation Extraction Based on LM Algorithm. Journal of Chinese Information Processing. 2014, 28(6): 216-222

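Funding

National Natural Science Foundation of China (61175068); Major Special Project of the Yunnan Provincial Department of Education Fund (KKJI201203001); Key Project of the Yunnan Province Applied Basic Research Program (2013FA030)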