Abstract
Entity disambiguation for short text is difficult because a short text cannot fully express semantic relations and its context supplies little information. To address these problems, this paper proposes a new method, the Mixed Convolution Network (MCN). The method first preprocesses the dataset; then the BERT model proposed by Google is applied for feature extraction, the features are further refined by an attention mechanism and fed into a CNN, and the CNN captures the sentence's dependency features. In parallel, a GCN extracts the semantic features of the text. The semantic information extracted by the two branches is fused to produce the final output. Experimental results on the CCKS2019 evaluation dataset show that the proposed MCN achieves a precision of 86.57%, verifying the effectiveness of the model.
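The abstract gives no implementation details for the GCN branch. As an illustrative sketch only, a graph-convolution layer of the kind commonly used for semantic features applies the propagation rule H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W). A minimal pure-Python version (all names hypothetical, not the authors' code) might look like:

```python
import math

def matmul(X, Y):
    """Naive matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).

    A: n x n adjacency matrix, H: n x d_in node features,
    W: d_in x d_out trainable weights (fixed here for illustration).
    """
    n = len(A)
    # add self-loops: A_hat = A + I
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # degree of each node in A_hat
    d = [sum(row) for row in A_hat]
    # symmetric normalization: D^-1/2 A_hat D^-1/2
    norm = [[A_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
            for i in range(n)]
    Z = matmul(matmul(norm, H), W)
    # ReLU nonlinearity
    return [[max(z, 0.0) for z in row] for row in Z]

# toy example: a 3-node chain graph with 2-dimensional node features
A = [[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]
H = [[1., 0.], [0., 1.], [1., 1.]]
W = [[1., -1.], [0.5, 1.]]
Z = gcn_layer(A, H, W)  # 3 x 2 output, entries are non-negative after ReLU
```

In the MCN described above, the output of such a layer would be fused with the CNN branch's dependency features (e.g. by concatenation) before classification; the exact fusion operator is not specified in the abstract.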
Key words
short text /
entity disambiguation /
BERT /
graph convolution network /
convolutional neural networks
Funding
National Natural Science Foundation of China (62062062); Xinjiang University Scientific Research Fund (BS 180250)