GUO Shiwei, MA Bo, MA Yupeng, YANG Yating. Chinese Short Text Entity Linking Based on BERT and GCN[J]. Journal of Chinese Information Processing, 2022, 36(12): 104-114.
Chinese Short Text Entity Linking Based on BERT and GCN
GUO Shiwei1,2,3, MA Bo1,2,3, MA Yupeng1,2,3, YANG Yating1,2,3
1. The Xinjiang Technical Institute of Physics and Chemistry, Urumqi, Xinjiang 830011, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, Xinjiang 830011, China
Abstract: Due to the lack of global topic information, short text entity linking has to rely on the local short text and the knowledge base. This paper proposes the concept of a short text interaction graph (STIG) and a two-stage training strategy. BERT is used to extract multi-granularity features between the local short text and candidate entities, and a graph convolution mechanism is applied to the short text interaction graph. To alleviate the degradation of graph convolution caused by mean pooling, a method is further proposed to compress the node and edge features of the interaction graph into a dense vector. Experiments on the CCKS 2020 entity linking dataset demonstrate the effectiveness of the proposed method.
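As a rough illustration of how graph convolution could operate over such a mention-candidate interaction graph, the following is a minimal PyTorch sketch. The star-shaped adjacency, the 768-dimensional random features standing in for BERT outputs, and the concatenation readout at the end are illustrative assumptions only; they are not the paper's actual STIG construction, compression method, or two-stage training procedure.

import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One Kipf-Welling graph convolution layer: H' = ReLU(A_hat · H W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # h: (N, in_dim) node features; a_hat: (N, N) normalized adjacency.
        return torch.relu(a_hat @ self.linear(h))


def normalize_adjacency(a: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a = a + torch.eye(a.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)


# Hypothetical interaction graph: node 0 represents the short text (mention
# context); nodes 1-3 are candidate entities, each linked to the text node.
adj = torch.tensor([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
])
node_feats = torch.randn(4, 768)  # stand-in for BERT [CLS] vectors

gcn = GCNLayer(768, 256)
h = gcn(node_feats, normalize_adjacency(adj))

# Rather than mean pooling (which can blur node-specific signals), keep all
# node states in a single dense graph vector by concatenation:
graph_vec = h.flatten()  # shape: (4 * 256,)
print(h.shape, graph_vec.shape)

The final flatten is only the simplest alternative to mean pooling; the dense-vector compression of node and edge features described in the abstract is presumably more involved.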