Abstract
Entity relation extraction aims to extract the semantic relations that hold between entities; it supports downstream tasks such as knowledge graphs and automatic question answering, and plays an important role in natural language processing. Because research on Lao entity relation extraction is scarce and the available data are very limited, neural networks cannot acquire sufficient semantic information during training. To address this problem, this paper proposes a multi-feature Lao entity relation extraction method based on a combined PCNN and BiGRU model. First, position features and phoneme features are fused into the word vectors to obtain a joint vector that carries multiple kinds of semantics. Then, a PCNN model and a BiGRU model each extract deep semantics from the joint vector: the PCNN model better captures local information in the text, while the BiGRU model better accounts for its global information. The outputs of the two models are concatenated to obtain a sentence vector containing multi-dimensional semantic information. Finally, softmax is used for multi-class prediction. Experiments show that the proposed method achieves good results with limited data, reaching a macro-averaged F1 of 82.25%.
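The piecewise (multi-segment) max pooling that distinguishes PCNN from a plain CNN can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the function name and the zero-fill for empty segments are illustrative choices:

```python
import numpy as np

def piecewise_max_pool(feature_map, e1_pos, e2_pos):
    """Piecewise max pooling as in PCNN: split the convolution
    output into three segments at the two entity positions and
    max-pool each segment separately, then concatenate."""
    seq_len, num_filters = feature_map.shape
    cuts = sorted([e1_pos, e2_pos])
    segments = [
        feature_map[: cuts[0] + 1],              # up to and including entity 1
        feature_map[cuts[0] + 1 : cuts[1] + 1],  # between the two entities
        feature_map[cuts[1] + 1 :],              # after entity 2
    ]
    # Max-pool each segment; an empty segment contributes zeros.
    pooled = [seg.max(axis=0) if len(seg) else np.zeros(num_filters)
              for seg in segments]
    return np.concatenate(pooled)                # shape: (3 * num_filters,)
```

Compared with a single max pool over the whole sentence, this preserves which side of each entity a strong feature came from, which is why PCNN is better at local, entity-relative information.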
Key words
PCNN /
BiGRU /
phoneme feature /
joint vector /
layer normalization
Funding
National Natural Science Foundation of China (61662040)