LIN Zhi, LI Yuan, WANG Qinglin. An Improved Argument Extraction Method for Cultural Events Based on BERT[J]. Journal of Chinese Information Processing, 2022, 36(12): 115-122.
An Improved Argument Extraction Method for Cultural Events Based on BERT
LIN Zhi, LI Yuan, WANG Qinglin
School of Automation, Beijing Institute of Technology, Beijing 100081, China
Abstract: Event extraction methods usually rely on the small-scale, open-domain ACE 2005 corpus, which is too small to train deep learning models effectively. This paper proposes a semi-supervised domain event argument extraction method: a cultural event corpus is automatically annotated from the official websites of Chinese public libraries using templates and a domain dictionary, and is then manually reviewed to ensure label accuracy. To resolve polysemy in the word embedding layer, the BiLSTM-CRF model is improved with a BERT model and a positional character embedding layer. Experiments show that the proposed method achieves an F1 score of 84.9% on event argument extraction, outperforming classical event argument recognition methods.
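To make the two contributions concrete, the following sketches are illustrative only and are not the authors' released code. The first is a toy version of the template/dictionary-based auto-annotation step described in the abstract: character spans that match a hand-built domain dictionary are tagged with BIO-style argument labels, and everything else is tagged "O". The dictionary contents and role names here are invented for illustration.

# Toy dictionary-based auto-annotation over a Chinese string.
# dictionary maps a surface form to a hypothetical argument role,
# e.g. {"国家图书馆": "Venue", "讲座": "EventType"}.
def auto_annotate(text, dictionary):
    tags = ["O"] * len(text)
    for word, role in dictionary.items():
        start = text.find(word)
        while start != -1:
            tags[start] = f"B-{role}"                 # first character of the span
            for i in range(start + 1, start + len(word)):
                tags[i] = f"I-{role}"                 # continuation characters
            start = text.find(word, start + len(word))
    return list(zip(text, tags))

The second sketch is one minimal reading of the improved model: BERT character representations, an extra learned positional embedding (our assumption for what the paper's "positional character embedding layer" computes), a BiLSTM encoder, and a CRF decoder over BIO argument tags. It assumes PyTorch with the transformers and pytorch-crf packages installed; all hyperparameters are placeholders.

import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertBiLSTMCRF(nn.Module):
    def __init__(self, num_tags, lstm_hidden=256, max_len=512,
                 bert_name="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size        # 768 for bert-base
        # Hypothetical stand-in for the paper's positional character
        # embedding layer: a learned embedding added per position.
        self.pos_emb = nn.Embedding(max_len, hidden)
        self.bilstm = nn.LSTM(hidden, lstm_hidden, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        out = out + self.pos_emb(positions)          # inject position information
        out, _ = self.bilstm(out)
        return self.fc(out)                          # per-character tag scores

    def forward(self, input_ids, attention_mask, tags):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        emissions = self._emissions(input_ids, attention_mask)
        return -self.crf(emissions, tags, mask=attention_mask.bool())

    def decode(self, input_ids, attention_mask):
        # Viterbi decoding of the most likely tag sequence per sentence.
        emissions = self._emissions(input_ids, attention_mask)
        return self.crf.decode(emissions, mask=attention_mask.bool())

In this reading, BERT supplies context-dependent character vectors that disambiguate polysemous characters, while the CRF enforces valid BIO transitions over the BiLSTM outputs.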