Continual Relation Extraction via Supervised Contrastive Replay

ZHAO Jiteng, LI Guozheng, WANG Peng, LIU Yanhe

Journal of Chinese Information Processing ›› 2023, Vol. 37 ›› Issue (11): 60-67, 80
Information Extraction and Text Mining

Abstract

Continual relation extraction addresses the catastrophic forgetting caused by retraining a model on newly emerging relations. To counter the task-recency bias of existing continual relation extraction models, this paper proposes a continual relation extraction method based on supervised contrastive replay. Specifically, for each new task the encoder first learns embeddings of the new samples; then, treating samples of the same relation as positive pairs and samples of different relations as negative pairs, a supervised contrastive loss is applied during each replay step to continually learn a strongly discriminative encoder. Meanwhile, relation prototypes are incorporated into the supervised contrastive learning as an auxiliary enhancement to keep the model from overfitting. Finally, a nearest class mean classifier performs classification at test time. Experimental results show that the proposed method effectively alleviates catastrophic forgetting in continual relation extraction and achieves state-of-the-art performance on both the FewRel and TACRED datasets; moreover, as the number of tasks increases, the model outperforms the previous state of the art by about 1% once training reaches five tasks.
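For concreteness, the two components named above can be written out. The following is a sketch based on the standard supervised contrastive formulation of Khosla et al., with the prototype terms added as the abstract describes them; the paper body gives the exact augmentation scheme. Writing $z_i$ for the normalized embedding of sample $i$, $P(i)$ for the positives sharing $i$'s relation (here assumed to also include that relation's prototype), $A(i)$ for all other embeddings and prototypes in the replay batch, and $\tau$ for the temperature:

\mathcal{L}_{\mathrm{SCL}} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i \cdot z_a / \tau)}

At test time, the nearest class mean classifier compares a sample's embedding $f_\theta(x)$ with per-relation prototypes $\mu_r$ computed from the memorized samples $M_r$ and predicts the closest relation:

\mu_r = \frac{1}{|M_r|} \sum_{x \in M_r} f_\theta(x), \qquad \hat{y} = \operatorname{arg\,min}_r \, \lVert f_\theta(x) - \mu_r \rVert_2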


Key words

continual relation extraction / supervised contrastive learning / replay

Cite this article

ZHAO Jiteng, LI Guozheng, WANG Peng, LIU Yanhe. Continual Relation Extraction via Supervised Contrastive Replay. Journal of Chinese Information Processing, 2023, 37(11): 60-67, 80.


Funding

Pre-research project on army-wide common information system equipment under the 13th Five-Year Plan (31514020501, 31514020503)