GAT: Global-Based Adversarial Training for Natural Language Understanding

CAI Kunzhao, ZENG Biqing, CHEN Pengfei

PDF(1274 KB)
Journal of Chinese Information Processing, 2023, Vol. 37, Issue (3): 27-35.
Language Analysis and Computation


Abstract

In natural language processing, gradient-based adversarial training is an effective method for improving the robustness of neural networks. First, to address the low efficiency of existing adversarial training algorithms, this paper proposes an initialization strategy based on a global perturbation vocabulary, which improves training efficiency while ensuring that the initial perturbations remain effective. Second, since conventional normalization methods ignore the relative independence between tokens, we propose a globally equal-weight normalization strategy that preserves this independence and prevents a few samples from dominating the adversarial training. Finally, for pre-trained language models that use learnable position embeddings, we propose a global multi-aspect perturbation strategy to make the networks more robust. Experimental results show that these strategies effectively improve the performance of neural networks.
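The contrast between conventional normalization and the equal-weight, token-wise normalization described in the abstract can be illustrated with a minimal sketch. This is an illustration only, not the authors' implementation: it assumes per-token embedding gradients are already available as flat vectors, and `eps` stands for a hypothetical perturbation budget.

```python
import math

def l2_norm(vec):
    """Euclidean (L2) norm of a flat vector."""
    return math.sqrt(sum(x * x for x in vec))

def global_perturbation(grads, eps):
    """Normalize the concatenated gradient of ALL tokens at once.
    Tokens with large gradients consume most of the budget, so a few
    tokens (or samples) can dominate the adversarial perturbation."""
    flat = [x for g in grads for x in g]
    norm = l2_norm(flat) or 1.0  # guard against a zero gradient
    return [[eps * x / norm for x in g] for g in grads]

def tokenwise_perturbation(grads, eps):
    """Normalize each token's gradient independently (equal weight).
    Every token receives a perturbation of the same L2 length eps,
    preserving the relative independence between tokens."""
    out = []
    for g in grads:
        norm = l2_norm(g) or 1.0
        out.append([eps * x / norm for x in g])
    return out

# Two token gradients: one large, one small.
grads = [[3.0, 4.0], [0.3, 0.4]]
g_pert = global_perturbation(grads, eps=1.0)
t_pert = tokenwise_perturbation(grads, eps=1.0)
# Under global normalization the small-gradient token is nearly ignored;
# under token-wise normalization both tokens get a unit-length perturbation.
```

In practice the same idea applies to the gradient of the loss with respect to each token's embedding row; the token-wise variant is what keeps any single token or sample from absorbing the whole perturbation budget.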

Key words

natural language understanding / adversarial training / initialization strategy / normalization strategy / perturbation strategy

Cite this article

CAI Kunzhao, ZENG Biqing, CHEN Pengfei. GAT: Global-Based Adversarial Training for Natural Language Understanding. Journal of Chinese Information Processing, 2023, 37(3): 27-35.


Funding

Special Project in Key Fields of Artificial Intelligence of Guangdong Universities (2019KZDZX1033); Guangdong Basic and Applied Basic Research Foundation (2021A1515011171); Guangzhou Basic Research Program, Basic and Applied Basic Research Project (202102080282)
