Abstract
Traditional text adversarial methods mainly rely on positional permutation and character substitution, which are time-consuming and yield limited effect. To address these problems, this paper proposes IGAS (Improved ant colony algorithm to Generate Adversarial Samples), an adversarial sample generation model based on an improved ant colony algorithm, which exploits the characteristics of the ant colony algorithm to generate adversarial samples and further optimizes them with glyph-similar characters. First, a group of city nodes is constructed from the words in the sample. Then, the improved ant colony algorithm generates an adversarial sample from the original input sample. Next, characters in the generated result are replaced using a constructed Chinese-Japanese glyph-similar character dictionary to produce the final adversarial sample. Finally, adversarial attack experiments are carried out in black-box mode. Experiments in sentiment classification, dialogue summary generation, and causality extraction verify the effectiveness of the method.
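The abstract outlines a three-stage pipeline: city-node construction from the sample's words, an ant colony search over substitutions under black-box queries, and a final glyph-similar character replacement. As a rough illustration only, the following Python sketch shows what such a pipeline could look like; the pheromone update rule, the candidate (city-node) construction, the `score` black-box query, and the toy `GLYPH_SIMILAR` dictionary are assumptions for this sketch, not the paper's actual IGAS implementation.

```python
import random
from typing import Callable, Dict, List

# Toy Chinese-Japanese glyph-similar pairs; the paper's dictionary is much larger.
GLYPH_SIMILAR: Dict[str, str] = {"口": "囗", "工": "エ", "力": "カ"}


def ant_colony_attack(
    tokens: List[str],
    candidates: Dict[int, List[str]],      # "city nodes": substitution options per position
    score: Callable[[List[str]], float],   # black-box query: higher = closer to a successful attack
    n_ants: int = 10,
    n_iters: int = 20,
    evaporation: float = 0.3,
) -> List[str]:
    """Search substitution positions/words with a simple ant-colony heuristic."""
    # One pheromone value per (position, candidate word).
    pheromone = {(i, w): 1.0 for i, ws in candidates.items() for w in ws}
    best, best_score = list(tokens), score(tokens)

    for _ in range(n_iters):
        for _ in range(n_ants):
            trial = list(tokens)
            path = []
            for i, ws in candidates.items():
                # Choose a candidate word proportionally to its pheromone.
                weights = [pheromone[(i, w)] for w in ws]
                w = random.choices(ws, weights=weights, k=1)[0]
                trial[i] = w
                path.append((i, w))
            s = score(trial)                      # one black-box query per ant
            if s > best_score:
                best, best_score = trial, s
            for key in path:                      # reinforce the visited choices
                pheromone[key] += max(s, 0.0)
        # Evaporate pheromone so early choices do not dominate forever.
        for key in pheromone:
            pheromone[key] *= 1.0 - evaporation
    return best


def glyph_substitute(tokens: List[str]) -> List[str]:
    """Final step: swap visually similar characters using the toy dictionary."""
    return ["".join(GLYPH_SIMILAR.get(ch, ch) for ch in tok) for tok in tokens]
```

In a black-box setting, `score` would wrap the victim model's prediction API (for example, the drop in confidence of the original label), so the search needs only model outputs, never gradients.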
Keywords
ant colony algorithm /
adversarial sample generation /
glyph-similar characters /
black-box attack
Funding
National Natural Science Foundation of China (62076006); Collaborative Innovation Project of Anhui Provincial Universities (GXXT-2021-008)