Parameter Adaptive Model Under Multi-Type Attention for Multi-label Text Classification

LI Zhiqiang, GUO Yi, WANG Zhihong

Journal of Chinese Information Processing. 2022, Vol. 36, Issue (10): 116-125.
Information Extraction and Text Mining



Abstract

Multi-label text classification assigns the most relevant subset of labels to each document from a very large label set. This paper proposes a parameter adaptive model under a multi-type attention mechanism (MSAPA) for multi-label text classification. The MSAPA model first uses multiple types of attention to extract both global and local keyword features, under a self-attention mechanism and under a label attention mechanism. It then adopts a multi-parameter adaptive strategy to dynamically assign weights to the different attention branches, learning a better text representation and improving classification accuracy. Extensive experiments on the AAPD and RCV1 benchmark datasets validate the superiority of the MSAPA model.
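The core idea described in the abstract can be sketched in code: one branch pools the document through self-attention, another attends from label embeddings to tokens, and learnable scalars are normalized into fusion weights over the branches. This is a minimal, hypothetical illustration; the function names, dimensions, pooling choices, and softmax-based weight normalization are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_repr(H):
    # H: (n_tokens, d). Scaled dot-product self-attention over tokens,
    # mean-pooled into a single document vector (a simplifying choice).
    scores = softmax(H @ H.T / np.sqrt(H.shape[1]))  # (n, n)
    attended = scores @ H                            # (n, d)
    return attended.mean(axis=0)                     # (d,)

def label_attention_repr(H, C):
    # C: (n_labels, d) label embeddings (hypothetical, randomly initialized here).
    # Each label attends over the tokens to build a label-specific document view.
    A = softmax(C @ H.T)                             # (n_labels, n)
    return A @ H                                     # (n_labels, d)

def adaptive_fusion(reps, raw_weights):
    # raw_weights: one learnable scalar per attention branch; softmax turns
    # them into normalized, positive fusion coefficients (the "parameter
    # adaptive" step, as sketched here).
    w = softmax(np.asarray(raw_weights, dtype=float))
    return sum(wi * r for wi, r in zip(w, reps)), w

rng = np.random.default_rng(0)
H = rng.standard_normal((12, 8))   # 12 tokens, embedding dim 8
C = rng.standard_normal((5, 8))    # 5 labels

m_self = self_attention_repr(H)                    # (8,)
m_label = label_attention_repr(H, C).mean(axis=0)  # pool label views -> (8,)
fused, w = adaptive_fusion([m_self, m_label], [0.3, -0.1])
```

In a trained model the raw branch weights would be parameters updated by backpropagation, so the network itself decides how much each attention type contributes to the final representation.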


Key words

multi-type attention mechanism / parameter adaptation / multi-label text classification

Cite This Article

LI Zhiqiang, GUO Yi, WANG Zhihong. Parameter Adaptive Model Under Multi-Type Attention for Multi-label Text Classification. Journal of Chinese Information Processing. 2022, 36(10): 116-125

Funding

National Key R&D Program of China (2018YFC0807105); Science and Technology Commission of Shanghai Municipality Research Program (22DZ1204903, 2251104800)