Capsule Network with Multi-scale Feature Attention for Text Classification
WANG Chaofan, JU Shenggen, SUN Jieping, CHEN Run
School of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
Abstract In recent years, Capsule Neural Networks (Capsnets) have been successfully applied to text classification. In existing studies, however, all n-gram features play equal roles in classification, and the importance of each n-gram feature in its specific context is not captured. To address this issue, this paper proposes Partially-connected Routing Capsnets with Multi-scale Feature Attention (MulPart-Capsnets), which incorporates multi-scale feature attention into Capsnets. Multi-scale feature attention automatically selects n-gram features at different scales and, through a weighted-sum rule, accurately captures rich n-gram features for each word. In addition, the dynamic routing algorithm is improved to reduce the redundant information transferred between child and parent capsules. To verify the effectiveness of the proposed model, experiments are conducted on seven well-known text-classification datasets. The results demonstrate that the proposed model consistently improves classification performance.
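To make the attention mechanism concrete, the following is a minimal sketch of multi-scale feature attention, assuming PyTorch: convolutions with different kernel sizes extract n-gram features at each scale, and a softmax over scales yields the per-word weighted sum described above. All names and shapes (MultiScaleFeatureAttention, hidden_dim, scales) are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of multi-scale feature attention (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFeatureAttention(nn.Module):
    """For each word, weight and sum n-gram features from several scales."""
    def __init__(self, embed_dim, hidden_dim, scales=(1, 2, 3)):
        super().__init__()
        # One 1-D convolution per n-gram scale; padding keeps roughly the
        # same sequence length at every scale.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, hidden_dim, kernel_size=k, padding=k // 2)
            for k in scales
        )
        # Scores one attention logit per scale at every word position.
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, x):                      # x: (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)                  # -> (batch, embed_dim, seq_len)
        # Per-scale features, cropped to the input length (even kernels pad +1).
        feats = [conv(x)[..., : x.size(2)] for conv in self.convs]
        feats = torch.stack(feats, dim=1)      # (batch, n_scales, hidden, seq)
        feats = feats.permute(0, 3, 1, 2)      # (batch, seq, n_scales, hidden)
        # Attention weights over scales, computed per word position.
        logits = self.score(feats).squeeze(-1)          # (batch, seq, n_scales)
        weights = F.softmax(logits, dim=-1).unsqueeze(-1)
        # Weighted sum fuses the scales into one feature vector per word.
        return (weights * feats).sum(dim=2)    # (batch, seq, hidden)

# Usage: the fused per-word features would then feed the capsule layers.
attn = MultiScaleFeatureAttention(embed_dim=300, hidden_dim=128)
out = attn(torch.randn(8, 40, 300))           # -> torch.Size([8, 40, 128])
```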
Received: 21 February 2021