Text Classification Based on Meta-Learning for Unbalanced Small Samples
XIONG Wei1,2, GONG Yu1
1. Computer Department, North China Electric Power University (Baoding), Baoding, Hebei 071000, China; 2. Engineering Research Center of Intelligent Computing for Complex Energy Systems, Ministry of Education, North China Electric Power University (Baoding), Baoding, Hebei 071000, China
|
|
Abstract To address the semantic and contextual transfer of text information, an improved dynamic convolutional neural network based on meta-learning and an attention mechanism is proposed. First, cross-category classification is performed using the underlying distribution features of the text, preparing the text information for transfer. Second, the attention mechanism is used to improve the traditional convolutional network and strengthen its feature-extraction ability, and balancing variables are generated from the statistics of the original dataset to reduce the impact of class imbalance. Finally, the model parameters are optimized automatically with a bi-level optimization method. Experimental results on the general text-classification dataset THUCNews show that the proposed method improves accuracy by 2.27% and 3.26% in the 1-shot and 5-shot settings, respectively, and by 3.28% and 3.01% on the IMDb dataset.
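The bi-level optimization mentioned in the abstract can be sketched as a first-order MAML-style loop: an inner gradient step adapts the parameters on each task's support set, and the outer step updates the shared meta-parameters from the query-set gradient. The linear-regression tasks, loss, and learning rates below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

def loss_grad(w, X, y):
    # Gradient of the mean-squared-error loss 0.5 * ||Xw - y||^2 / n
    n = len(y)
    return X.T @ (X @ w - y) / n

def maml_first_order(tasks, dim, inner_lr=0.1, outer_lr=0.05, epochs=200):
    """First-order MAML sketch: each task is (X_support, y_support, X_query, y_query).

    Inner loop: one gradient step on the support set (task-level adaptation).
    Outer loop: update meta-parameters with the query gradient evaluated at
    the adapted weights (first-order approximation of the meta-gradient).
    """
    w = np.zeros(dim)  # shared meta-parameters
    for _ in range(epochs):
        meta_grad = np.zeros(dim)
        for Xs, ys, Xq, yq in tasks:
            # Inner (task-level) adaptation on the support set
            w_task = w - inner_lr * loss_grad(w, Xs, ys)
            # Outer gradient from the query set at the adapted weights
            meta_grad += loss_grad(w_task, Xq, yq)
        w -= outer_lr * meta_grad / len(tasks)
    return w
```

In the full (second-order) formulation the outer gradient is differentiated through the inner update as well; the first-order variant above drops that term, which is the common approximation when second-order terms are too costly.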
|
Received: 26 July 2021
|
|
|
|
|
|
|
|