Abstract: To avoid the vanishing- and exploding-gradient problems associated with deeper layers and to better capture word-level semantic information, this paper proposes a fusion network for Chinese news text classification. First, a densely connected bidirectional GRU (Bi-GRU) learns deep semantic representations. Second, a max-pooling layer reduces the dimensionality of the key vectors. Third, a self-attention mechanism captures the more salient features. Finally, the learned representations are concatenated and fed into the classifier. Experimental results on the NLPCC2014 dataset show that the proposed fusion model outperforms the recent AC-BiLSTM model.
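To make the described pipeline concrete (densely connected Bi-GRU, max-pooling, self-attention, and concatenation before the classifier), the components could be wired together as in the following minimal PyTorch sketch. The class name DenseBiGRUFusion, all layer sizes, and the simple dot-product self-attention variant are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the described fusion network (assumptions: PyTorch,
# illustrative sizes, single-head dot-product self-attention).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBiGRUFusion(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden=64, layers=3, classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.grus = nn.ModuleList()
        in_dim = embed_dim
        for _ in range(layers):
            # Each Bi-GRU receives the embeddings plus all earlier layers'
            # outputs (DenseNet-style skip connections to ease gradient flow).
            self.grus.append(nn.GRU(in_dim, hidden, batch_first=True,
                                    bidirectional=True))
            in_dim += 2 * hidden
        feat = 2 * hidden
        # Simple dot-product self-attention over the last layer's states.
        self.att_query = nn.Linear(feat, feat)
        self.fc = nn.Linear(2 * feat, classes)  # pooled + attended, concatenated

    def forward(self, x):                      # x: (batch, seq_len) token ids
        h = self.embed(x)                      # (batch, seq, embed_dim)
        dense = h
        for gru in self.grus:
            out, _ = gru(dense)                # (batch, seq, 2*hidden)
            dense = torch.cat([dense, out], dim=-1)
        pooled = out.max(dim=1).values         # max-pooling over time
        scores = torch.bmm(self.att_query(out), out.transpose(1, 2))
        weights = F.softmax(scores.mean(dim=1), dim=-1)   # (batch, seq)
        attended = torch.bmm(weights.unsqueeze(1), out).squeeze(1)
        return self.fc(torch.cat([pooled, attended], dim=-1))
```

A quick smoke test with made-up sizes: `model = DenseBiGRUFusion(vocab_size=50000)` followed by `model(torch.randint(0, 50000, (4, 30)))` returns logits of shape (4, 10), one score per class for each of the four 30-token documents.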
References:
[1] Lai S, Xu L, Liu K, et al. Recurrent convolutional neural networks for text classification[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence. AAAI, 2015: 2267-2273.
[2] Miao F, Zhang P, Jin L, et al. Chinese news text classification based on machine learning algorithm[C]//Proceedings of the 10th International Conference on Intelligent Human-Machine Systems and Cybernetics. IEEE, 2018, 2: 48-51.
[3] Altinel B, Ganiz M C. A new hybrid semi-supervised algorithm for text classification with class-based semantics[J]. Knowledge-Based Systems, 2016, 108: 50-64.
[4] Cover T M, Thomas J A. Elements of Information Theory[M]. NJ: Wiley-Blackwell, 2012: 676-700.
[5] Cao Z, Li S, Liu Y, et al. A novel neural topic model and its supervised extension[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence. AAAI, 2015: 2210-2216.
[6] Srivastava A, Sutton C A. Autoencoding variational inference for topic models[C]//Proceedings of the International Conference on Learning Representations. Toulon, France, 2017: 1326-1338.
[7] Peng M, Xie Q, Zhang Y, et al. Neural sparse topical coding[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia, 2018: 2332-2340.
[8] Card D, Tan C, Smith N A. Neural models for documents with metadata[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia, 2018: 2031-2040.
[9] Kim Y. Convolutional neural networks for sentence classification[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1746-1751.
[10] Liu P, Qiu X, Huang X. Recurrent neural network for text classification with multi-task learning[J/OL]. arXiv preprint arXiv:1605.05101, 2016.
[11] Yang Z, Yang D, Dyer C, et al. Hierarchical attention networks for document classification[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016: 1480-1489.
[12] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3: 993-1022.
[13] Yan X, Guo J, Lan Y, et al. A biterm topic model for short texts[C]//Proceedings of the 22nd International Conference on World Wide Web. Rio de Janeiro, Brazil, 2013: 1445-1456.
[14] Lau J H, Baldwin T, Cohn T. Topically driven neural language model[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada, 2017: 355-365.
[15] Wang W, Gan Z, et al. Topic compositional neural language model[C]//Proceedings of the 21st International Conference on Artificial Intelligence and Statistics. Lanzarote, Spain, 2018: 356-365.
[16] Wu X, Chen L, Wei T, et al. Sentiment analysis of Chinese short texts based on self-attention and Bi-LSTM[J]. Journal of Chinese Information Processing, 2019, 33(6): 100-107. (In Chinese)
[17] Johnson R, Zhang T. Deep pyramid convolutional neural networks for text categorization[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada, 2017: 562-570.
[18] Conneau A, Schwenk H, Barrault L, et al. Very deep convolutional networks for text classification[C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. 2017: 1107-1116.
[19] Huang G, Liu Z, van der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2261-2269.
[20] Cho K, van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: Association for Computational Linguistics, 2014: 1724-1734.
[21] Ma X, Hovy E. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF[J/OL]. arXiv preprint arXiv:1603.01354, 2016.
[22] Luo L, Xiong Y, et al. Adaptive gradient methods with dynamic bound of learning rate[C]//Proceedings of the 7th International Conference on Learning Representations. 2019: 1-21.
[23] Yang M, Tu W, Wang J, et al. Attention-based LSTM for target-dependent sentiment classification[C]//Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, CA, USA: AAAI Press, 2017: 5013-5014.
[24] Liu G, Guo J. Bidirectional LSTM with attention mechanism and convolutional layer for text classification[J]. Neurocomputing, 2019, 337: 325-338.