Abstract
Text classification is a fundamental task in natural language processing, yet current approaches usually treat each domain independently and require abundant annotated training data. This paper alleviates the shortage of labeled training data by exploiting the similar information carried by data from different domains. We propose a multi-task learning model for multi-domain text classification: a private encoder for each domain and a shared encoder across all domains extract domain-specific and domain-invariant features, respectively, so that domain knowledge at different levels can be used to represent the text and support classification. In addition, an orthogonal projection operation further separates the shared features from the domain-private features, purifying the shared representation, and a gate mechanism then recombines and fuses the shared and private features. We evaluate the proposed model on two widely used multi-domain text classification datasets, Amazon and FDU-MTL. Experimental results show average classification accuracies of 86.04% on Amazon and 89.2% on FDU-MTL, a clear improvement over multiple baseline models.
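The paper's code is not reproduced here, so the following is a minimal PyTorch sketch, under our own assumptions, of the two operations the abstract describes: an orthogonal projection that removes from the shared feature its component along the private feature, and a sigmoid gate that fuses the purified shared feature with the private one. The module name SharedPrivateFusion and the single linear gating layer are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SharedPrivateFusion(nn.Module):
    """Orthogonal projection plus gated fusion of shared and private features.

    The projection subtracts from the shared feature its component along the
    private feature, pushing the two feature spaces apart; a sigmoid gate then
    mixes the purified shared feature with the private one per dimension.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
        # s_pure = s - (<s, p> / <p, p>) * p : the part of s orthogonal to p
        coeff = (shared * private).sum(-1, keepdim=True) / (
            (private * private).sum(-1, keepdim=True) + 1e-8)
        shared_pure = shared - coeff * private
        # Gate decides, per dimension, how much shared vs. private to keep.
        g = torch.sigmoid(self.gate(torch.cat([shared_pure, private], dim=-1)))
        return g * shared_pure + (1 - g) * private

# Toy usage: a batch of 4 examples with 128-dim shared and private vectors.
fusion = SharedPrivateFusion(128)
s, p = torch.randn(4, 128), torch.randn(4, 128)
out = fusion(s, p)  # (4, 128) fused representation
print(out.shape)
```

In the full model described by the abstract, shared would come from the encoder shared across all domains and private from the corresponding domain's private encoder; the fused vector would then feed that domain's text classifier.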
Keywords: text classification / multi-domain / feature refinement / multi-task learning
Funding
National Natural Science Foundation of China (62176187); National Key Research and Development Program of China (2017YFC1200500); Ministry of Education Fund of China (18JZD015); Humanities and Social Sciences Youth Fund of the Ministry of Education of China (22YJCZH064); Natural Science Foundation of Hubei Province (2021CFB385)