Chinese Short Text Classification Based on Prompt Learning

MU Jianyuan, ZHU Yi, ZHOU Xinke, LI Yun, QIANG Jipeng, YUAN Yunhao

Journal of Chinese Information Processing, 2023, Vol. 37, Issue (7): 82-90.
Information Extraction and Text Mining


Abstract

With the rapid development of the Internet, massive volumes of short texts of fewer than 100 characters, typified by posts on Weibo and Twitter, are being produced. The extreme brevity, sparse features, and limited semantics of these texts make short text classification highly challenging. Existing Chinese short text classification methods typically demand large amounts of labeled or unlabeled training data, which in practical applications are often difficult and costly to obtain. This paper therefore proposes a Chinese short text classification method based on prompt learning, suited to few-shot scenarios. Experimental results show that, trained on only a few samples, the method outperforms other models trained on large amounts of data. Specifically, we manually design templates that rewrite the original input into text containing mask tokens, which serves as the new model input, and this yields strong classification performance. On four benchmark datasets, the prompt-learning-based method with only 40 training samples achieves nearly 6% higher accuracy than the pre-trained language model BERT fine-tuned on 740 samples.
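The abstract describes the prompt-learning mechanism only at a high level. Below is a minimal sketch of how such template-based cloze classification can be implemented, assuming the Hugging Face Transformers library with a bert-base-chinese checkpoint; the template wording, the news-topic label words, and the classify helper are illustrative assumptions rather than the paper's actual design, and the paper's few-shot fine-tuning step (training the same masked-language-model head on the 40 labeled examples) is omitted, so the sketch shows only zero-shot scoring.

```python
# Sketch of template-based cloze classification with a Chinese BERT.
# Assumptions (not from the paper): bert-base-chinese checkpoint, a
# hand-written news-topic template, and two-character label words.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")
model.eval()

# Hypothetical verbalizer: class name -> label word (all two characters,
# so each occupies exactly two [MASK] slots in the template).
label_words = {"sports": "体育", "finance": "财经", "education": "教育"}
num_mask = 2

def classify(text: str) -> str:
    # Wrap the input in a cloze-style template; the [MASK] slots stand
    # in for the label word the model is asked to fill.
    prompt = f"这是一条关于{'[MASK]' * num_mask}的新闻。{text}"
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=128)
    mask_pos = (inputs["input_ids"][0]
                == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

    with torch.no_grad():
        logits = model(**inputs).logits[0]      # (seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)

    # Score each class by summing the log-probabilities of its label
    # word's characters at the consecutive [MASK] positions.
    scores = {}
    for cls, word in label_words.items():
        ids = tokenizer.convert_tokens_to_ids(list(word))
        scores[cls] = sum(log_probs[p, i].item()
                          for p, i in zip(mask_pos, ids))
    return max(scores, key=scores.get)

print(classify("国足世预赛名单公布"))  # expected: sports
```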


Keywords

short text classification / prompt learning / few-shot

Cite This Article

MU Jianyuan, ZHU Yi, ZHOU Xinke, LI Yun, QIANG Jipeng, YUAN Yunhao. Chinese Short Text Classification Based on Prompt Learning. Journal of Chinese Information Processing. 2023, 37(7): 82-90


Funding

National Natural Science Foundation of China (61906060, 62076217)