
Prompt-based Data Augmentation and Dual-enhanced Semantic Contrastive Learning for Few-shot Classification

  • Abstract: To address overfitting and poor generalization caused by insufficient data in few-shot learning, this paper proposes PAD, a framework based on data augmentation, multi-template prompt learning, and dual semantic contrast. The framework combines back-translation and paraphrasing, and uses ChatGPT's language understanding capability to generate new samples. It also adds demonstrations to the input text and sets up multi-template prompt learning tasks to elicit the latent knowledge in pre-trained language models and thereby improve generalization. In addition, the paper proposes a dual-enhanced semantic contrastive strategy, DESC, which attends to positive and negative sample pairs separately and optimizes them with two loss functions to improve the model's ability to distinguish between samples. Experimental results show that PAD achieves good classification performance on multiple few-shot datasets, indicating that combining multi-template prompt-based data augmentation with the dual semantic contrastive strategy effectively alleviates the few-shot problem and significantly improves generalization.
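The step of adding demonstrations to the input and rendering it under several templates can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two template strings, the label words, the `[MASK]` placeholder, and the helper names `fill` and `build_prompts` are all hypothetical.

```python
# Hypothetical templates and label words, chosen only for illustration.
TEMPLATES = [
    "{text} It was [MASK].",
    "{text} All in all, it was [MASK].",
]
LABEL_WORDS = {"positive": "great", "negative": "terrible"}


def fill(template, text, word="[MASK]"):
    """Render one template; passing a label word replaces the mask token."""
    return template.format(text=text).replace("[MASK]", word)


def build_prompts(query, demonstrations):
    """Return one prompt per template: the masked query, then labeled demonstrations."""
    prompts = []
    for template in TEMPLATES:
        # The query keeps its mask so a language model can predict the label word.
        parts = [fill(template, query)]
        # Each demonstration is rendered with the same template and its known label word.
        for demo_text, demo_label in demonstrations:
            parts.append(fill(template, demo_text, LABEL_WORDS[demo_label]))
        prompts.append(" ".join(parts))
    return prompts


demos = [("A moving story.", "positive"), ("Dull and slow.", "negative")]
prompts = build_prompts("The acting is superb.", demos)
for p in prompts:
    print(p)
```

Each template yields a separate view of the same input, so a masked language model could be trained or queried on all of them. How PAD actually combines the per-template predictions is not stated in the abstract and is left out of this sketch.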

     

