Abstract
Few-shot text classification faces two main problems: (1) training samples are scarce, so models overfit easily; and (2) under the episodic task formulation of meta-learning, supervision becomes even sparser. Graph neural networks (GNNs) have recently proved effective for few-shot learning by modeling the full context embedding of samples. When transferred to few-shot text classification, however, they tend to suffer from over-smoothing, because text is noisy and its features are easily confused. This paper proposes a dual-channel graph neural network that models the full-context features of samples while making full use of the label propagation mechanism. The two channels share the information propagation matrix, a multi-task parameter-sharing mechanism that lets supervision effectively constrain the graph iteration process. Compared with the baseline graph neural network, the method improves accuracy by 1.51% on average on the FewRel dataset and by 11.1% on the ARSC dataset.
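The abstract does not give the update equations, so the following is only a minimal sketch of the general idea of two channels driven by one shared propagation matrix. Everything in it is an assumption made for illustration: the Gaussian similarity graph, the row normalization, the 0.5/0.5 residual averaging in the feature channel, the `alpha` mixing weight, and the function names (`similarity_matrix`, `dual_channel_step`) are hypothetical and are not taken from the paper.

```python
import math


def similarity_matrix(feats, sigma=1.0):
    """Gaussian similarity between every pair of samples (no self-loops)."""
    n = len(feats)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(feats[i], feats[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w


def row_normalize(w):
    """Scale each row to sum to 1 (rows of zeros are left unchanged)."""
    out = []
    for row in w:
        total = sum(row) or 1.0
        out.append([v / total for v in row])
    return out


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def dual_channel_step(feats, labels, seed_labels, alpha=0.5):
    """One iteration in which both channels reuse the same matrix S."""
    s = row_normalize(similarity_matrix(feats))  # shared propagation matrix

    # Feature channel (assumed form): average each node with its neighbours.
    agg = matmul(s, feats)
    new_feats = [[0.5 * f + 0.5 * a for f, a in zip(f_row, a_row)]
                 for f_row, a_row in zip(feats, agg)]

    # Label channel (assumed form): propagate labels, then pull back
    # toward the known support labels (all-zero rows for query samples).
    prop = matmul(s, labels)
    new_labels = [[alpha * p + (1 - alpha) * y for p, y in zip(p_row, y_row)]
                  for p_row, y_row in zip(prop, seed_labels)]
    return new_feats, new_labels


# Toy 2-way 1-shot episode: two support samples and one query sample.
feats = [[0.0, 0.0], [4.0, 4.0], [0.5, 0.2]]
seed = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
new_feats, new_labels = dual_channel_step(feats, seed, seed)
print(new_labels[2])  # label scores for the query sample
```

In this toy episode the query lies close to the first support sample, so its propagated score for the first class ends up larger than for the second. Because the label channel and the feature channel read the same matrix, any loss placed on the propagated labels would also shape how features are aggregated, which is one plausible reading of how supervision could constrain the graph iteration.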
Key words
few-shot learning /
graph neural network /
text classification
Funding
National Key Research and Development Program of China (2018YFC0831103)