1. School of Information Science, Beijing Language and Culture University, Beijing 100083, China; 2. National Language Resources Monitoring and Research Center for Print Media, Beijing Language and Culture University, Beijing 100083, China
As an important part of constructing structured knowledge, relation classification has attracted much attention in natural language processing. However, in many application domains (e.g., medicine and finance), it is difficult to collect enough data to train a relation classification model. In recent years, few-shot learning, which relies on only a small number of training samples, has emerged across many fields. This paper systematically reviews recent models and methods for few-shot relation classification. According to the metric used, existing methods are divided into prototype-based and distribution-based ones; according to whether additional information is used, models are divided into two categories: pretraining and non-pretraining. Beyond the standard few-shot setting, we also summarize cross-domain few-shot learning and few-few-shot learning, discuss the limitations of current few-shot relation classification methods, and analyze the technical challenges faced by cross-domain few-shot models. Finally, we outline prospects for the future development of few-shot relation classification.
HU Han, LIU Pengyuan. Few-Shot Relation Classification: A Survey. Journal of Chinese Information Processing, 2022, 36(2): 1-11.
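To make the prototype-based metric methods mentioned in the abstract concrete, the following is a minimal sketch, in PyTorch, of how a single N-way K-shot episode is classified in the spirit of prototypical networks: each relation's prototype is the mean of its support-sentence encodings, and each query is assigned to the relation whose prototype is nearest. The sentence encoder is abstracted away, and all tensor sizes and the function name are illustrative assumptions, not taken from the surveyed papers.

```python
# Minimal sketch of prototype-based few-shot relation classification
# (in the spirit of prototypical networks; all sizes are illustrative).
import torch
import torch.nn.functional as F

def prototypical_episode(support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """support: (N, K, D) encoded support sentences for N relations, K shots each.
    query:   (Q, D) encoded query sentences.
    Returns  (Q, N) log-probabilities over the N candidate relations."""
    prototypes = support.mean(dim=1)          # (N, D): per-relation mean embedding
    dists = torch.cdist(query, prototypes)    # (Q, N): Euclidean distance to each prototype
    return F.log_softmax(-dists, dim=1)       # closer prototype -> higher probability

# Toy 5-way 1-shot episode; random vectors stand in for sentence encodings
# that a real system would obtain from, e.g., a CNN or BERT encoder.
N, K, Q, D = 5, 1, 3, 768
support = torch.randn(N, K, D)
query = torch.randn(Q, D)
predictions = prototypical_episode(support, query).argmax(dim=1)  # relation index per query
```

Distribution-based methods differ mainly in this last step: instead of a single mean vector per relation, they compare queries against a learned representation of each class's distribution.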