A Survey of Prompt Learning Combined with Knowledge

BAO Chenlong, LYU Mingyang, TANG Jintao, LI Shasha, WANG Ting
Journal of Chinese Information Processing, 2023, Vol. 37, Issue 7: 1-12.
Survey


Abstract

In recent years, prompt learning methods have attracted increasing attention from researchers because they can fully elicit the potential of pre-trained language models, and they have made notable progress in knowledge extraction tasks in particular. To improve the performance of prompt learning, researchers have also pursued knowledge-based optimizations such as template engineering and answer engineering. This paper systematically reviews research that combines prompt learning with knowledge, covering both prompt learning methods for knowledge extraction and recent advances in knowledge-constrained prompt learning. On this basis, it discusses the limitations of current methods and outlines future directions for combining prompt learning with knowledge.
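The two optimizations named above can be illustrated with a minimal sketch: template engineering wraps the input in a cloze-style prompt, and answer engineering uses a verbalizer that maps label words back to task labels. All function names, labels, and label words below are hypothetical illustrations, not any specific method from the surveyed papers; expanding the verbalizer's word sets with related terms from a knowledge base (e.g. WordNet synonyms) is one knowledge-combination strategy the survey covers, and a real system would score the `[MASK]` fillers with a pre-trained masked language model.

```python
from typing import Optional

def build_prompt(sentence: str, head: str, tail: str) -> str:
    """Template engineering: a cloze-style prompt for relation classification.

    The [MASK] slot is where a masked language model would predict a label word.
    """
    return f"{sentence} The relation between {head} and {tail} is [MASK]."

# Answer engineering: each relation label is verbalized by a set of label
# words. The extra words per label stand in for knowledge-base expansion
# (e.g. synonyms); the labels and words here are illustrative only.
VERBALIZER = {
    "founder": {"founder", "creator", "establisher"},
    "birthplace": {"birthplace", "hometown", "origin"},
}

def verbalize(predicted_word: str) -> Optional[str]:
    """Map a predicted [MASK] filler back to a task label, or None if unmapped."""
    for label, words in VERBALIZER.items():
        if predicted_word.lower() in words:
            return label
    return None
```

For example, `build_prompt("Steve Jobs started Apple.", "Steve Jobs", "Apple")` yields a cloze sentence ending in `[MASK]`, and `verbalize("creator")` resolves to `"founder"` even though "creator" is not itself a label name, which is the point of widening the verbalizer with external knowledge.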

Keywords

prompt learning / knowledge extraction / knowledge-based prompt

Cite this article

BAO Chenlong, LYU Mingyang, TANG Jintao, LI Shasha, WANG Ting. A Survey of Prompt Learning Combined with Knowledge. Journal of Chinese Information Processing, 2023, 37(7): 1-12.


Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0110700); Natural Science Foundation of Hunan Province (2022JJ30668)
