A Few-shot Learning Approach to Intent Recognition and Slot Filling
(基于小样本学习的意图识别与槽位填充方法)

SUN Xianghui1, MIAO Deqiang1, DOU Chenxiao1, YUAN Long2, MA Baochang1, DENG Yong1, ZHANG Lulu1, LI Xiangang1

Journal of Chinese Information Processing, 2023, Vol. 37, Issue (2): 119-128.
Information Extraction and Text Mining

Abstract

Intent recognition and slot filling are two core tasks in intelligent human-computer interaction that have received extensive attention from both academia and industry. Current mainstream state-of-the-art methods already achieve good results on public academic datasets, but most of them rely on abundant annotated data for training; such datasets have to be constructed through manual collection and annotation, and their label distribution needs to be reasonably balanced. Data in real business scenarios rarely meet these requirements and typically pose a few-shot learning problem, on which most mainstream methods perform far worse than they do with large training sets. To address this practical difficulty, this paper proposes a cascaded intent-recognition and slot-filling method based on semi-supervised learning and transfer learning, with a dedicated few-shot strategy designed for each task. For intent recognition, semi-supervised learning exploits unlabeled data to enrich and construct the training set without introducing large amounts of annotated data, improving few-shot intent recognition accuracy. For slot filling, transfer learning transfers prior knowledge learned from large-sample data to the few-shot model, exploiting the knowledge shared between the large-sample and few-shot data to improve slot-filling precision. Comparative experiments confirm the effectiveness of the proposed method, which won first place in the "Intelligent Human-Computer Interaction Natural Language Understanding" track of the National Information Retrieval Challenge Cup (CCIR Cup), jointly organized in 2021 by the organizing committee of the CCF Big Data & Computing Intelligence Contest (CCF-BDCI) and the Chinese Information Processing Society of China (CIPS).

Keywords

few-shot learning / semi-supervised learning / transfer learning

Cite this article

SUN Xianghui, MIAO Deqiang, DOU Chenxiao, YUAN Long, MA Baochang, DENG Yong, ZHANG Lulu, LI Xiangang. A Few-shot Learning Approach to Intent Recognition and Slot Filling. Journal of Chinese Information Processing, 2023, 37(2): 119-128.

Funding

National Natural Science Foundation of China (61902184); Natural Science Foundation of Jiangsu Province (BK20190453)