Product Feature Mining in Restricted Domain Spoken Dialogue System

YE Dashu; HUANG Peijie; DENG Zhenpeng; HUANG Qiang

Journal of Chinese Information Processing, 2016, Vol. 30, Issue 6: 67-74.
Review


Abstract

By function or problem domain, product feature mining in a restricted-domain dialogue system falls within spoken language understanding (SLU). The task attends only to the specific parts of natural text that describe product features, and it is an important subtask of fine-grained opinion mining. Existing product feature mining techniques are built mainly on product review corpora. Set against a mobile phone shopping-guide dialogue system, this paper applies product feature mining to the entire dialogue process, making the system's responses more targeted. word2vector (W2V), based on the CBOW (continuous bag of words) language model, is used to model the semantic level of words. A feature representation framework with an exponential variable-length static window, designed for spoken dialogue, is proposed to capture the important features of word combinations at different distances. A convolutional neural network (CNN) then combines the semantic and contextual levels of words to extract product features from the spoken dialogue corpus. The word embedding model provides evidence of whether the current word is related to a given feature category, while the proposed feature representation framework addresses word ambiguity. Experimental results show that the method achieves better product feature recognition than state-of-the-art methods.
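The abstract does not spell out how the exponential variable-length window is built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the context half-widths double (1, 2, 4, ...), that each window is summarized by averaging the word vectors it covers, and that the stacked rows would then serve as input to a CNN. The function names, the toy two-dimensional embeddings, and the example tokens are all hypothetical.

```python
from typing import Dict, List, Sequence


def window_sizes(num_scales: int) -> List[int]:
    """Half-widths 1, 2, 4, ... (the doubling schedule is an assumption)."""
    return [2 ** k for k in range(num_scales)]


def mean_vector(vectors: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Element-wise mean of the given vectors; a zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def window_features(
    tokens: Sequence[str],
    index: int,
    embeddings: Dict[str, List[float]],
    dim: int,
    num_scales: int = 3,
) -> List[List[float]]:
    """Build one row for the current word plus one row per window scale.

    Row 0 is the embedding of the current word. Each later row is the mean
    embedding of the neighbours within an exponentially growing half-width,
    clipped at the utterance boundaries. Unknown words map to a zero vector.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tokens[index], zero))]
    for half in window_sizes(num_scales):
        lo = max(0, index - half)
        hi = min(len(tokens), index + half + 1)
        context = [
            embeddings.get(tokens[j], zero) for j in range(lo, hi) if j != index
        ]
        rows.append(mean_vector(context, dim))
    return rows


# Toy embeddings standing in for CBOW vectors (hypothetical values).
toy_embeddings = {
    "this": [1.0, 0.0],
    "phone": [0.0, 1.0],
    "screen": [1.0, 1.0],
    "is": [0.0, 0.0],
    "big": [2.0, 2.0],
}
utterance = ["this", "phone", "screen", "is", "big"]
features = window_features(utterance, 2, toy_embeddings, dim=2)
```

Here `features` has four rows for the word "screen": its own vector, followed by context summaries at half-widths 1, 2, and 4. In a fuller pipeline a matrix of this kind would be passed to a convolutional layer, and the toy vectors would be replaced by embeddings trained with a CBOW model.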

Key words

product feature mining / word2vector / CNN / feature representation / spoken dialogue system

Cite this article

YE Dashu; HUANG Peijie; DENG Zhenpeng; HUANG Qiang. Product Feature Mining in Restricted Domain Spoken Dialogue System. Journal of Chinese Information Processing. 2016, 30(6): 67-74


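Funding

National Natural Science Foundation of China (71472068); Guangdong Province Special Project for Cultivating College Students' Science and Technology Innovation (pdjh2016b0087)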