Abstract
The diversity and flexibility of spoken Chinese questions make question understanding difficult for dialogue systems. To address this, the paper adopts the design principle of “acquiring semantic knowledge on top of grammatical structure” and proposes a question understanding method for Chinese spoken dialogue systems that combines grammar and semantics. First, a grammar knowledge base that is independent of domain and application is compiled by hand. A sentence compression module then simplifies complex utterances to expose their structure, and question type pattern recognition identifies the single question type pattern that determines the semantic organization method, query strategy, and response mode for the question. In parallel, the relevant semantic information is extracted from the source utterance according to a domain semantic knowledge base and is organized using the knowledge organization method associated with the recognized question type pattern, which completes the understanding of the question. The method has been applied in a Chinese dialogue system for mobile phone shopping guidance. Test results show that it can effectively handle the understanding of user questions within the dialogue flow.
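The flow summarized in the abstract (compress the utterance, recognize one question type pattern, extract domain semantics, then organize them under that pattern) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the lexicon entries, the filler list, the pattern names, and the regular expressions are all hypothetical, and plain substring matching stands in for the word segmentation and grammar knowledge base a real system would use.

```python
import re

# Hypothetical domain semantic lexicon: surface word -> (slot, value).
DOMAIN_LEXICON = {
    "三星": ("brand", "Samsung"),
    "苹果": ("brand", "Apple"),
    "价格": ("attribute", "price"),
    "屏幕": ("attribute", "screen"),
}

# Hypothetical filler expressions removed during sentence compression.
FILLERS = ["请问", "我想知道", "一下", "啊", "呢", "吧"]

# Hypothetical domain-independent question type patterns, matched against
# the compressed skeleton in which domain words are replaced by slot tags.
QUESTION_PATTERNS = [
    ("ask_value", re.compile(r"<brand>.*的<attribute>是?(多少|怎么样)")),
    ("yes_no", re.compile(r"有(没有)?<brand>")),
]


def compress(utterance):
    """Drop fillers and abstract domain words into slot tags."""
    skeleton = utterance
    for filler in FILLERS:
        skeleton = skeleton.replace(filler, "")
    for word, (slot, _value) in DOMAIN_LEXICON.items():
        skeleton = skeleton.replace(word, "<" + slot + ">")
    return skeleton


def recognize_pattern(skeleton):
    """Return the name of the first matching question type pattern, or None."""
    for name, regex in QUESTION_PATTERNS:
        if regex.search(skeleton):
            return name
    return None


def extract_semantics(utterance):
    """Collect slot values for every lexicon word found in the source utterance."""
    slots = {}
    for word, (slot, value) in DOMAIN_LEXICON.items():
        if word in utterance:
            slots[slot] = value
    return slots


def understand(utterance):
    """Combine the structural pattern with the extracted semantic slots."""
    return {
        "pattern": recognize_pattern(compress(utterance)),
        "slots": extract_semantics(utterance),
    }


if __name__ == "__main__":
    # "Excuse me, how much is the Samsung phone?"
    print(understand("请问三星手机的价格是多少啊"))
```

In this sketch the pattern is recognized on the compressed skeleton while the slot values are taken from the original utterance, mirroring the separation between structure and domain semantics that the abstract describes.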
Key words
question understanding /
dialogue system /
question type pattern /
Chinese
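Funding
Guangdong Province Undergraduate Innovation Training Program (1056412151, 1056413096, 201410564290); Guangdong Province Science and Technology Planning Project (2012A020602012)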