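Semantic understanding of natural language involves problems at several levels, including the basic propositional meaning centered on predicates, conceptual meaning beyond the propositional meaning, and supplementary logical meaning. Current mainstream shallow semantic analysis concentrates on propositional meaning and lacks support for conceptual and logical meaning, so it can hardly help computers understand and reason about text in depth. Drawing on argument structure theory, event semantics, and related linguistic theories, this paper moves beyond the limitations of shallow semantic analysis such as semantic role labeling and establishes a deep semantic description system for Chinese that integrates concepts and logic. On the basis of this system, and using a layer-by-layer rendering annotation strategy, it builds a large-scale Chinese deep semantic annotated corpus from real text, and verifies the completeness and coverage of the description system through this language-engineering practice. The theoretical system and the language resource are expected to advance Chinese automatic semantic analysis and related work in artificial intelligence.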
Abstract
Natural language understanding involves multiple categories of meaning, including propositions, modality, and temporal logic. Mainstream shallow semantic analysis focuses on propositional meaning. Without support for conceptual meaning and deep logical meaning, it can hardly assist computers in deep understanding of, and reasoning about, text. Drawing on argument structure theory, event semantics, and construction grammar, this paper moves beyond the limitations of shallow semantic analysis (e.g., semantic role labeling) and establishes a deep semantic representation system for concepts and logic. Using a layered rendering annotation strategy, a large-scale Chinese deep semantic annotated corpus is constructed, which also serves to verify the completeness and coverage of the description system in practice. The theoretical system and the language resources are expected to promote innovation in Chinese automatic semantic analysis and artificial intelligence.
Key words
Chinese semantics /
meaning representation /
resource construction
Funding
National Basic Research Program of China (2014CB340504); National Natural Science Foundation of China (61751201)