Abstract
HowNet is a foundational achievement of knowledge base research in China. It provides a bilingual Chinese-English knowledge representation and has performed well in tasks such as semantic similarity computation and vector representation. However, existing research has rarely examined the soundness of HowNet's common knowledge system itself or its cross-linguistic adaptability. Tibetan, an ergative-absolutive language, differs substantially from both Chinese and English and therefore offers a useful test of that system. Drawing on concrete Tibetan examples, this paper analyzes the semantic function of Tibetan case markers, the controllable-uncontrollable relationship of verbs, and the semantic classification of verbs, and shows that the cross-linguistic adaptability of HowNet's common knowledge system needs improvement. It proposes combining prototype theory with the Tibetan controllable-uncontrollable relationship to make the common knowledge system more rigorous, and revises the architecture data of the HowNet knowledge system accordingly.
Key words
Tibetan / HowNet / common knowledge system
Funding
State Language Commission project (ZDI145-61)