A basic approach to measuring semantic similarity or distance between words and concepts is to use a lexical taxonomy such as WordNet. HowNet is a Chinese semantic dictionary that contains abundant semantic information and ontological knowledge, but its construction and architecture are quite different from those of WordNet. In this paper, we present a new approach that uses HowNet and draws on ideas from information theory. We propose that the more semantic information a “sememe” carries, the more powerful it is in describing concepts. We then divide the sememes that describe a concept into two sets: a directly describing part and an indirectly describing part. Experiments show that our method improves performance in measuring semantic similarity between Chinese words.
LI Feng, LI Fang.
An New Approach Measuring Semantic Similarity in Hownet 2000. Journal of Chinese Information Processing. 2007, 21(3): 99-105