Concept Acquisition Based on Chinese Classifier Words

WANG Meng, YU Shiwen

Journal of Chinese Information Processing, 2014, Vol. 28, Issue 5: 60-65.
Lexical, Syntactic, and Semantic Analysis and Applications


Abstract

Concept acquisition from corpora is an important research topic in natural language understanding. This paper presents a representation of noun concepts based on Chinese classifier words: each concept is modeled as a vector with one component per classifier word, and a weighting scheme assigns each classifier word a weight within the concept. Clustering experiments examine how much classifier words contribute to distinguishing noun semantics. The results show that the classifier-based representation is effective and can distinguish most noun concept classes.
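The representation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula or the clustering algorithm, so everything specific here is an assumption: the co-occurrence counts in `COUNTS` are invented toy data, the weight is a TF-IDF-style score chosen only for illustration, and a cosine nearest-neighbour lookup stands in for the clustering step. The names `weight_vectors`, `cosine`, and `nearest_neighbor` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical classifier-noun co-occurrence counts (toy data, not from the paper).
# Classifiers are given in pinyin: pi (匹), zhi (只), ben (本), bu (部), ge (个).
COUNTS = {
    "horse": {"pi": 8, "ge": 1},
    "mule":  {"pi": 5, "zhi": 2},
    "book":  {"ben": 9, "ge": 1},
    "novel": {"ben": 6, "bu": 3},
    "film":  {"bu": 7, "ge": 2},
}


def weight_vectors(counts):
    """Build one weighted vector per noun, one component per classifier.

    Assumed scheme (not the paper's): relative frequency of the classifier
    with the noun, scaled down for classifiers that combine with many nouns.
    """
    n_nouns = len(counts)
    # Number of distinct nouns each classifier occurs with.
    noun_freq = defaultdict(int)
    for vec in counts.values():
        for clf in vec:
            noun_freq[clf] += 1
    weighted = {}
    for noun, vec in counts.items():
        total = sum(vec.values())
        weighted[noun] = {
            clf: (c / total) * math.log(1 + n_nouns / noun_freq[clf])
            for clf, c in vec.items()
        }
    return weighted


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbor(noun, vectors):
    """Return the other noun whose classifier vector is most similar."""
    others = (n for n in vectors if n != noun)
    return max(others, key=lambda n: cosine(vectors[noun], vectors[n]))


vectors = weight_vectors(COUNTS)
for noun in vectors:
    print(noun, "->", nearest_neighbor(noun, vectors))
```

In this toy data, nouns that share a specific classifier (such as `pi` for "horse" and "mule") end up close together, while the general classifier `ge` contributes little because it combines with many nouns. A full experiment would replace the nearest-neighbour lookup with a proper clustering algorithm over the same vectors.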

Key words

Concept acquisition / classifier-noun collocation / classifier words / clustering

Cite this article

WANG Meng, YU Shiwen. Concept Acquisition Based on Chinese Classifier Words. Journal of Chinese Information Processing. 2014, 28(5): 60-65


Funding

National Natural Science Foundation of China (No. 61300152)