A Preliminary Study of Word Segmentation Methods for Uighur

Gulila Adongbieke (古丽拉·阿东别克), Mijit Ablimit (米吉提·阿布力米提)

Journal of Chinese Information Processing (中文信息学报), 2004, Vol. 18, Issue 6: 62-66.

Research on Uighur Word Segmentation

  • Gulila Adongbieke,Mijit Ablimit

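Abstract (translated from the Chinese)

Rules for stem-affix segmentation and syllable segmentation of Uighur words make Uighur natural language processing considerably easier. This paper proposes the structure "word = root + affixes". Uighur affixes are numerous and attach in many different ways; they play a very important role in the sentence, yet they are also quite regular. The paper proposes methods for handling the basic phonetic rules that can occur in Uighur, such as phonetic assimilation, syllable segmentation, and phonetic harmony. It summarizes the morphological and phonological structure of Uighur words and proposes some rules for Uighur word segmentation, together with methods for implementing them. Tested on a corpus drawn from Xinjiang university journals, the accuracy on regular words reaches 95%.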

Abstract

Root-affix and syllable segmentation of Uighur words greatly facilitate Uighur natural language processing. Uighur affixes are varied, and they attach to one another and to a root in different ways, but their attachment follows intricate rules. In this paper, we propose methods for handling the basic phonetic features of Uighur words, such as final vowel change, vowel and consonant harmony, and syllable segmentation. We also summarize the word structures and phonetic structures of Uighur, and propose some rules for Uighur word segmentation together with an implementation. Applied to regular words from scientific publications in Xinjiang, these rules achieve an accuracy of 95%.
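
The "word = root + affixes" structure named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Latin-script transliteration, the tiny suffix inventory (the plural -lar/-ler and the locative -da/-de/-ta/-te), the two-word lexicon, and the function name `segment`. It is not the rule set or the implementation evaluated in the paper, and it does not check vowel or consonant harmony.

```python
# Toy illustration only: the suffix inventory and the lexicon below are
# assumptions made for this sketch, not the rule set used in the paper.
SUFFIXES = ("lar", "ler", "da", "de", "ta", "te")
ROOTS = {"kitab", "mektep"}


def segment(word, roots=ROOTS, suffixes=SUFFIXES):
    """Split `word` into [root, affix, ...] by peeling suffixes off the right.

    Returns None when no analysis ends in a known root.
    """
    if word in roots:
        return [word]
    for suffix in suffixes:
        # Require a non-empty remainder so a bare suffix is never a root.
        if word.endswith(suffix) and len(word) > len(suffix):
            analysis = segment(word[: -len(suffix)], roots, suffixes)
            if analysis is not None:
                return analysis + [suffix]
    return None


print(segment("kitablar"))     # ['kitab', 'lar']
print(segment("mekteplerde"))  # ['mektep', 'ler', 'de']
print(segment("xyz"))          # None
```

The recursion backtracks when a stripped suffix leaves a remainder that cannot be analysed, so a word is accepted only if the whole chain ends in a lexicon root. A realistic system would also need the phonetic rules the abstract lists, which this sketch leaves out.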

Key words (translated from the Chinese)

artificial intelligence / natural language processing / Uighur / stem / affix / segmentation

Key words

artificial intelligence / natural language processing / Uighur / word segmentation / root / affix / segmentation

Cite this article

古丽拉·阿东别克,米吉提·阿布力米提. 维吾尔语词切分方法初探. 中文信息学报. 2004, 18(6): 62-66
Gulila Adongbieke, Mijit Ablimit. Research on Uighur Word Segmentation. Journal of Chinese Information Processing. 2004, 18(6): 62-66

Funding

Supported by the National Natural Science Foundation of China (69963002)