Azragul, Nurahmat, Yusup Abaydula. Research on Key Technology for Statistics of Modern Uyghur Language[J]. Journal of Chinese Information Processing, 2014, 28(5): 192-197.
Research on Key Technology for Statistics of Modern Uyghur Language
Azragul, Nurahmat, Yusup Abaydula
School of Computer Science and Technology, Xinjiang Normal University, Urumqi, Xinjiang 830054, China
Abstract: This paper studies key technologies for constructing a modern Uyghur corpus: corpus collection, pre-processing, statistical techniques, stemming, and data analysis. To develop a candidate list of common modern Uyghur words, the paper examines words from two aspects, frequency and distribution, specifically the number of word types, word frequency, frequency rate, document coverage, and word length.
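The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the definitions used here (frequency rate as a word's share of all tokens, document coverage as the share of documents containing the word), the function names `word_statistics` and `candidate_common_words`, the thresholds, and the toy tokens are all assumptions made for illustration. The sketch also assumes the text has already been tokenized and stemmed.

```python
from collections import Counter


def word_statistics(documents):
    """Compute per-word statistics for a tokenized corpus.

    documents: list of token lists, one list per document.
    The definitions below are assumed, not taken from the paper.
    """
    term_freq = Counter()  # total occurrences of each word in the corpus
    doc_freq = Counter()   # number of documents containing each word
    for tokens in documents:
        term_freq.update(tokens)
        doc_freq.update(set(tokens))  # count each word once per document

    total_tokens = sum(term_freq.values())
    n_docs = len(documents)

    stats = {}
    for word, freq in term_freq.items():
        stats[word] = {
            "frequency": freq,
            "frequency_rate": freq / total_tokens,
            "document_coverage": doc_freq[word] / n_docs,
            "word_length": len(word),  # length in characters
        }
    return stats


def candidate_common_words(stats, min_rate, min_coverage):
    """Select words that pass both the frequency and the distribution
    threshold, ordered by descending frequency, then alphabetically."""
    selected = [
        word for word, s in stats.items()
        if s["frequency_rate"] >= min_rate
        and s["document_coverage"] >= min_coverage
    ]
    return sorted(selected, key=lambda w: (-stats[w]["frequency"], w))


# Toy corpus with placeholder tokens (hypothetical data, not Uyghur text).
docs = [
    ["alpha", "beta", "alpha"],
    ["alpha", "gamma"],
    ["beta", "gamma", "gamma", "delta"],
]
stats = word_statistics(docs)
word_types = len(stats)  # number of distinct words in the corpus
candidates = candidate_common_words(stats, min_rate=0.2, min_coverage=0.5)
```

Combining a frequency threshold with a coverage threshold keeps a word that is frequent in only one document from entering the candidate list, which is one way to read the abstract's pairing of frequency and distribution.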