Morphological Analysis Based Noun Stem Identification for Modern Uyghur
Azragul1,2, Alim Murat1,2, Yusup Abaydula1
1. School of Computer Science & Technology, Xinjiang Normal University, Urumqi, Xinjiang 830054, China
2. The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi, Xinjiang 830011, China
Abstract: Noun stem identification for modern Uyghur is a fundamental problem in natural language processing. This paper first introduces morphological analysis, with emphasis on its role in identifying the part of speech (POS) of words. It then describes the Uyghur POS scheme, the morphological characteristics of Uyghur nouns, suffix ambiguity, and the corresponding disambiguation rules. An algorithm for identifying new nouns in modern Uyghur is proposed, covering feature selection (features within and between words) and parameter estimation. Experiments are carried out on a corpus of Uyghur physics textbooks for junior and senior middle schools.
Key words: modern Uyghur; morphological analysis; noun stem recognition