SHI Jianguo, HOU Hongxu, BAO Feilong. Research on Slavic Mongolian Word Segmentation Based on Dictionary and Rule[J]. Journal of Chinese Information Processing, 2015, 29(1): 197-202.
Research on Slavic Mongolian Word Segmentation Based on Dictionary and Rule
SHI Jianguo, HOU Hongxu, BAO Feilong
College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia 010021, China
Abstract: Slavic Mongolian, also known as Cyrillic Mongolian or New Mongolian, is the language in everyday use in Mongolia. This paper explores Slavic Mongolian word segmentation by combining a dictionary with rules. We first use the dictionary to handle words that are high-frequency or do not conform to the rules, then process the remaining words with rules to generate n-best candidates for the final decision. Combining the two methods exploits the advantages of both and achieves excellent performance on Slavic Mongolian word segmentation.
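The two-stage flow described in the abstract (dictionary lookup first, rule-based n-best candidates second) can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the lexicon, the suffix list, the stem list, the ranking heuristic, and every function name here are assumptions made for the example, and the abstract does not say how the final decision among candidates is actually made.

```python
# Toy data for illustration only; NOT the paper's lexicon or rule set.
EXCEPTION_DICT = {"хот": ["хот"]}  # "city": ends in "т" but carries no case suffix
KNOWN_STEMS = {"ном", "гэр", "багш"}  # hypothetical stem list used for ranking
SUFFIXES = ["ийн", "ын", "аас", "д", "т"]  # a few common case endings


def rule_candidates(word):
    """Return the unsplit word plus every stem+suffix split allowed by SUFFIXES."""
    candidates = [[word]]
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= 2:
            candidates.append([stem, suffix])
    return candidates


def score(candidate):
    """Assumed ranking heuristic: known stem first, then the longer suffix."""
    stem_known = 1 if candidate[0] in KNOWN_STEMS else 0
    suffix_len = len(candidate[1]) if len(candidate) > 1 else 0
    return (stem_known, suffix_len)


def segment(word, n=3):
    """Stage 1: dictionary lookup. Stage 2: rule-generated n-best candidates."""
    if word in EXCEPTION_DICT:
        return [EXCEPTION_DICT[word]]
    ranked = sorted(rule_candidates(word), key=score, reverse=True)
    return ranked[:n]


print(segment("номын"))  # [['ном', 'ын'], ['номын']]
print(segment("хот"))    # [['хот']]
```

The dictionary entry for "хот" shows why the first stage matters in this sketch: without it, the rule stage would also propose the spurious split "хо" + "т".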