Implementation of Minority Languages Processing in ICU

DONG Zhi-jiang, WU Jian, ZHONG Yi-xin

Journal of Chinese Information Processing, 2004, Vol. 18, Issue (2): 67-73.

Abstract

When minority scripts are processed with traditional font technologies such as TrueType on the basis of the ISO/IEC 10646 and Unicode international standards, a bottleneck arises: the presentation variants of characters (their contextual glyph forms) have no definite code points. This is also the root cause of the repeated development and mutual incompatibility of minority-script systems over the years. Building on the text processing architecture of ICU, this paper describes how to implement processing of minority scripts (here mainly Mongolian, Uyghur, and Tibetan) in full conformance with the Unicode standard. We first introduce the characteristics of minority scripts, analyze how they differ from Latin script and Chinese characters in computer input and output, and point out the difficulties of processing them. We then introduce OpenType, a font technology that can meet the requirements of minority script processing. Finally, we explain the working principle of the layout engine and how support for minority scripts is implemented in ICU.
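
The core difficulty named in the abstract, that one encoded character may need several glyph shapes depending on context, can be illustrated with a toy sketch. The Python below is not ICU's layout engine and does not read any font; the function name `positional_features` is hypothetical, and the sketch assumes every character joins on both sides, which real scripts do not guarantee. It only shows the kind of decision a layout engine makes before asking an OpenType font for a substitution: each character keeps its single code point, and position within the word selects one of the registered OpenType feature tags `isol`, `init`, `medi`, or `fina`.

```python
from typing import List, Tuple

# Registered OpenType feature tags for positional glyph forms.
ISOL, INIT, MEDI, FINA = "isol", "init", "medi", "fina"


def positional_features(word: str) -> List[Tuple[str, str]]:
    """Pair each character of a single word with a positional feature tag.

    Simplifying assumption (not true of real scripts): every character
    joins to both neighbours, so position alone decides the form.
    """
    n = len(word)
    result = []
    for i, ch in enumerate(word):
        if n == 1:
            tag = ISOL          # a lone character takes its isolated form
        elif i == 0:
            tag = INIT          # first character of the word
        elif i == n - 1:
            tag = FINA          # last character of the word
        else:
            tag = MEDI          # everything in between
        result.append((ch, tag))
    return result


if __name__ == "__main__":
    # Three code points from the Mongolian block (U+1800 to U+18AF).
    word = "\u182A\u1820\u1828"
    for ch, tag in positional_features(word):
        print(f"U+{ord(ch):04X} -> {tag}")
```

The stored text never changes; only the tag chosen for each code point does. A real engine would additionally consult per-character joining behaviour and the font's substitution tables.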

Key words

computer application / Chinese information processing / complex text layout / Unicode / OpenType / layout engine

Cite this article

DONG Zhi-jiang, WU Jian, ZHONG Yi-xin. Implementation of Minority Languages Processing in ICU. Journal of Chinese Information Processing. 2004, 18(2): 67-73


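Funding

Supported by the National "863" Program Major Software Project (2003AA1Z2110) and the Knowledge Innovation Program of the Chinese Academy of Sciences (KGCX2-SW-504)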