Text Normalization in Chinese Text-to-Speech Systems

CHEN Zhi-gang, HU Guo-ping, WANG Xi-fa

Journal of Chinese Information Processing, 2003, Vol. 17, Issue 4: 46-52.

Abstract

Text normalization analyzes input text and generates pronunciation (pinyin), rhythm, and related information for the non-Chinese-character symbols it contains. This paper presents a hierarchical method based on external rules: rule matching identifies these symbols and produces the correct information for each. The paper first introduces the concept of the analysis tree, then describes the steps for constructing rules and the use of weights to control the order in which rules are matched, and finally reports experimental results. The results show that the method is easy to maintain and extend, and that accuracy on an open test reaches 99.76%.
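The idea of letting weights decide which rule is tried first can be illustrated with a small sketch. This is not the authors' rule format or system, which this page does not describe: the `Rule` class, the three example rules, their weights, and the helper names below are all hypothetical, the rules are written inline rather than loaded from an external file, and the matching is flat rather than hierarchical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]


def read_digits(s: str) -> str:
    """Read an ASCII digit string one digit at a time (e.g. a year)."""
    return "".join(DIGITS[int(ch)] for ch in s)


def read_cardinal(s: str) -> str:
    """Read an ASCII digit string as a cardinal number below 10000."""
    s = s.lstrip("0") or "0"
    if len(s) > 4:  # larger numbers are out of scope for this sketch
        return read_digits(s)
    if s == "0":
        return DIGITS[0]
    out, pending_zero = "", False
    for i, ch in enumerate(s):
        d = int(ch)
        if d == 0:
            pending_zero = True  # emit a single 零 before the next non-zero digit
        else:
            if pending_zero:
                out += DIGITS[0]
                pending_zero = False
            out += DIGITS[d] + UNITS[len(s) - 1 - i]
    if out.startswith("一十"):  # 10-19 are read 十, 十一, ... not 一十
        out = out[1:]
    return out


def read_percent(m: re.Match) -> str:
    """Read 'N%' or 'N.M%' as 百分之 followed by the number."""
    frac = "点" + read_digits(m.group(2)) if m.group(2) else ""
    return "百分之" + read_cardinal(m.group(1)) + frac


@dataclass
class Rule:
    name: str
    pattern: re.Pattern
    weight: int  # higher weight = tried earlier (an assumption of this sketch)
    reader: Callable[[re.Match], str]


# Hypothetical example rules; a real system would load these from an external file.
RULES = [
    Rule("percent", re.compile(r"([0-9]+)(?:\.([0-9]+))?%"), 30, read_percent),
    Rule("year", re.compile(r"[0-9]{4}(?=年)"), 20, lambda m: read_digits(m.group(0))),
    Rule("cardinal", re.compile(r"[0-9]+"), 10, lambda m: read_cardinal(m.group(0))),
]


def normalize(text: str, rules: List[Rule] = RULES) -> str:
    """At each position, apply the highest-weight rule that matches there."""
    ordered = sorted(rules, key=lambda r: r.weight, reverse=True)
    out, pos = [], 0
    while pos < len(text):
        for rule in ordered:
            m = rule.pattern.match(text, pos)
            if m:
                out.append(rule.reader(m))
                pos = m.end()
                break
        else:
            out.append(text[pos])  # ordinary character: copy unchanged
            pos += 1
    return "".join(out)
```

With these example rules, `normalize("2003年正确率达到99.76%")` returns `二零零三年正确率达到百分之九十九点七六`. Both the year rule and the cardinal rule can match `2003`; the year rule wins only because it carries the higher weight, so reversing the weights would yield `二千零三年` instead.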

Key words

computer application / Chinese information processing / text normalization / special symbols / external rules

Cite this article

CHEN Zhi-gang, HU Guo-ping, WANG Xi-fa. Text Normalization In Chinese Text-To-Speech System. Journal of Chinese Information Processing. 2003, 17(4): 46-52


Funding

Supported by the National Natural Science Foundation of China (69975018)