Abstract
In Text-to-Speech (TTS) systems built on large speech corpora, the choice of speech units is a major factor in the naturalness of the synthesized speech. This paper studies prosodic correlation in continuous speech and proposes a set of Chinese prosodic feature parameters that includes prosodic correlation parameters. A prosodic correlation model is then built on the association rules model from data mining and applied to unit selection. Experiments show that the method makes effective use of the prosodic and correlation information of the speech units and agrees with human auditory perception: the subjective Mean Opinion Score (MOS) for naturalness rises from 3.11 without prosodic correlation to 3.49 with it, a relative gain of 12.22% (3.49/3.11).
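The abstract names the association rules model but does not spell out its quantities. As a minimal sketch, assuming the standard data-mining definitions of support and confidence, the Python below computes both for one rule over a handful of hand-made records. The attribute labels (`prev_pitch=high` and so on), the records, and the function names are hypothetical illustrations; they are not the paper's parameter set or data.

```python
# Hypothetical records: each set holds discretized prosodic attributes of a
# pair of adjacent units. Labels and values are illustrative only.
records = [
    {"prev_pitch=high", "prev_dur=short", "cur_pitch=high"},
    {"prev_pitch=high", "prev_dur=long", "cur_pitch=high"},
    {"prev_pitch=low", "prev_dur=short", "cur_pitch=low"},
    {"prev_pitch=high", "prev_dur=short", "cur_pitch=low"},
    {"prev_pitch=low", "prev_dur=long", "cur_pitch=low"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= r for r in records) / len(records)


def confidence(antecedent, consequent, records):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(antecedent, records)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent), records) / base


# Rule under test: prev_pitch=high -> cur_pitch=high
rule_support = support({"prev_pitch=high", "cur_pitch=high"}, records)
rule_confidence = confidence({"prev_pitch=high"}, {"cur_pitch=high"}, records)
```

One plausible use, again an assumption and not the paper's stated procedure, is for a unit selector to favor candidate units whose attributes satisfy rules with high confidence, so that the prosody of neighboring units stays mutually consistent.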
Key words
computer application /
Chinese information processing /
Text-to-Speech (TTS) /
unit selection /
prosodic correlation
Funding
National Natural Science Foundation of China (60275014); 863 Program (2002AA117010-05, 2001AA114072)