Abstract
This paper proposes a two-level syllable segmentation algorithm that exploits statistical information about the speaker's speech. The algorithm draws on several kinds of prior knowledge: the text of the Mandarin proficiency test paper, the observation that a given speaker's initial durations are largely stable at a normal speaking rate, and the observation that the relative durations among a speaker's initials, and among that speaker's finals, remain roughly proportional. The cumulative energies of three speech signal components, reconstructed after a wavelet transform, serve as the feature parameters. Experimental results show that the syllable segmentation accuracy is at least 98.3%.
Key words: computer application; Chinese information processing; syllable segmentation; speech signal processing; Mandarin proficiency test
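The feature extraction named in the abstract (energies of three components reconstructed after a wavelet transform) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the two decomposition levels, the 16-sample frame length, the reading of "cumulative energy" as energy summed over each frame, and the helper names `haar_step`, `haar_inverse_step`, `reconstruct_components`, and `frame_energies`. The two-level segmentation logic itself is not reproduced.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, doubling the length."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def reconstruct_components(signal, levels=2):
    """Split a signal into levels + 1 time-domain components.

    Each component is rebuilt from a single coefficient band with all other
    bands set to zero, so the components sum back to the (truncated) signal.
    """
    n = len(signal) - len(signal) % (2 ** levels)  # keep a multiple of 2**levels
    approx = list(signal[:n])
    details = []  # details[0] is the finest band
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    components = []
    for keep in range(levels + 1):  # keep == levels selects the approximation
        cur = approx if keep == levels else [0.0] * len(approx)
        for lvl in range(levels - 1, -1, -1):
            d = details[lvl] if keep == lvl else [0.0] * len(details[lvl])
            cur = haar_inverse_step(cur, d)
        components.append(cur)
    return components

def frame_energies(x, frame_len):
    """Energy summed over consecutive, non-overlapping frames."""
    return [sum(v * v for v in x[i:i + frame_len])
            for i in range(0, len(x) - frame_len + 1, frame_len)]

# Toy input (hypothetical): two tone bursts separated by a silent gap.
burst = [math.sin(0.3 * i) for i in range(64)]
signal = burst + [0.0] * 32 + burst

components = reconstruct_components(signal, levels=2)    # three components
features = [frame_energies(c, 16) for c in components]   # one energy track each
```

In this toy input, the frames that fall inside the silent gap have near-zero energy in all three components, which is the kind of cue an energy-based segmenter can threshold to propose candidate syllable boundaries. A real system would use recorded speech and a wavelet chosen for that task.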
Funding
Supported by the Jiangmen City Science and Technology Three-Item Fund.