Chinese Stop Detection Based on Energy Change Rate

Journal of Chinese Information Processing, 2014, Vol. 28, Issue 3: 116-122.

Speech Recognition and Analysis

ZHANG Lianhai, CHEN Bin, QU Dan, LI Bicheng

Abstract

To address the instability of burst spectrum features, this paper proposes a Chinese stop detection method based on the energy change rate. A set of parameters describing the energy change rate of speech segments is first extracted from the Seneff auditory spectrum and then transformed with the Fisherface method. The transformed features are classified with a K-nearest-neighbor (KNN) classifier to detect stops, and model performance is evaluated by leave-one-out cross-validation. Experimental results show a stop detection accuracy of 96.39% for clean speech and 88.07% for speech at a signal-to-noise ratio (SNR) of 10 dB, indicating that the model is stable and generalizes well.
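The classification stage named in the abstract (Fisherface transform, KNN classifier, leave-one-out validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors are synthetic stand-ins for the energy change rate parameters, and the function names, the feature dimension of 12, the PCA dimension of 5, and k = 3 are all assumptions chosen only for illustration. Fisherface is taken here in its usual sense of PCA followed by linear discriminant analysis.

```python
import numpy as np


def make_synthetic(n_per_class=40, dim=12, seed=0):
    """Hypothetical stand-in data: two Gaussian clusters (stop vs. non-stop)."""
    rng = np.random.default_rng(seed)
    stops = rng.normal(loc=1.0, scale=1.0, size=(n_per_class, dim))
    others = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, dim))
    X = np.vstack([stops, others])
    y = np.array([1] * n_per_class + [0] * n_per_class)
    return X, y


def fisherface_fit(X, y, n_pca):
    """Fit PCA followed by LDA; return the training mean and the projection."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pca].T  # shape (dim, n_pca)
    Z = Xc @ P
    # LDA in the PCA subspace: within-class and between-class scatter.
    classes = np.unique(y)
    overall = Z.mean(axis=0)
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += len(Zc) * np.outer(mc - overall, mc - overall)
    # Eigenvectors of Sw^-1 Sb; keep the top (n_classes - 1) directions.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[: len(classes) - 1]
    W = evecs[:, order].real
    return mean, P @ W


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    dist = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return np.bincount(nearest).argmax()


def leave_one_out_accuracy(X, y, n_pca=5, k=3):
    """Hold out each sample once; refit the transform on the remaining ones."""
    correct = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        mean, T = fisherface_fit(X[mask], y[mask], n_pca)
        train_Z = (X[mask] - mean) @ T
        test_z = (X[i] - mean) @ T
        correct += int(knn_predict(train_Z, y[mask], test_z, k) == y[i])
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_synthetic()
    print("leave-one-out accuracy:", leave_one_out_accuracy(X, y))
```

Refitting the transform inside each fold keeps the held-out sample from influencing the projection, so the validation score is not inflated. The accuracy this sketch prints reflects only the synthetic data and says nothing about the figures reported in the paper.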

Cite this article

ZHANG Lianhai, CHEN Bin, QU Dan, LI Bicheng. Chinese Stop Detection Based on Energy Change Rate. Journal of Chinese Information Processing. 2014, 28(3): 116-122

Funding

National Natural Science Foundation of China (61175017)

Received: 2012-04-17
Published: 2014-03-10
Issue date: 2014-03-10
