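In practical applications, noise and channel interference cause the performance of speaker recognition (SR) to degrade sharply. To address this problem, this paper analyzes the strengths and weaknesses of traditional methods and proposes a corresponding system-level solution: Wiener filtering is applied as front-end processing of the speech signal, and MFCC vocal-tract features are combined with fundamental-frequency (F0) prosodic features to improve the robustness of the recognition system. Experimental results show that Wiener filtering effectively removes the influence of noise, and that after Wiener filtering the joint F0-MFCC model discriminates between speakers better. Overall system performance in noisy environments is thus substantially improved.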
Abstract
Speaker recognition (SR) achieves excellent results on clean speech. However, noise and channel mismatch can cause significant performance degradation in practical applications. This paper addresses robust and efficient speaker identification (SI) in noisy environments. The main contributions lie in two areas: signal processing based on Wiener filtering, and the integration of F0 and MFCC speaker features. Experimental results on the YOHO corpus show that the Wiener filter is an efficient front-end processing technique and that F0 is a robust feature for SR in noisy environments. Performance improves by 20% over the baseline system.
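The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the Wiener filter design or how F0 and MFCC are combined. The sketch assumes a per-bin gain computed from power-subtraction estimates and a simple frame-level concatenation of log-F0 onto each MFCC vector. The function names `wiener_gain` and `fuse_f0_mfcc`, the gain floor, and the value used for unvoiced frames are all hypothetical.

```python
import math


def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Per-bin Wiener-style gain from noisy and noise power spectra.

    Assumption: clean-speech power is estimated by power subtraction,
    clipped at zero. The floor (hypothetical value) limits over-suppression.
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        p_clean = max(p_noisy - p_noise, 0.0)  # estimated clean power
        total = p_clean + p_noise
        gain = p_clean / total if total > 0.0 else 0.0
        gains.append(max(gain, floor))
    return gains


def fuse_f0_mfcc(mfcc_frames, f0_values, unvoiced_value=0.0):
    """Append log-F0 to each MFCC frame to form a joint feature vector.

    Assumption: one F0 value per frame, with F0 <= 0 marking an unvoiced
    frame, which receives `unvoiced_value` instead of log(F0).
    """
    if len(mfcc_frames) != len(f0_values):
        raise ValueError("need exactly one F0 value per MFCC frame")
    fused = []
    for mfcc, f0 in zip(mfcc_frames, f0_values):
        log_f0 = math.log(f0) if f0 > 0.0 else unvoiced_value
        fused.append(list(mfcc) + [log_f0])
    return fused
```

In such a pipeline the gains would multiply the noisy spectrum of each frame before feature extraction, and the fused vectors would then be passed to the speaker models. How unvoiced frames are handled is a design choice that the abstract leaves open.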
Key words
computer application /
Chinese information processing /
speaker identification /
Wiener filtering /
F0-MFCC
Funding
863 Program project (002AA117010); Olympic project (H030130050430)