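Analysis of speech recorded under G-force stress shows that feature dimensions differ in their sensitivity to stress-induced variation: in general, the low-order dimensions are more sensitive and the corresponding high-order dimensions less so. On this basis, a new feature-weighting method for recognizing stressed speech is proposed. The method applies a different weight to each feature dimension to offset the effect of the variation on the speech features, thereby improving recognition performance. The weights are obtained by applying maximum relative entropy estimation to linear weights. In experiments on speaker-dependent, small-vocabulary isolated words collected in an aviation flight simulator, maximum relative entropy estimation reaches a recognition rate of 89.9%, an improvement of 13.1% over multi-style training.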
Abstract
An analysis of stressed speech shows that the dimensions of the MFCC feature vector differ in their sensitivity to G-force. In general, the lower dimensions are more sensitive to stress and the higher dimensions less so. This paper therefore proposes a weighted MFCC feature for speech recognition under G-force. Weighting the feature to emphasize the higher dimensions improves the performance of the recognition system. To obtain the weights, a new method named maximum relative entropy weights is proposed, in which the initial weights are linear weights. For a small-vocabulary, speaker-dependent system, the recognition rates of the proposed weighting methods are better than that of the traditional multi-style training method. Among them, maximum relative entropy weights perform best, with a recognition rate of 89.9%, an improvement of 13.1% over multi-style training.
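The weighting idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the weight formula, so the function names `linear_weights` and `weight_frames`, the endpoint values `w_min` and `w_max`, and the choice of a ramp that rises with the dimension index are all hypothetical. The maximum relative entropy re-estimation step is not shown, because the abstract does not specify it.

```python
def linear_weights(num_dims, w_min=0.5, w_max=1.5):
    """Return weights rising linearly from w_min (lowest dimension)
    to w_max (highest dimension).

    Assumption: the endpoints 0.5 and 1.5 are placeholders chosen for
    illustration; the abstract does not state the actual values.
    """
    if num_dims < 1:
        raise ValueError("num_dims must be at least 1")
    if num_dims == 1:
        return [w_max]
    step = (w_max - w_min) / (num_dims - 1)
    return [w_min + step * k for k in range(num_dims)]


def weight_frames(frames, weights):
    """Scale every cepstral coefficient of every frame by its
    per-dimension weight.

    frames: list of frames, each a list of MFCC coefficients.
    weights: one weight per feature dimension.
    """
    for frame in frames:
        if len(frame) != len(weights):
            raise ValueError("frame length must match number of weights")
    return [[w * c for w, c in zip(weights, frame)] for frame in frames]


if __name__ == "__main__":
    # Two made-up 4-dimensional frames, for demonstration only.
    frames = [[2.0, 2.0, 2.0, 2.0], [1.0, -1.0, 1.0, -1.0]]
    weights = linear_weights(4)
    for frame in weight_frames(frames, weights):
        print(frame)
```

Because the stress-sensitive low-order coefficients receive the smaller weights, they contribute less to any distance computed on the weighted vectors, which is the effect the abstract attributes to emphasizing the higher dimensions.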
Keywords
Speech recognition /
Stress effects /
Feature weighting /
Maximum relative entropy estimation
Key words
Speech recognition /
G-force /
Weighted feature /
Maximum relative entropy
References
[1] R. P. Lippmann, E. A. Martin and D. B. Paul. Multi-Style Training for Robust Isolated-Word Speech Recognition. ICASSP'87, 1987, 705-708
[2] J. H. L. Hansen and B. D. Womack. Classification of Speech Under Stress Using Target Driven Features. Speech Communication, 1996, 20: 131-150
[3] Y. Chen. Cepstral Domain Talker Stress Compensation for Robust Speech Recognition. IEEE Trans. on Acoustics, Speech and Signal Processing, 1988, 36(4): 433-439
[4] J. H. L. Hansen. Adaptive Source Generator Compensation and Enhancement for Speech Recognition in Noisy Stressful Environment. ICASSP'93, 1993, 2: 95-98
[5] T. Cover, J. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., 1991, 90-95
[6] S. Furui. Cepstral Analysis Technique for Automatic Speaker Verification. IEEE Trans. on Acoustics, Speech and Signal Processing, 1981, 29(4): 254-272
Funding
National Natural Science Foundation of China (Grant No. 60085001)