A Fast Speaker Adaptation Algorithm Based on Maximum Likelihood Model Interpolation

Journal of Chinese Information Processing, 2002, Vol. 16, Issue (1): 50-54.

吕萍,王作英,陆大金

A Speaker Adaptation Algorithm Based on Matrix Linear Interpolation

LV Ping, WANG Zuo-ying, LU Da-jin

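Abstract (translated from Chinese)

This paper proposes a new speaker adaptation algorithm, maximum likelihood model interpolation. Its basic idea is to exploit the correlation between speech units and, according to the maximum likelihood criterion, obtain the speaker-adapted model of the test speaker from a linear combination of a set of speaker-dependent models. Two concrete adaptation algorithms within this interpolation framework are then introduced: the mean linear interpolation algorithm and the matrix linear interpolation algorithm. Experiments show that these algorithms converge well and that, with only 3 adaptation utterances, they substantially improve the performance of the recognition system.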

Abstract

A novel speaker adaptation method, maximum likelihood model interpolation (MLMI), is proposed. The basic idea of MLMI is to compute the speaker-adapted (SA) model of a test speaker as a convex linear combination of a set of speaker-dependent (SD) models, chosen according to the maximum likelihood (ML) criterion. The method exploits the correlation between speech units. Two concrete algorithms within this framework are then given: mean linear interpolation and matrix linear interpolation. Experiments show that as few as 3 adaptation utterances give a significant performance improvement.
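The abstract does not give the estimation formulas, so the following is only a minimal sketch of the mean-interpolation idea under simplifying assumptions that are not taken from the paper: each speech unit is a single Gaussian with a diagonal variance shared by all speakers, the frame-to-Gaussian alignment is already known, and the convexity constraint on the weights is not enforced. Under those assumptions, maximizing the likelihood of the adaptation frames reduces to a precision-weighted least-squares problem for one weight vector shared by all Gaussians. The function name `interpolate_means` and all parameter names are hypothetical.

```python
import numpy as np

def interpolate_means(sd_means, frames, alignment, variances):
    """Estimate shared interpolation weights and the adapted means.

    sd_means:  (K, G, D) means of G Gaussians for each of K speaker-dependent models
    frames:    (T, D) adaptation feature vectors
    alignment: (T,) index of the Gaussian assigned to each frame (assumed known)
    variances: (G, D) diagonal variances, shared by all speakers (assumption)
    """
    K, G, D = sd_means.shape
    A = np.zeros((K, K))
    b = np.zeros(K)
    for x, g in zip(frames, alignment):
        M = sd_means[:, g, :]        # (K, D): the K candidate means for Gaussian g
        Mw = M / variances[g]        # rows scaled by the inverse variances
        A += Mw @ M.T                # accumulate the normal-equation matrix
        b += Mw @ x                  # accumulate the right-hand side
    # Least-squares solve of A w = b; lstsq tolerates a rank-deficient A.
    # Note: w is NOT constrained to be non-negative or to sum to one here.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Every Gaussian, including those unseen in the data, gets an adapted mean.
    adapted = np.tensordot(w, sd_means, axes=1)   # (G, D)
    return w, adapted
```

Because one weight vector is shared across all Gaussians, units that never occur in the adaptation utterances are still moved, which is one way to read the abstract's remark about exploiting the correlation between speech units. A constrained solver would be needed to reproduce the convex combination the abstract describes.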

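Cite this article

LV Ping, WANG Zuo-ying, LU Da-jin. 基于最大似然模型插值的快速说话人自适应算法 [A Fast Speaker Adaptation Algorithm Based on Maximum Likelihood Model Interpolation]. Journal of Chinese Information Processing (中文信息学报). 2002, 16(1): 50-54

LV Ping, WANG Zuo-ying, LU Da-jin. A Speaker Adaptation Algorithm Based on Matrix Linear Interpolation. Journal of Chinese Information Processing. 2002, 16(1): 50-54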

Funding

"985" Major Project (985校-22-攻关-06)

Published: 2002-02-15
Issue Date: 2002-02-15