Abstract
This paper presents an approach to classifying the continuous mouth shapes that occur in image sequences of the pronunciation of all Chinese initials and finals (consonants and vowels). Building on an audio-visual bimodal corpus that covers all initials and finals, we propose a method called Two-Step Classification. First, we segment and locate the lips in the corpus images and extract features using an adaptive chromatic filter model. Then, relying on the selected features, we cluster the mouth shapes in the pronunciation sequences into 15 categories. The purpose of this classification is to fix the number of states for the lipreading recognition stage, shrink the search space, and speed up convergence.
Key words
Lipreading /
Bimodal Database /
Mouth Shape Classification /
Speech Recognition
Funding
National 863 Program of China (863-306-QN99-4, 863-306-ZT03-01-2); Key Program of the National Natural Science Foundation of China (69789301)