Automatic Identification of Chinese Dialects Based on Phonotactics

GU Ming-liang, SHEN Zhao-yong

Journal of Chinese Information Processing, 2006, Vol. 20, Issue (5): 79-84.

Phonotactics Based Chinese Dialects Identification

  • GU Ming-liang1,2,SHEN Zhao-yong2

Abstract

This paper first discusses the basis for distinguishing Chinese dialects and the basic principles of feature selection, and from these derives a new feature, the district differential cepstral feature. A dialect identification system is then built from a GMM tokenizer, N-gram language models, and an ANN. Compared with traditional language identification (LID) systems, it has three characteristics. First, it does not require a labeled speech corpus, which reduces the labor and requirements of building Chinese dialect speech databases. Second, the GMM tokenizer requires far less computation than a phoneme recognizer, which speeds up identification and facilitates future real-time processing. Third, it achieves higher identification accuracy and better fault tolerance. In experiments on Mandarin (Putonghua) and three Chinese dialects, the system reaches an average identification rate of 83.8%.
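The tokenize-then-model idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy one-dimensional GMM parameters, the bigram order, and the add-one smoothing are all assumptions made for illustration, and the ANN back-end is replaced here by a plain argmax over per-dialect language-model scores. The point it shows is that each feature frame is mapped to the index of its most likely GMM component without any phonetic labels, and the resulting symbol sequence is then scored by one N-gram model per dialect.

```python
import math
from collections import defaultdict


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_tokenize(frames, weights, means, variances):
    """Map each frame to the index of its most likely mixture component."""
    tokens = []
    for x in frames:
        scores = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


def train_bigram(token_seqs, n_symbols):
    """Bigram log-probabilities with add-one smoothing (an assumed choice)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in token_seqs:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    logp = {}
    for prev in range(n_symbols):
        total = sum(counts[prev].values()) + n_symbols
        for cur in range(n_symbols):
            logp[(prev, cur)] = math.log((counts[prev][cur] + 1) / total)
    return logp


def score(tokens, logp):
    """Average bigram log-likelihood of a token sequence."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(logp[p] for p in pairs) / max(len(pairs), 1)


def identify(tokens, models):
    """Pick the dialect whose model scores highest (stand-in for the ANN)."""
    return max(models, key=lambda name: score(tokens, models[name]))


# Hypothetical toy setup: a 2-component GMM over 1-D "features".
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0], [5.0]]
VARIANCES = [[1.0], [1.0]]

# Hypothetical training token sequences for two made-up dialect labels.
MODELS = {
    "A": train_bigram([[0, 1, 0, 1, 0, 1]], n_symbols=2),
    "B": train_bigram([[0, 0, 0, 1, 1, 1]], n_symbols=2),
}

utterance = [[0.1], [4.8], [0.3], [5.2]]
tokens = gmm_tokenize(utterance, WEIGHTS, MEANS, VARIANCES)
print(tokens, identify(tokens, MODELS))
```

In a real system the GMM would be trained on unlabeled speech features and the per-dialect scores would feed a trained classifier; the sketch only fixes the data flow between the stages.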

Key words

computer application / Chinese information processing / GMM tokenizer / n-gram language modeling / Chinese dialects identification

Cite this article

GU Ming-liang, SHEN Zhao-yong. Phonotactics Based Chinese Dialects Identification. Journal of Chinese Information Processing. 2006, 20(5): 79-84

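Funding

Supported by the Jiangsu Province "Tenth Five-Year Plan" Social Science Foundation (K3-013) and the Natural Science Foundation of Jiangsu Higher Education Institutions (99KJB510002).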