Unsupervised speaker adaptation for telephone call transcription

R. Wallace; K. Thambiratnam; F. Seide

doi:10.1109/ICASSP.2009.4960603

Acoustics, Speech, and Signal Processing, IEEE International Conference on

Year: 2009, Pages: 4393-4396

Authors

R. Wallace, Speech and Audio Research Laboratory, Queensland University of Technology, 2 George Street, Brisbane, Australia
K. Thambiratnam, Microsoft Research Asia, 5F Sigma Center, 49 Zhi Chun Road, Beijing, China 100080
F. Seide, Microsoft Research Asia, 5F Sigma Center, 49 Zhi Chun Road, Beijing, China 100080

Abstract

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
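The figures in the abstract are stated in terms of word error rate (WER). As a reading aid, the sketch below shows the standard WER definition (word-level edit distance divided by reference length) and the arithmetic behind the "70% of the supervised gain" statement. It is an illustration only, not the authors' scoring tool; the function names `word_error_rate` and `implied_supervised_gain` are hypothetical, and the 70% figure is treated as exact even though the abstract likely rounds it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and hyp[:j]; it starts as the distance from an empty reference.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def implied_supervised_gain(unsupervised_gain: float, fraction: float) -> float:
    """Supervised gain implied when the unsupervised gain is `fraction` of it."""
    return unsupervised_gain / fraction


if __name__ == "__main__":
    # Toy example (made-up sentences, not data from the paper).
    print(word_error_rate("a b c d", "a x c"))
    # Abstract figures: 6.3 points absolute, stated as 70% of the supervised gain.
    print(implied_supervised_gain(6.3, 0.70))
```

Under these assumptions, dividing 6.3 by 0.70 puts the supervised gain at roughly 9.0 percentage points absolute. The abstract does not state that figure directly, so treat it as an approximation.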

Related Articles

Speaker Segmentation and Adaptation for Speech Recognition on Multiple-Speaker Audio Conference Data
2007 International Conference on Multimedia & Expo
All-phoneme ergodic hidden Markov network for unsupervised speaker adaptation
Acoustics, Speech, and Signal Processing, IEEE International Conference on
On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition
Acoustics, Speech, and Signal Processing, IEEE International Conference on
Speaker detection and tracking for telephone transactions
Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP'02)
Development of the 2003 CU-HTK conversational telephone speech transcription system
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
Effective speaker adaptations for speaker verification
Acoustics, Speech, and Signal Processing, IEEE International Conference on
Unsupervised incremental online adaptation to unknown environment and speaker
Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP'02)
Tree-structured speaker clustering for fast speaker adaptation
Acoustics, Speech, and Signal Processing, IEEE International Conference on
The 1998 HTK system for transcription of conversational telephone speech
Acoustics, Speech, and Signal Processing, IEEE International Conference on
A hybrid HMM-MLP speaker verification algorithm for telephone speech
Acoustics, Speech, and Signal Processing, IEEE International Conference on
