Acoustics, Speech, and Signal Processing, IEEE International Conference on

Abstract

In this paper, we investigate combining semi-tied covariance matrices with Random Forest (RF) based phonetic decision trees (PDTs) for acoustic modeling in conversational speech recognition. We first use the RF method to train multiple PDTs for each phone state unit and generate multiple sets of acoustic models accordingly. We then apply semi-tied covariance matrices to each set of acoustic models to improve their fit to the data. In the decoding search, we combine the likelihood scores from the multiple acoustic models for each speech frame. The viability of semi-tied covariance matrices with different tying classes is studied through their effects on the diversity of the RF-based acoustic models and on word accuracy in our telehealth automatic captioning task. Experimental results indicate that semi-tied covariance matrices enhance the diversity of the RF-PDT-based acoustic models and increase word accuracy.
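As a rough illustration of the two scoring steps named in the abstract, the sketch below evaluates a Gaussian whose covariance has the standard semi-tied form H diag(v) H^T and then merges per-frame scores from several acoustic models. The abstract does not say how the scores are combined, so the uniform-weight average of likelihoods used here is an assumption, and the function names `semi_tied_log_likelihood` and `combine_frame_scores` are hypothetical, not taken from the paper.

```python
import numpy as np

def semi_tied_log_likelihood(x, mean, diag_var, A):
    """Log-density of x under N(mean, H diag(diag_var) H^T), with A = inv(H).

    A is the transform shared by all Gaussians in one tying class;
    diag_var is the component-specific diagonal variance.
    """
    z = A @ (x - mean)                      # project into the decorrelated space
    d = x.shape[0]
    _, logdet_A = np.linalg.slogdet(A)      # log|det A| = -log|det H|
    quad = np.sum(z * z / diag_var)
    return logdet_A - 0.5 * (d * np.log(2.0 * np.pi)
                             + np.sum(np.log(diag_var)) + quad)

def combine_frame_scores(log_likelihoods, weights=None):
    """Combine one frame's log-likelihoods from several acoustic models.

    Assumption: a weighted average of likelihoods, computed in the log
    domain; uniform weights are used when none are given.
    """
    scores = np.asarray(log_likelihoods, dtype=float)
    if weights is None:
        weights = np.full(scores.shape[0], 1.0 / scores.shape[0])
    return np.logaddexp.reduce(scores + np.log(weights))
```

With `A` set to the identity matrix the first function reduces to an ordinary diagonal-covariance Gaussian, which is a convenient sanity check; other combination rules (for example, averaging log-likelihoods) would only change the body of `combine_frame_scores`.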