Archives of Acoustics, 33, 2, pp. 221–242, 2008
Some comments about the existing theory of sound with comparison to the experimental research of vector effects in real-life acoustic near fields
The article presents a novel method for normalizing individual speaker
characteristics and compensating for linear transmission distortion,
aimed at improving the recognition of short isolated utterances.
To this end, banks of spectral transformations of the speaker's signal and
a division of speakers into classes were applied. The article also discusses
the form of the spectral transformation, the optimization of its parameter
values, the definition of the transformation banks, the selection of
speaker classes, and the iterative improvement of recognition
results. Moreover, the study proposes a fast method of selecting speaker classes
based on the fundamental frequency of the voice. The efficiency of the
proposed solution has been validated by recognition results obtained with
four versions of a recognition system using Hidden Markov Models (HMM)
and mel-frequency cepstral coefficient (MFCC) parametrization.
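As a rough illustration of how a speaker class could be chosen quickly from the fundamental frequency, the sketch below estimates F0 of a voiced frame and maps it to a class index. The abstract does not say how F0 is estimated or where the class boundaries lie, so the autocorrelation estimator, the function names `estimate_f0` and `select_speaker_class`, and the boundary values of 150 Hz and 220 Hz are all assumptions made for this example, not the article's method.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of a voiced frame by autocorrelation peak picking.

    Assumption: a plain (biased) autocorrelation is used; the article's own
    F0 estimator is not described in the abstract.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove DC offset
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation: fewer overlapping samples at long lags,
        # which favours the shortest period over its multiples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def select_speaker_class(f0_hz, boundaries=(150.0, 220.0)):
    """Return the index of the F0 interval containing f0_hz.

    The boundaries are hypothetical placeholders; with two boundaries the
    result is 0, 1, or 2.
    """
    return sum(f0_hz >= b for b in boundaries)


if __name__ == "__main__":
    sample_rate = 16000
    # Synthetic 200 Hz tone standing in for a voiced speech frame.
    frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(1600)]
    f0 = estimate_f0(frame, sample_rate)
    print(f0, select_speaker_class(f0))
```

In a recognizer of the kind the abstract describes, the returned class index would presumably pick which bank of spectral transformations to try first. Real speech would also need a voicing decision and averaging over several frames, which this sketch omits.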
Keywords: automatic speech recognition; speaker normalization; transmission distortion compensation
Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN).