Archives of Acoustics, 32, 1, pp. 25–40, 2007

Prosody annotation for unit selection TTS synthesis

Adam Mickiewicz University, Institute of Linguistics

Agnieszka WAGNER
Adam Mickiewicz University, Institute of Linguistics

This paper concerns prosody annotation and intonation modeling, in particular for application in corpus-based speech synthesis. To establish rules for automatic intonation modeling, a four-hour, fully annotated speech database was analyzed acoustically and perceptually. The speech material included different text types, dialogs and prosodically rich phrases.
As a result of these analyses, a basic prosodic annotation was defined that distinguishes 6 pitch accent types and 5 types of prosodic phrases. Moreover, the analyses made it possible to define rules for semi-automatic stylization and parameterization of intonation contours for use in text-to-speech and speech recognition systems. The paper discusses the assumptions behind the stylization method and the results of a quantitative and qualitative evaluation of stylization accuracy, based on speech material of ca. 1000 phrases from a literary text read by female and male speakers. Finally, a classification of pitch accents and boundary tones based on the parameterization is presented.
Keywords: speech synthesis and recognition, segmental and suprasegmental (prosodic) annotation, intonation modeling, intonation stylization, pitch accents, boundary tones

Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN)