Errors on Tutorial Proceedings CD: see Errata page
T-1: Analysis by synthesis of speech prosody, from data to models
Presented by
Daniel Hirst
Outline
The study of speech prosody has become a research area attracting interest from researchers in a great number of
related fields, including academic linguistics and phonetics, conversation analysis, semantics
and pragmatics, sociolinguistics, acoustics, speech synthesis and recognition, cognitive psychology, neuroscience,
speech therapy, language teaching... and no doubt many more. So much so that it is difficult for any
one person to keep up to date on research in all relevant areas. This is especially true for new researchers coming
into the field.
This tutorial will offer an overview of a variety of current ideas on the methodology and tools for the automatic
and semi-automatic analysis and synthesis of speech prosody, covering in particular lexical prosody, rhythm,
accentuation and intonation. The tools presented will include, but not be restricted to, those developed by the
presenter himself.
The emphasis will be on the importance of data analysis for the testing of linguistic models and the relevance of
these models to the analysis itself. The target audience will be researchers who are aware of the importance of
the analysis and synthesis of prosody for their own research interests and who wish to update their knowledge of
background and current work in the field.
Speaker Biography
Daniel Hirst is a linguist and phonetician, who has been working in the field of prosody and phonology for nearly
forty years. He is at present Directeur de Recherches at the CNRS laboratory Parole et Langage in the University
of Provence, Aix-en-Provence, where he co-directs a research team devoted to linguistic models, annotation and
interfaces. He is the author of a study of English intonation with a purely functional representation (1977) and
co-edited a major study of the intonation of the languages of the world (Hirst & Di Cristo (eds)
1998), to which he contributed the chapter on British English (Hirst 1998) as well as an 80-page introduction (Hirst
& Di Cristo 1998) in which he proposed a new international transcription system for intonation (INTSINT). He
is the founder and current President of the ISCA Special Interest Group on Speech Prosody (SproSIG), organisers
of the International Speech Prosody meetings (Aix-en-Provence 2002; Nara 2004; Dresden 2006; Campinas 2008;
Chicago 2010). He has developed software for the automatic analysis of speech prosody. In particular:
- Momel - an algorithm for the automatic factoring of fundamental frequency contours into two components: a
macromelodic component and a micromelodic component.
- INTSINT - a prosodic equivalent of the International Phonetic Alphabet. Originally designed as a descriptive
tool for linguistic annotation, INTSINT has since been implemented as an algorithm converting the output of
the Momel algorithm to a sequence of discrete tonal symbols, which can then be used as input to synthesise a
fundamental frequency contour.
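The idea of factoring a fundamental frequency contour into a macromelodic and a micromelodic component can be illustrated with a small sketch. This is not the Momel algorithm itself: the function names (macro_contour, factor_f0) are hypothetical, the target points are assumed to be given rather than detected automatically, the smooth curve is approximated by simple linear interpolation, and expressing the micromelodic component as a ratio is an assumption made for illustration.

```python
from bisect import bisect_right


def macro_contour(times, targets):
    """Return a smooth (here: piecewise-linear) curve through target points.

    `targets` is a list of (time_s, f0_hz) pairs sorted by time.
    Assumption: linear interpolation stands in for a smoother curve.
    """
    t_pts = [t for t, _ in targets]
    f_pts = [f for _, f in targets]
    curve = []
    for t in times:
        if t <= t_pts[0]:
            curve.append(f_pts[0])       # hold first target before it
        elif t >= t_pts[-1]:
            curve.append(f_pts[-1])      # hold last target after it
        else:
            i = bisect_right(t_pts, t) - 1
            frac = (t - t_pts[i]) / (t_pts[i + 1] - t_pts[i])
            curve.append(f_pts[i] + frac * (f_pts[i + 1] - f_pts[i]))
    return curve


def factor_f0(times, f0, targets):
    """Split an f0 contour into a macro curve and a micro residual.

    The residual is expressed as the ratio f0 / macro (an assumption),
    so a value of 1.0 means the raw f0 lies exactly on the macro curve.
    """
    macro = macro_contour(times, targets)
    micro = [f / m for f, m in zip(f0, macro)]
    return macro, micro


# Hypothetical three-frame contour with two target points.
times = [0.0, 0.5, 1.0]
f0 = [100.0, 135.0, 200.0]
targets = [(0.0, 100.0), (1.0, 200.0)]
macro, micro = factor_f0(times, f0, targets)
```

In this toy example the middle frame lies below the interpolated curve, so its micro value is less than 1.0; a local lowering of this kind is what the micromelodic component is meant to capture.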