10th Annual Conference of the International Speech Communication Association

ISCA Interspeech 2009 Brighton

Tutorials Day - Sunday 6 September 2009

Errors on Tutorial Proceedings CD: see Errata page

T-1: Analysis by synthesis of speech prosody, from data to models

Presented by Daniel Hirst

Outline

Speech prosody has become a research area that attracts interest from researchers in a great number of related fields, including academic linguistics and phonetics, conversation analysis, semantics and pragmatics, sociolinguistics, acoustics, speech synthesis and recognition, cognitive psychology, neuroscience, speech therapy, language teaching and no doubt many more. As a result, it is difficult for any one person to keep up to date with research in all the relevant areas, and this is particularly true for new researchers coming into the field. This tutorial will give an overview of a variety of current ideas on the methodology and tools for the automatic and semi-automatic analysis and synthesis of speech prosody, in particular lexical prosody, rhythm, accentuation and intonation. The tools presented will include, but not be restricted to, those developed by the presenter himself. The emphasis will be on the importance of data analysis for the testing of linguistic models and on the relevance of these models to the analysis itself. The target audience is researchers who are aware of the importance of the analysis and synthesis of prosody for their own research interests and who wish to update their knowledge of background and current work in the field.

Speaker Biography

Daniel Hirst is a linguist and phonetician who has been working in the field of prosody and phonology for nearly forty years. He is at present Directeur de Recherches at the CNRS laboratory Parole et Langage in the University of Provence, Aix-en-Provence, where he co-directs a research team devoted to linguistic models, annotation and interfaces. He is the author of a study of English intonation with a purely functional representation (1977) and co-edited a major study of the intonation of the languages of the world (Hirst & Di Cristo (eds) 1998), to which he contributed the chapter on British English (Hirst 1998) as well as an 80-page introduction (Hirst & Di Cristo 1998) in which he proposed a new international transcription system for intonation (INTSINT). He is the founder and current President of the ISCA Special Interest Group on Speech Prosody (SproSIG), organisers of the International Speech Prosody meetings (Aix-en-Provence 2002; Nara 2004; Dresden 2006; Campinas 2008; Chicago 2010). He has developed software for the automatic analysis of speech prosody, in particular:

  • Momel - an algorithm for the automatic factoring of fundamental frequency contours into two components: a macromelodic component and a micromelodic component.
  • INTSINT - a prosodic equivalent of the International Phonetic Alphabet. Originally designed as a descriptive tool for linguistic annotation, INTSINT has since been implemented as an algorithm that converts the output of the Momel algorithm into a sequence of discrete tonal symbols, which can then be used as input to synthesise a fundamental frequency contour.