Brighton Pavilion

10th Annual Conference of the International Speech Communication Association

ISCA Interspeech 2009 Brighton

Technical Programme

This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.

Mon-Ses3-O2:
Phoneme-level Perception

Time: Monday 16:00 Place: East Wing 1 Type: Oral
Chair: Rolf Carlson

16:00 Categorical perception of speech without stimulus repetition

Jack Rogers (MRC Cognition and Brain Sciences Unit, Cambridge, UK)
Matthew Davis (MRC Cognition and Brain Sciences Unit, Cambridge, UK)

We explored the perception of phonetic continua generated with an automated auditory morphing technique in three perceptual experiments. The use of large sets of stimuli allowed an assessment of the impact of single vs. paired presentation without the massed stimulus repetition typical of categorical perception experiments. A third experiment shows that such massed repetition alters the degree of categorical and sub-categorical discrimination possible in speech perception. Implications for accounts of speech perception are discussed.

16:20 Non-automaticity of use of orthographic knowledge in phoneme evaluation

Anne Cutler (Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands)
Chris Davis (MARCS Auditory Laboratories, University of Western Sydney, Australia)
Jeesun Kim (MARCS Auditory Laboratories, University of Western Sydney, Australia)

Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.

16:40 Learning and generalization of novel contrastive cues

Meghan Sumner (Stanford University, Department of Linguistics)

This paper examines the learning of a novel phonetic contrast. Specifically, we examine how a contrast is learned – do speakers learn a specific property about a particular word, or do they internalize a pattern that can be applied to words of a particular type in subsequent processing? In two experiments, participants listened to foreign-accented English and were taught to make stop release contrastive. Following training, participants took either a minimal pair decision task or a cross-modal form priming task, both of which included trained words, words that were untrained but included a trained rime, and novel, untrained words. The results of both experiments suggest that listeners use both strategies in learning – they generalize to words with similar rimes, but are unable to extend this knowledge to novel words.

17:00 Vowel Category Perception Affected by Microdurational Variations

Einar Meister (Institute of Cybernetics, Tallinn University of Technology, Estonia)
Stefan Werner (Department of General Linguistics and Language Technology, University of Joensuu, Finland)

Vowel quality perception in quantity languages is considered to be unrelated to vowel duration, since duration is used to realize quantity oppositions. To test the role of microdurational variations in vowel category perception in Estonian, listening experiments with synthetic stimuli were carried out, involving five vowel pairs along the close-open axis. The results show that in the case of high-mid vowel pairs, vowel openness correlates positively with stimulus duration; in mid-low vowel pairs no such correlation was found. The discrepancy in the results is explained by the hypothesis that in the case of shorter perceptual distances (high-mid area of vowel space), intrinsic duration plays the role of a secondary feature that enhances the perceptual contrast between vowels, whereas in the case of mid-low oppositions the perceptual distance is large enough to guarantee the necessary perceptual contrast by spectral features alone, and vowel intrinsic duration is not needed as an additional cue.

17:20 Perceptual grouping of alternating word pairs: Effect of pitch difference and presentation rate

Nandini Iyer (Air Force Research Laboratory)
Douglas Brungart (Air Force Research Laboratory)
Brian Simpson (Air Force Research Laboratory)

When listeners hear sequences of tones that slowly alternate between a low frequency and a slightly higher frequency, they report hearing a single stream of alternating tones. However, when the alternation rate and/or the frequency difference increases, they report hearing two distinct streams: a slowly pulsing high-frequency stream and a slowly pulsing low-frequency stream. This experiment used repeating sequences of spondees to investigate whether a similar streaming phenomenon might occur for speech stimuli. The F0 difference between every other word was varied from 0 to 18 semitones. Each word was either 100 or 125 ms in duration. The inter-onset intervals (IOIs) of the individual words were varied from 100 to 300 ms. As expected, F0 difference was a strong cue for sequential segregation. Moreover, the number of 'two' stream judgments was greater at smaller IOIs, suggesting that factors that influence the obligatory streaming of tonal signals are also important in the segregation of speech signals.
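As a reading aid for the semitone range quoted above, the following minimal sketch converts a semitone offset into a frequency using the standard equal-tempered relation (a ratio of 2^(n/12) for n semitones). The function name and the 100 Hz base F0 are illustrative assumptions and are not taken from the study.

```python
def shift_semitones(f0_hz: float, semitones: float) -> float:
    """Return the frequency lying `semitones` above `f0_hz` (equal-tempered scale)."""
    return f0_hz * 2.0 ** (semitones / 12.0)


# Hypothetical base F0 chosen only for illustration; the abstract gives no value.
base_f0_hz = 100.0

for n in (0, 6, 12, 18):
    # 12 semitones is one octave (ratio 2); 18 semitones is a ratio of about 2.83.
    print(n, round(shift_semitones(base_f0_hz, n), 1))
```

Under these assumptions, the largest separation in the experiment (18 semitones) places the higher word at roughly 2.8 times the F0 of the lower one.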

17:40 Comparing methods to find a best exemplar in a multidimensional space

Titia Benders (Institute of Phonetic Sciences, University of Amsterdam)
Paul Boersma (Institute of Phonetic Sciences, University of Amsterdam)

We present a simple algorithm for running a listening experiment aimed at finding the best exemplar in a multidimensional space. For simulated humanlike listeners, who have perception thresholds and some decision noise on their responses, the algorithm on average ends up twelve times closer than Iverson and Evans’ goodness interpolation algorithm.