|
10th Annual Conference of the International Speech Communication Association
Interspeech 2009 Brighton
|
Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.
Mon-Ses3-O2: Phoneme-level Perception
| Time: | Monday 16:00 |
| Place: | East Wing 1 |
| Type: | Oral |
| Chair: | Rolf Carlson |
| 16:00 | Categorical perception of speech without stimulus repetition
Jack Rogers (MRC Cognition and Brain Sciences Unit, Cambridge, UK) Matthew Davis (MRC Cognition and Brain Sciences Unit, Cambridge, UK)
We explored the perception of phonetic continua generated with an automated auditory morphing technique in three perceptual experiments. The use of large sets of stimuli allowed an assessment of the impact of single vs. paired presentation without the massed stimulus repetition typical of categorical perception experiments. A third experiment shows that such massed repetition alters the degree of categorical and sub-categorical discrimination possible in speech perception. Implications for accounts of speech perception are discussed.
|
| 16:20 | Non-automaticity of use of orthographic knowledge in phoneme evaluation
Anne Cutler (Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands) Chris Davis (MARCS Auditory Laboratories, University of Western Sydney, Australia) Jeesun Kim (MARCS Auditory Laboratories, University of Western Sydney, Australia)
Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.
|
| 16:40 | Learning and generalization of novel contrastive cues
Meghan Sumner (Stanford University, Department of Linguistics)
This paper examines the learning of a novel phonetic contrast. Specifically, we examine how a contrast is learned – do speakers learn a specific property about a particular word, or do they internalize a pattern that can be applied to words of a particular type in subsequent processing? In two experiments, participants listened to foreign-accented English and were taught to make stop release contrastive. Following training, participants took either a minimal pair decision task or a cross-modal form priming task, both of which included trained words, words that were untrained but included a trained rime, and novel, untrained words. The results of both experiments suggest that listeners use both strategies in learning – they generalize to words with similar rimes, but are unable to extend this knowledge to novel words.
|
| 17:00 | Vowel Category Perception Affected by Microdurational Variations
Einar Meister (Institute of Cybernetics, Tallinn University of Technology, Estonia) Stefan Werner (Department of General Linguistics and Language Technology, University of Joensuu, Finland)
Vowel quality perception in quantity languages is considered to be unrelated to vowel duration since duration is used to realize quantity oppositions. To test the role of microdurational variations in vowel category perception in Estonian, listening experiments with synthetic stimuli were carried out, involving five vowel pairs along the close-open axis.
The results show that in the case of high-mid vowel pairs vowel openness correlates positively with stimulus duration; in mid-low vowel pairs no such correlation was found. The discrepancy in the results is explained by the hypothesis that in the case of shorter perceptual distances (high-mid area of vowel space) intrinsic duration plays the role of a secondary feature to enhance perceptual contrast between vowels, whereas in the case of mid-low oppositions perceptual distance is large enough to guarantee the necessary perceptual contrast by spectral features alone, and vowel intrinsic duration as an additional cue is not needed.
|
| 17:20 | Perceptual grouping of alternating word pairs: Effect of pitch difference and presentation rate
Nandini Iyer (Air Force Research Laboratory) Douglas Brungart (Air Force Research Laboratory) Brian Simpson (Air Force Research Laboratory)
When listeners hear sequences of tones that slowly alternate between a low frequency and a slightly higher frequency, they report hearing a single stream of alternating tones. However, when the alternation rate and/or the frequency difference increases, they report hearing two distinct streams: a slowly pulsing high and low frequency stream. This experiment used repeating sequences of spondees to investigate whether a similar streaming phenomenon might occur for speech stimuli. The F0 difference between every other word was varied from 0 to 18 semitones. Each word was either 100 or 125 ms in duration. The inter-onset intervals (IOIs) of the individual words were varied from 100 to 300 ms. As expected, the F0 difference was a strong cue for sequential segregation. Moreover, the number of 'two' stream judgments was greater at smaller IOIs, suggesting that factors that influence the obligatory streaming of tonal signals are also important in the segregation of speech signals.
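For readers less familiar with the semitone scale used in the abstract above, the following is a minimal sketch of the standard equal-tempered conversion between a semitone interval and a frequency ratio. The 100 Hz reference F0 and the function names are illustrative assumptions; the abstract does not state the actual F0 values used.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for an interval given in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(f_high: float, f_low: float) -> float:
    """Interval in semitones between two frequencies (same unit, e.g. Hz)."""
    return 12.0 * math.log2(f_high / f_low)


# Hypothetical reference F0; not taken from the abstract.
BASE_F0_HZ = 100.0

if __name__ == "__main__":
    # Sample a few points across the 0 to 18 semitone range named in the abstract.
    for st in (0, 6, 12, 18):
        shifted = BASE_F0_HZ * semitones_to_ratio(st)
        print(f"{st:2d} semitones above {BASE_F0_HZ:.0f} Hz -> {shifted:.1f} Hz")
```

Under this conversion, 12 semitones corresponds to a doubling of F0, so the largest difference in the stated range (18 semitones) is an octave and a half.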
|
| 17:40 | Comparing methods to find a best exemplar in a multidimensional space
Titia Benders (Institute of Phonetic Sciences, University of Amsterdam) Paul Boersma (Institute of Phonetic Sciences, University of Amsterdam)
We present a simple algorithm for running a listening experiment aimed at finding the best exemplar in a multidimensional space. For simulated humanlike listeners, who have perception thresholds and some decision noise on their responses, the algorithm on average ends up twelve times closer than Iverson and Evans’ goodness interpolation algorithm.
|