T-4: Emerging Technologies for Silent Speech Interfaces
Presented by
Tanja Schultz and Bruce Denby
Outline
In the past decade, the performance of automatic speech processing systems, including speech recognition, text
and speech translation, and speech synthesis, has improved dramatically. This has resulted in an increasingly
widespread use of speech and language technologies in a wide variety of applications, such as commercial information
retrieval systems, call center services, voice-operated cell phones or car navigation systems, personal dictation and
translation assistance, as well as applications in military and security domains. However, speech-driven interfaces
based on conventional acoustic speech signals still suffer from several limitations. Firstly, the acoustic signals
are transmitted through the air and are thus prone to ambient noise. Despite tremendous efforts, robust speech
processing systems that perform reliably in crowded restaurants, airports, or other public places are still not
in sight. Secondly, conventional interfaces rely on audibly uttered speech, which has two major drawbacks: it
jeopardizes confidential communications in public and it disturbs any bystanders. Services which require the access,
retrieval, and transmission of private or confidential information, such as PINs, passwords, and security or safety
information, are particularly vulnerable.
Recently, Silent Speech Interfaces have been proposed which allow their users to communicate by speaking silently,
i.e. without producing any sound. This is realized by capturing the speech signal at the early stage of human
articulation, namely before the signal becomes airborne, and then transferring these articulation-related signals for
further processing and interpretation. Due to this novel approach, Silent Speech Interfaces have the potential to
overcome the major limitations of traditional speech interfaces today, i.e. (a) limited robustness in the presence
of ambient noise; (b) lack of secure transmission of private and confidential information; and (c) disturbance of
bystanders created by audibly spoken speech in quiet environments; while at the same time retaining speech as the
most natural human communication modality. Silent Speech Interfaces could furthermore provide an alternative for
persons with speech disabilities, such as those resulting from laryngectomy, as well as for the elderly or weak who
may not be healthy or strong enough to speak aloud effectively.
Speaker Biography
Tanja Schultz is a Full Professor at the Computer Science Department of Karlsruhe University in Germany and
an Assistant Research Professor at the Language Technologies Institute at Carnegie Mellon University. She is the
director of the Cognitive Systems Lab and director of the Center for Visually Impaired Students, both at Karlsruhe
University. Her research activities focus on human-human communication and human-machine interfaces with
a particular area of expertise in rapid adaptation of speech processing systems to new domains and languages.
She co-edited a book on this subject and received several awards for this work. In 2001 she received the FZI
prize for her outstanding Ph.D. thesis on language independent and language adaptive speech recognition. In
2002 she received the Allen Newell Medal for Research Excellence from Carnegie Mellon for her contribution to
Speech-to-Speech Translation and the ISCA best paper award for her publication on language independent acoustic
modeling. In 2005 she was awarded the Carnegie Mellon Language Technologies Institute Junior Faculty Chair. Her
recent research focuses on the development of human-centered technologies and intuitive human-machine interfaces
based on biosignals, by capturing, processing, and interpreting signals such as muscle and brain activities. The
development of the silent speech interface based on myoelectric signals received the Interspeech 2006 Demo award.
Together with Prof. Denby she is a guest editor of the Speech Communication Special Issue on Silent Speech
Interfaces to be published in 2009. Tanja Schultz is the author of more than 150 articles published in books,
journals, and proceedings. She is a member of the IEEE Computer Society, the International Speech Communication
Association (ISCA), the European Language Resources Association, and the Society of Computer Science (GI) in Germany,
and currently serves as an elected ISCA Board member and on several program committees and review panels.
Bruce Denby is Full Professor of Electronics and Signal Processing at the Université Pierre et Marie Curie (Paris-VI),
and Research Scientist at the Laboratoire d’Electronique ESPCI-ParisTech (CNRS) in Paris, France. He
holds a BS degree from the California Institute of Technology (Caltech), MS from Rutgers University, and PhD
from the University of California at Santa Barbara, all in physics. During post-doctoral studies in Switzerland,
France, and the UK in the late 1980s, he developed the Denby-Peterson contour extraction algorithm, and became
well known for introducing statistical learning techniques to the experimental physics community. In 1995 he was
named professor at the University of Versailles, France, where he created, and for 10 years directed, an innovative
Master’s degree program in cellular telephone technology. Since transferring to Paris-VI in 2004, he has been a leader
in the area of applications of statistical learning techniques to real-time systems. Professor Denby is an Associate
Editor of the journal Pattern Recognition, and has authored over 180 publications in international journals and
peer-reviewed international conferences. He is a member of the International Speech Communication Association
ISCA, the Association for Computing Machinery (ACM), Senior Member of IEEE, and member of the IEEE
Computer, Communications, Consumer Electronics, and Instrumentation and Measurement Societies. He is one
of the originators of the “Silent Speech Interface” concept, having authored in 2004, with Prof. Maureen Stone, a
pioneering article on speech synthesis from ultrasound imagery of the tongue, and is coordinator of the OUISPER
(Oral Ultrasound SynthetIc SPEech SouRce) Project, funded by the French Department of Defense (DGA) and the
French Agence Nationale de la Recherche (ANR). He is also the primary guest editor of the Speech Communication
Special Issue on Silent Speech Interfaces to be published in 2009. Prof. Denby’s current research interests include
speech and audio signal processing, telecommunications, and radio engineering.