Time: 2pm-3pm, November, Oct 20
Place: Room 4422, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets).
Speaker: Raul Fernandez (IBM)
Title: Expressive Text-to-Speech Synthesis: State-of-the-art and Challenges.


State-of-the-art text-to-speech (TTS) systems have reached an operating 
point where they're highly intelligible and produce speech that is very 
acceptable in transactional applications, such as those mediated by a typical 
Voice User Interface. This success has shifted the research focus toward 
systems that can achieve a higher degree of naturalness, and by extension 
expressiveness, in a variety of synthesis contexts. In this talk I will 
discuss a concatenative TTS system that produces speech according to a 
repertoire of expressions contained in a training database. I will first 
give a general introduction to the concatenative approach to synthesis, 
and then discuss extensions to this architecture to facilitate 
incorporating a variety of expressions. I will also review the limitations 
of this approach as a way to outline the challenges that next-generation 
speech synthesis systems will face.

Speaker Bio: 

Raul Fernandez is a Research Staff Member in the Multilingual Analytics 
and User Technologies group at IBM's TJ Watson Research Center, where he 
works on developing human-language technologies for speech synthesis and 
recognition. Prior to joining the group, he was a Research Assistant at 
the Massachusetts Institute of Technology's Media Lab, from which he 
received a PhD for his work on developing computational models for the 
automatic recognition of affect from speech. He has published papers in 
several journals and conferences in the areas of text-to-speech, spoken 
affect recognition, speech analysis, and computational prosody modeling.