Time: 12:45pm-1:45pm, Friday, April 16
Place: Room 6496, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets).
Speaker: Sameer Maskey (IBM)
Title: Power Mean Based Algorithm for Combining Alignments in
Speech-to-Speech Translation

Abstract:

Speech-to-Speech (S2S) translation systems can improve communication
among speakers who do not share a common language by translating
bi-directional speech in real time. Although advances in S2S
translation over the last decade have made it possible to deploy
real-time S2S systems for certain domains and languages, human-level
accuracy is still far from being achieved.

In this talk, I will describe some of the research problems we face
when we develop S2S systems for low-resource languages such as Dari
and Pashto. In particular, I will focus on the Machine Translation (MT)
component of an S2S system and describe the problem of alignment
combination. Combining alignments based on direction of translation
has been shown to be useful for MT models, but most current
combination methods are based on heuristics. I will present a
mathematical formulation for combining an arbitrary number of
alignment tables using their power mean that does not rely on
heuristics. The method frames the combination task as an optimization
problem, and finds the optimal alignment lying between the
intersection and union of multiple alignment tables by optimizing the
parameter p, a real number defining the order of the power mean
function. I will describe how this combination method results in
a better S2S system for the English-Pashto language pair.
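
For readers unfamiliar with the power mean, the following is a minimal
illustrative sketch and not the implementation presented in the talk.
It assumes each alignment table is a 0/1 matrix over (source word,
target word) pairs, and that a link is kept when the power mean of its
entries reaches a fixed cutoff. The function names, the 0.5 threshold,
and the toy tables are all assumptions made for illustration; the
optimization of p described above is not shown.

```python
import math

def power_mean(values, p):
    """Power mean of order p for non-negative values."""
    n = len(values)
    if p == math.inf:
        return max(values)
    if p == -math.inf:
        return min(values)
    # For p <= 0 the mean tends to 0 as soon as any value is 0.
    if p <= 0 and any(v == 0 for v in values):
        return 0.0
    if p == 0:
        # The limit p -> 0 is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)

def combine_alignments(tables, p, threshold=0.5):
    """Keep link (i, j) when the power mean across tables is >= threshold.

    The threshold is a hypothetical choice for this sketch.
    """
    rows, cols = len(tables[0]), len(tables[0][0])
    links = set()
    for i in range(rows):
        for j in range(cols):
            score = power_mean([t[i][j] for t in tables], p)
            if score >= threshold:
                links.add((i, j))
    return links

# Three toy alignment tables for a 3-word source and 3-word target.
table_a = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
table_b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
table_c = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
tables = [table_a, table_b, table_c]

intersection = combine_alignments(tables, -math.inf)
majority = combine_alignments(tables, 1)
union = combine_alignments(tables, math.inf)
```

In this toy setting, p = -inf keeps only links present in every table
(the intersection), p = +inf keeps links present in any table (the
union), and finite values of p select alignments in between; with three
tables and the assumed 0.5 cutoff, p = 1 behaves like a majority vote.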


Bio:

Sameer Maskey is a Research Staff Member at the IBM T.J. Watson Research
Center in Yorktown Heights, New York. He is also teaching this
semester at Columbia University as an Adjunct Assistant Professor in
the Department of Computer Science. He received his Ph.D. in Computer
Science from Columbia University in 2008. He earned his undergraduate
degree (Honors) in Mathematics and Physics from Bates College in 2002.
His main research interests are Machine Learning/Statistical
Techniques for Natural Language and Speech Processing, particularly
Machine Translation and Summarization of spoken documents. He has
previously worked on other topics such as Information Extraction,
Speech Synthesis and Question Answering. Currently, he is developing
statistical methods to improve various aspects of speech-to-speech
translation for low-resource languages.