Time: 12:45pm-1:45pm, Friday, March 26
Place: Room 6496, CUNY Graduate Center, 365 Fifth Ave (34th & 35th St.)
Speaker: Nitin Madnani (Maryland)
Title: The Circle of Meaning: From Translation to Paraphrasing and Back

Abstract: The preservation of meaning between their inputs and outputs is perhaps the most ambitious and, often, the most elusive goal of systems that attempt to process natural language. Nowhere is this goal of more obvious importance than for the tasks of machine translation and paraphrase generation. Preserving meaning between the input and the output is paramount for both, the monolingual vs. bilingual distinction notwithstanding. In this talk, I propose a novel, symbiotic connection between these two tasks.

Today's SMT systems require high-quality human translations, in addition to large bitexts, for parameter tuning. For such tuning, it is generally considered wise to have multiple (usually 4) reference translations to avoid unfair penalization of translation hypotheses. However, this reliance on multiple reference translations creates a problem, because reference translations are labor-intensive and expensive to obtain. Therefore, most current MT datasets only contain a single reference. This leads to the problem of reference sparsity, the primary open problem that I attempt to address in this talk and one that has a serious effect on the SMT parameter tuning process.

Bannard & Callison-Burch (2005) were the first to provide a practical connection between phrase-based statistical machine translation techniques and paraphrase generation. However, their technique is restricted to generating phrasal paraphrases. We build upon their approach and augment a phrasal paraphrase extractor into a sentential paraphraser with extremely broad coverage.
The novelty in this augmentation lies in the further strengthening of the connection between statistical machine translation and paraphrase generation: whereas Bannard and Callison-Burch rely on SMT machinery only to extract phrasal paraphrase rules and stop there, we take it a few steps further and build a full English-to-English SMT system. This system can, as expected, "translate" any English input sentence into a new English sentence with the same degree of meaning preservation that exists in a bilingual SMT system. In fact, being a state-of-the-art SMT system, it is able to generate n-best "translations" for any given input sentence. This sentential paraphraser, built almost entirely from SMT machinery, represents the first 180 degrees of the proposed circle of meaning.

To complete the circle, we propose a novel connection in the other direction. We claim that the sentential paraphraser, once built in this fashion, can provide a solution to the reference sparsity problem and, hence, be used to improve the performance of a bilingual SMT system. We posit two different instantiations of the sentential paraphraser and show results that provide empirical validation for this proposed connection.

Speaker Bio: Nitin Madnani is a final-year PhD student at the University of Maryland, College Park. He works as a research assistant in the Laboratory for Computational Linguistics and Information Processing with his advisors Bonnie Dorr and Philip Resnik. Besides exploring the intersection of and interaction between machine translation and paraphrasing as part of his thesis, he has also worked on multi-document summarization and information retrieval. He is planning to graduate in May 2010.