Title: Information Extraction Crossing Language, Robustness and Domain 

Speaker: Imed Zitouni (Microsoft)
Time: 4:15pm-5:30pm, Thursday, April 25

Place: CUNY Graduate Center, rm 9204/9205

Modern communication technologies have made massive amounts of real-time 
news information in several languages readily available. This has led to the
need to develop news-monitoring systems that allow users to monitor
multilingual news media in near real-time and search over stored content.
In this talk I will briefly describe the architecture of a news-monitoring 
system and focus on its information extraction component. Information 
extraction is a crucial step toward understanding a text, as it identifies 
the important conceptual objects and relations between them in a 
I will address the portability of the approach to different languages
and show a method of propagating information into low-resource languages
from richer ones. Unlike other approaches that focus on clean text, our
technique is also robust to less-well-formed input, as I will show.
For example, information extraction in a multilingual broadcast processing 
system has to deal with inaccurate automatic transcription and 
The resulting presence of non-target-language text in this case yields
many false alarms, which raises the research problem of making information
extraction robust to such noisy input text.

Imed Zitouni has been a Principal Researcher at Microsoft since September
2013, working on relevance and measurement techniques to improve Bing's
search quality. Imed's current research interests include information
retrieval, with a focus on the use of statistics and machine learning
techniques to develop web-scale offline and online metrics for search
engines. Imed is
also interested in using Natural Language Processing (NLP) technologies to 
add a layer of semantics and understanding to search engines, with the
belief that next-generation search engines will be based on dialog and
language understanding. Prior to joining Microsoft, Imed was a senior 
scientist at the Multilingual NLP group of IBM for almost a decade, where 
he served as team lead in several NLP projects. Imed was a key member of
several government projects, including the GALE program. Prior to IBM, he
was a research member of Bell Laboratories, Lucent Technologies, for
almost half a dozen years, working on language modeling, speech
recognition, spoken dialog systems and speech understanding. Imed received
his M.Sc. and Ph.D. with the highest honors from the University of Nancy 1,
France.
Imed is a senior member of IEEE and has served as a member of the IEEE
Speech and Language Processing Technical Committee (99-11), the Information
Officer of the ACL SIG on Semitic Languages, and an associate editor of the
ACM journal TALIP. He is also a member of ISCA and ACL. He has served as
chair and reviewing committee member of several conferences and journals.
He is the author or co-author of more than 80 papers in international
conferences and journals. His recent book is "Multilingual Natural Language
Processing Applications: From Theory to Practice", published by Prentice
Hall.