We are glad to start our Fall 2010 CUNY-NLP seminar series with two short 
talks by our own outstanding students, Suzanne Tamang and Adam Lee.

Time: 2pm-3pm, Friday, Oct 1
Place: Room 4102, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets).

Speaker: Suzanne Tamang

Title: Adding Smarter Systems instead of Human Annotators: A Combined 
Approach to Slot Filling

Abstract:
The TAC-KBP2010 Slot Filling task requires a system to automatically 
distill information from a large document collection and return answers 
for a query entity with specified attributes (slots), and use them to 
expand the Wikipedia infoboxes. We describe two bottom-up Information 
Extraction style pipelines and a top-down Question Answering style 
pipeline to address this task. We propose several novel approaches to 
enhance these pipelines, including query expansion based on Wikipedia 
redirect link mining, statistical answer re-ranking, and cross-slot 
reasoning based on Markov Logic Networks. We demonstrate that our system 
achieves 3.1% higher precision and 2.6% higher recall compared with 
the best system in the KBP2009 evaluation. In addition, we 
investigate the annotation challenges associated with this task and 
find that a single human annotator reaches less than 50% 
recall; adding human annotators improves coverage, but recall soon 
converges to an upper limit. We further propose a novel 
approach to combining annotations from top automated systems and 
human annotators. Surprisingly, filtering errors from system combination 
achieves higher relative gains in recall and is less costly than 
asking human annotators to conduct exhaustive search from scratch. 
This is joint work with Zheng Chen, Adam Lee, Xiang Li, 
Marissa Passantino and Heng Ji at CUNY.

Speaker: Adam Lee

Title: Enhancing Multi-lingual Information Extraction via Cross-Media 
Inference and Fusion

Abstract:
We describe a new information fusion approach to integrate facts 
extracted from cross-media objects (videos and texts) into a coherent 
common representation including multi-level knowledge (concepts, 
relations and events). Beyond standard information fusion, we exploited 
video extraction results and significantly improved gender detection 
from texts. We further extended our methods to a multi-lingual environment 
(English, Arabic and Chinese) by presenting a case study on cross-lingual 
comparable corpora acquisition based on video comparison. This is 
joint work with Marissa Passantino and Heng Ji at CUNY, and 
Guojun Qi and Thomas Huang at UIUC.