Date: Friday, June 7
Time: 2:15pm
Location: GC Science Center (4th floor)
Title: Named Entity Recognition with Bilingual Constraints

Wanxiang Che (Harbin Institute of Technology)
Joint work with Mengqiu Wang and Chris Manning (Stanford)

Abstract: Different languages contain complementary cues about
entities, which can be used to improve Named Entity Recognition (NER)
systems. We propose a method that formulates the problem of exploring
such signals on unannotated bilingual text as a simple Integer Linear
Program, which encourages entity tags to agree via bilingual
constraints. Bilingual NER experiments on the large OntoNotes 4.0
Chinese-English corpus show that the proposed method can improve
strong baselines for both Chinese and English. In particular, Chinese
performance improves by over 5% absolute F1 score. We can then
annotate a large amount of bilingual text (80k sentence pairs) using
our method, and add it as uptraining data to the original monolingual
NER training corpus. The Chinese model retrained
on this new combined dataset outperforms the strong baseline by over
3% F1 score.
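
To make the idea of agreement constraints concrete, here is a toy sketch. It is an illustration under stated assumptions, not the speakers' implementation: the tag set, the per-token log-scores, the word alignment, and the function name best_joint_tagging are all hypothetical, and the tiny binary program is solved by exhaustive enumeration instead of an ILP solver. Each token gets exactly one tag, aligned tokens must share a tag (a hard agreement constraint), and the objective is the sum of the two monolingual scores.

```python
from itertools import product

# Hypothetical tag set, chosen only for this illustration.
TAGS = ["O", "PER", "LOC"]


def best_joint_tagging(zh_scores, en_scores, alignments):
    """Return the highest-scoring (zh_tags, en_tags) pair that satisfies
    the agreement constraints.

    zh_scores[i][tag] and en_scores[j][tag] are monolingual log-scores.
    alignments is a list of (i, j) pairs: Chinese token i is aligned to
    English token j, and the two are required to carry the same tag.
    """
    n_zh, n_en = len(zh_scores), len(en_scores)
    best_value, best_tags = float("-inf"), None

    # Enumerate every joint assignment; this is only feasible for toy
    # inputs, where a real system would hand the problem to an ILP solver.
    for assignment in product(TAGS, repeat=n_zh + n_en):
        zh_tags, en_tags = assignment[:n_zh], assignment[n_zh:]

        # Hard bilingual constraint: aligned tokens must agree.
        if any(zh_tags[i] != en_tags[j] for i, j in alignments):
            continue

        # Objective: sum of the monolingual scores of the chosen tags.
        value = sum(zh_scores[i][t] for i, t in enumerate(zh_tags))
        value += sum(en_scores[j][t] for j, t in enumerate(en_tags))

        if value > best_value:
            best_value, best_tags = value, (list(zh_tags), list(en_tags))

    return best_tags


# Made-up scores: the Chinese model alone slightly prefers "O" for
# token 0, while the English model is confident it is a person name.
zh_scores = [
    {"O": -0.6, "PER": -1.2, "LOC": -2.0},
    {"O": -0.1, "PER": -3.0, "LOC": -3.0},
]
en_scores = [
    {"O": -2.5, "PER": -0.2, "LOC": -3.0},
    {"O": -0.1, "PER": -3.0, "LOC": -3.0},
]
alignments = [(0, 0), (1, 1)]

zh_tags, en_tags = best_joint_tagging(zh_scores, en_scores, alignments)
print(zh_tags)  # ['PER', 'O']
print(en_tags)  # ['PER', 'O']
```

In this toy case the confident English prediction pulls the Chinese tag for token 0 from "O" to "PER", which is the kind of complementary cue the abstract refers to.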