Title: Information Extraction from Financial Documents
Speaker: Sarah Hoffman (FactSet)
Time: 2:15pm-3:30pm, Friday, March 2, 2012
Place: Room 4102, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets)

Abstract: Information extraction from financial documents can be done in many different ways. At FactSet, we use a mixture of rule-based and machine learning techniques, depending on what we are extracting. I will discuss some of our techniques for information extraction, the challenges we faced, and the advantages of some of our approaches.

Bio: Sarah Hoffman is a Senior Software Engineer and Engineering Manager for the Content Collection Services parsing group at FactSet Research Systems, where she has worked since June 2007. She is also on the board of Women in Engineering NY at FactSet. Prior to FactSet, Sarah worked for three years as an Information Technology Analyst at Lehman Brothers. She holds an MS in Computer Science from Columbia University with a focus on Natural Language Processing, where she did research on automatically detecting deceptive speech, and a BBA in Computer Information Systems from Baruch College.