Title: Information Extraction from Medical Documents: Why, what and how?
Speaker: Faisal Farooq (Siemens)
Time: 2:15pm-3:30pm, Friday, Dec 9
Place: Room 9204, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets).

Abstract: Information extraction from clinical text has recently received a lot of attention. Researchers are applying rules as well as machine learning and natural language processing in text mining systems. With the recent push toward Electronic Medical Record systems, the trail of electronic documentation for a single patient visit has increased manifold. For any meaningful use of this data, the right information needs to be extracted at the right time for the right people. This information can be used for monitoring and measuring quality adherence, for clinical decision support, and for downstream analytics. I will present an overview of the information extraction system we built within Siemens. Beyond the technology, I will share the challenges and hurdles unique to healthcare, how we tried to overcome them, and how we introduced the system into clinical and financial workflows. I will start off with a use case for quality measurement and describe the information extraction technology used. I will follow up with improvements and different use cases of such a system. I will end the discussion with a brief overview of a capability we are currently adding to the system: learning from multiple annotators in an active learning setting.

Bio: Faisal Farooq is currently a scientist in the Knowledge Solutions Department of Siemens Healthcare. His current research is focused on information extraction and data mining from a wide variety of clinical data. His general areas of interest are machine learning and pattern recognition. He has published various papers on medical informatics, handwriting, biometrics and text analysis in multiple journals and conferences.
He also serves on the review boards of various conferences and journals, such as the International Journal of Document Analysis and Recognition, Pattern Recognition, Pattern Analysis and Applications, and IEEE Transactions on Systems, Man and Cybernetics. Faisal is a member of the American Medical Informatics Association (AMIA). He has organized many workshops on machine learning in the medical field at venues such as ICML and NIPS. Faisal received his PhD from the State University of New York (SUNY) at Buffalo in 2008, where he worked as a graduate research assistant in the Center of Excellence for Document Analysis and Recognition (CEDAR) and the Center for Unified Biometrics and Sensors (CUBS). His work at SUNY involved applying machine learning and pattern recognition to documents and biometrics, and his thesis was on information retrieval from handwritten documents. He also completed multiple research internships at the IBM T. J. Watson Research Center, where he worked on various projects ranging from biometrics to natural language processing and text analysis.