Title: Information Extraction from Medical Documents: Why, what and how?
Speaker: Faisal Farooq (Siemens)
Time: 2:15pm-3:30pm, Friday, Dec 9
Place: Room 9204, CUNY Graduate Center, 365 Fifth Ave (between 34th and 35th Streets).

Information extraction from clinical text has recently received a lot
of attention. Researchers are applying rule-based methods as well as
machine learning and natural language processing to build text mining systems.
With the recent push toward Electronic Medical Record systems, the
trail of electronic documentation for a single patient visit has
grown manifold. For any meaningful use of this data, the right
information needs to be extracted at the right time for the right
people. This information can be used for monitoring and measuring
quality adherence, clinical decision support as well as downstream
analytics. I will present an overview of the information extraction
system we built within Siemens. Beyond the technology, I will share
the challenges and hurdles unique to healthcare, how we tried to
overcome them, and how we introduced the system into the clinical and
financial workflows. I will start off with a use case for quality measurement and
describe the information extraction technology used. I will follow it
up with improvements and different use cases of such a system. I will
end the discussion with a brief overview of the capability of learning
from multiple annotators in an active learning setting that we are
currently adding to the system.


Faisal Farooq is currently a scientist in the Knowledge Solutions
Department of Siemens Healthcare. His current research is focused on
information extraction and data mining from a wide variety of clinical
data. His general areas of interest are in machine learning and
pattern recognition. He has published various papers in areas of
medical informatics, handwriting, biometrics and text analysis in
multiple journals and conferences. He also serves on the review boards
of various conferences and journals, such as the International Journal of
Document Analysis and Recognition, Pattern Recognition, Pattern
Analysis and Applications and IEEE Transactions on Systems, Man and
Cybernetics. Faisal is a member of the American Medical Informatics
Association (AMIA). He has organized many workshops on machine learning
in medicine at venues like ICML and NIPS. Faisal received
his PhD from the State University of New York (SUNY) at Buffalo in 2008,
where he worked as a graduate research assistant in the Center of
Excellence for Document Analysis and Recognition (CEDAR) and the
Center for Unified Biometrics and Sensors (CUBS). His work at SUNY
involved applying machine learning and pattern recognition to
documents and biometrics, and his thesis was on information retrieval
from handwritten documents. He also completed multiple research
internships at the IBM T. J. Watson Research Center, where he worked on
various projects ranging from biometrics to natural language
processing and text analysis.