Feature Engineering for Large Scale Predictive
Modeling with Electronic Health Records

Dr. Fei Wang
Healthcare Analytics Research group
IBM T. J. Watson Research Center

1:00pm Wednesday, 26 March 2014, ITE325b, UMBC

Predictive modeling lies at the heart of many medical informatics problems, such as early detection of chronic diseases and prediction of patient hospitalization and readmission. Typically, these predictive models are built upon patient Electronic Health Records (EHRs), which are systematic collections of patient information including demographics, diagnoses, medications, lab tests, etc. We refer to this information as patient features. High-quality features are vital to building successful predictive models. In this talk, I will present two feature engineering techniques that improve the quality of the raw features extracted from the original patient EHRs: (1) feature augmentation, which constructs more effective derived features from existing raw features by exploiting the sequential order of events; and (2) feature densification, which imputes missing feature values via knowledge transfer across similar patients. Along with each technique, we also developed a visual interface to help users explore the derived features. Finally, I will conclude the talk with some future research directions.

Dr. Fei Wang is currently a research staff member in the Healthcare Analytics Research group at the IBM T. J. Watson Research Center. Before his current position, he was a postdoc in the Department of Statistical Science at Cornell University. He received his Ph.D. from the Department of Automation at Tsinghua University in 2008. Dr. Wang’s major research interests include data mining and machine learning, as well as their applications in social and health informatics. He actively publishes papers at the top venues of the relevant fields, including AMIA, KDD, ICML and InfoVis, and he has filed over 20 patents (four issued). Dr. Wang has given seven tutorials on different topics at ICDM/SDM/ICDM, organized seven workshops at KDD/ICDM/SDM/WSDM, and edited three special issues of the Journal of Data Mining and Knowledge Discovery. His Ph.D. thesis was awarded the National Excellent Doctoral Thesis in China. His research received an honorable mention for the best research paper award at ICDM 2010 and was a best research paper finalist at SDM 2011. More information can be found on his homepage.

Host: Prof. Kostas Kalpakis