Distinguished Lecture Series, UMBC Department of Information Systems

Statistical Methods for Integration and Analysis of Opinionated Text Data

Dr. ChengXiang Zhai
Professor and Willett Faculty Scholar
University of Illinois at Urbana-Champaign

10:00am Thursday 21 April 2016, ITE 459, UMBC

Opinionated text data such as blogs, forum posts, product reviews and online comments are increasingly available on the Web. They are very useful sources of public opinion on virtually any topic. However, because the opinions are scattered and abundant, it is a significant challenge for users to collect all the opinions about a topic and digest them efficiently. In this talk, I will present a suite of general statistical text mining methods that can help users integrate, summarize and analyze scattered online opinions to obtain actionable knowledge for decision making. Specifically, I will first present approaches to integrating scattered opinions by aligning them to a well-structured article or relevant ontology. Second, I will discuss several techniques for generating a concise opinion summary that can reveal the major sentiments and opinion points buried in large amounts of opinionated text data. Finally, I will present probabilistic generative models for analyzing review data in depth to discover latent aspect ratings and the relative weights that reviewers place on different aspects. These methods are general and can thus potentially help users integrate and analyze large amounts of online opinionated text data on any topic in any natural language.


ChengXiang Zhai is a Professor of Computer Science at the University of Illinois at Urbana-Champaign, where he also holds joint appointments at the Institute for Genomic Biology, Statistics, and the Graduate School of Library and Information Science. His research interests include information retrieval, text mining, natural language processing, machine learning, and bioinformatics, and he has published over 200 papers in these areas, with an H-index of 58 in Google Scholar. He is an Associate Editor of ACM Transactions on Information Systems and of Information Processing and Management, and the Americas Editor of Springer’s Information Retrieval Book Series. He has served as a conference program co-chair of ACM CIKM 2004, NAACL HLT 2007, ACM SIGIR 2009, ECIR 2014, ICTIR 2015, and WWW 2015, and is a conference general co-chair for ACM CIKM 2016. He is an ACM Distinguished Scientist and a recipient of multiple best paper awards, the Rose Award for Teaching Excellence at UIUC, an Alfred P. Sloan Research Fellowship, an IBM Faculty Award, an HP Innovation Research Program Award, and the Presidential Early Career Award for Scientists and Engineers (PECASE).