Computer Science PhD Dissertation Proposal
Online Unsupervised Coreference Resolution for
Semi-Structured, Heterogeneous Data
Jennifer Alexander Sleeman
1:00pm Tuesday, 22 May 2012, 325b ITE, UMBC
Coreference resolution, determining when instances represent the same real-world entity, has been widely researched in multiple domains. Online coreference resolution that supports heterogeneous data is not as well researched, though both aspects are important in practice. Given the complexity of today's computing environments, a more flexible coreference resolution algorithm is required to support data that is processed over time rather than all at once. We present an online, unsupervised coreference resolution framework for heterogeneous, semi-structured data. We describe a two-phase clustering model that is both flexible and distributable. We also describe a multi-dimensional attribute model that will support robust schema mappings. As part of this framework, we propose a way to perform instance consolidation that will improve recall by addressing data sparseness. We also outline how our framework will support 'cold start' knowledge base population.
Committee: Professors Tim Finin (chair), Anupam Joshi, Charles Nicholas, Tim Oates, Yun Peng, and Dr. Rafael Alonso (SAIC)