Computer Science and Electrical Engineering
University of Maryland, Baltimore County
Ph.D. Dissertation Defense

A Rapidly Deployable Image Classification System Using Feature Views

Adrian Rosebrock

9:00am Friday, 18 April 2014, ITE 346, UMBC

Constructing an image classification system using strong, local invariant descriptors is both time-consuming and tedious, requiring much experimentation and parameter tuning to obtain an adequately performing model. Furthermore, training a system in a given domain and then migrating the model to a separate domain will likely yield poor performance. As the recent Boston Marathon attacks demonstrated, large, unstructured image databases from traffic cameras, security systems, law enforcement officials, and citizens can be quickly amassed for authorities to review; however, reviewing each and every image is an expensive undertaking in terms of both time and human effort. Reviewing crime scene images is inherently a classification task. For example, authorities may want to know whether a given image contains a suspect, a suspicious package, or injured people. In an emergency, these classifications are needed as quickly and accurately as possible. In this work we present a rapidly deployable image classification system using “feature views”, where each view consists of a set of weak, global features. These weak global descriptors are computationally simple to extract, intuitive to understand, and require substantially less parameter tuning than their local invariant counterparts. We demonstrate that by combining weak features with ensemble methods we are able to outperform current state-of-the-art methods, or achieve comparable accuracy, with much less effort and domain knowledge. We then provide both theoretical and empirical justifications for our ensemble framework, called “Ecosembles”, which can be used to construct rapidly deployable image classification systems.
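To make the idea of combining weak global feature views with an ensemble more concrete, the following is a minimal illustrative sketch and not the dissertation's implementation. The three descriptors (`histogram_view`, `stats_view`, `extremes_view`), the nearest-centroid base learner, the majority vote, and the toy 2x2 grayscale images are all assumptions chosen for brevity; the abstract does not specify which descriptors or ensemble rule Ecosembles use.

```python
from collections import Counter
from statistics import mean, pstdev


def _pixels(image):
    """Flatten a grayscale image given as a list of rows of 0..255 ints."""
    return [p for row in image for p in row]


def histogram_view(image, bins=4):
    """Hypothetical weak global view: normalized intensity histogram."""
    pixels = _pixels(image)
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def stats_view(image):
    """Hypothetical weak global view: mean and std of intensity in 0..1."""
    pixels = _pixels(image)
    return [mean(pixels) / 255, pstdev(pixels) / 255]


def extremes_view(image):
    """Hypothetical weak global view: darkest and brightest pixel in 0..1."""
    pixels = _pixels(image)
    return [min(pixels) / 255, max(pixels) / 255]


class NearestCentroid:
    """Deliberately simple base learner; one instance is trained per view."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


class ViewEnsemble:
    """Train one base learner per feature view and combine by majority vote."""

    def __init__(self, views):
        self.views = views

    def fit(self, images, labels):
        self.models = [
            NearestCentroid().fit([view(img) for img in images], labels)
            for view in self.views
        ]
        return self

    def predict(self, image):
        votes = [m.predict(v(image)) for m, v in zip(self.models, self.views)]
        return Counter(votes).most_common(1)[0][0]


# Toy training data (assumed): two dark and two bright 2x2 images.
train_images = [
    [[10, 20], [30, 15]],
    [[5, 25], [35, 10]],
    [[200, 220], [240, 210]],
    [[230, 250], [205, 215]],
]
train_labels = ["dark", "dark", "bright", "bright"]

ensemble = ViewEnsemble([histogram_view, stats_view, extremes_view])
ensemble.fit(train_images, train_labels)
```

Each view here is cheap to compute and has at most one parameter, which is the property the abstract attributes to weak global descriptors; any accuracy comes from combining the per-view learners rather than from tuning a single strong descriptor.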

Finally, we recognize that image datasets offer the relatively unusual opportunity to extract multiple feature representations through the use of various descriptors. In situations where the original dataset is not available for further feature extraction, or where multiple feature views are ambiguous (such as predicting income based on geographical location and census data), the Ecosemble method cannot be applied. To extend Ecosembles to arbitrary datasets of diverse modalities, we introduce artificial feature views using kernel approximations. These artificial feature views are constructed from a single representation of the data, alleviating the need to explicitly extract multiple feature views. We then apply artificial feature views to a diverse range of non-image classification datasets to demonstrate that our method is applicable to multiple modalities while still outperforming current state-of-the-art methods.
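One way to picture artificial feature views built from a single representation is sketched below. It assumes random Fourier features as the kernel approximation and varies the RBF bandwidth `gamma` and the random seed to obtain distinct views; the abstract does not say which kernel approximation or parameters the dissertation uses, so `make_rff_view`, the `gamma` values, and the component counts are hypothetical.

```python
import math
import random


def make_rff_view(n_inputs, n_components, gamma, seed):
    """Return a feature map z(x) with z(x) . z(y) ~= exp(-gamma * ||x - y||^2).

    This is the random Fourier feature construction for the RBF kernel:
    frequencies are drawn from N(0, 2 * gamma) and phases from U[0, 2 * pi).
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)
    W = [[rng.gauss(0.0, std) for _ in range(n_inputs)]
         for _ in range(n_components)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    scale = math.sqrt(2.0 / n_components)

    def transform(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_j)
            for w, b_j in zip(W, b)
        ]

    return transform


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


# Three artificial views of the same 2-dimensional data (values assumed):
# each uses a different kernel bandwidth and an independent random projection.
views = [
    make_rff_view(n_inputs=2, n_components=200, gamma=g, seed=s)
    for s, g in enumerate([0.1, 1.0, 10.0])
]

sample = [0.3, -1.2]
artificial_views = [view(sample) for view in views]
```

Each transformed vector can then be handed to its own base learner and the predictions combined, in the same way that descriptor-based views would be, without going back to the raw data for further feature extraction.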

Committee: Drs. Tim Oates (chair), Jesus Caban, Tim Finin, Charles Nicholas, Jian Chen