MS defense: Akshaya Iyengar, Estimating Temporal Boundaries for Twitter Events

MS Thesis Defense

Estimating Temporal Boundaries For Events Using Social Media Data

Akshaya Iyengar

10:00am Wednesday, 15 June 2011, ITE 325b

Social media websites like Twitter, Flickr and YouTube generate a high volume of user-generated content as a major event occurs. Our goal is to determine automatically, and as accurately as possible, when an event starts and when it ends by analyzing the content of social media data. Estimating these temporal boundaries segments the event-related data into three major phases: the buildup to the event, the event itself, and the post-event effects and repercussions.

We describe a technique that estimates the temporal boundaries of anticipated events and helps to monitor changes as events unfold. In our approach we train a multiclass support vector machine (SVM) to classify event data into the aforementioned phases. We then discuss an algorithm for choosing the two class boundaries such that the total error is minimized. We apply our technique to six events: Hurricane Igor (2010), Super Bowl XLV (2011), three games from the ICC Cricket World Cup 2011, and the Royal Wedding (2011). We train individual classifiers for each of these events. Finally, we train a general classifier and compare its performance with the individual classifiers.
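As a rough illustration of the boundary-selection step, the sketch below assumes that the classifier has already labeled each time bin as 0 (buildup), 1 (event) or 2 (post-event), and that "total error" means the number of bins whose predicted label disagrees with the chosen segmentation. The function name `best_boundaries` and the exhaustive search are illustrative assumptions, not the algorithm from the thesis.

```python
from itertools import accumulate


def best_boundaries(pred):
    """Return (start, end, errors) for labels 0=buildup, 1=event, 2=post-event.

    Bins [0, start) are assigned to the buildup, [start, end) to the event and
    [end, n) to the post-event phase; errors counts disagreeing predictions.
    """
    n = len(pred)
    # prefix[k][i] = number of bins among pred[:i] that were predicted as class k
    prefix = [[0] + list(accumulate(1 if p == k else 0 for p in pred))
              for k in range(3)]
    best = None
    for start in range(n + 1):
        for end in range(start, n + 1):
            agree = (prefix[0][start]
                     + prefix[1][end] - prefix[1][start]
                     + prefix[2][n] - prefix[2][end])
            errors = n - agree
            if best is None or errors < best[2]:
                best = (start, end, errors)
    return best


# Toy per-bin predictions (made up for illustration)
predictions = [0, 0, 1, 0, 1, 1, 1, 2, 1, 2, 2]
print(best_boundaries(predictions))
```

On this toy sequence the search returns `(2, 7, 2)`: the event spans bins 2 through 6, and two predictions disagree with that segmentation. The search is quadratic in the number of bins, which is acceptable for a sketch but may need refinement for long streams.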

The contributions of this research are a set of features for detecting temporal boundaries of events, a reasonable value of the tradeoff parameter for multiclass SVMs, an evaluation of the effect of smoothing SVM predictions using sliding windows of different sizes, and the results of our approach on real event data gathered from Twitter. Our approach can potentially be used to detect the presence and scope of significant sub-events occurring during the course of an event. When applied to natural disasters and man-made disturbances, the derived data can help organizations involved in mediation efforts to track and analyze evolving events.
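One way to picture the smoothing of predictions with a sliding window is a centered majority vote, sketched below. The majority-vote rule, the tie handling and the name `smooth_predictions` are assumptions made for illustration; the abstract does not say which smoothing rule was used.

```python
from collections import Counter


def smooth_predictions(pred, window=3):
    """Replace each label with the majority label in a centered window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(pred)):
        lo, hi = max(0, i - half), min(len(pred), i + half + 1)
        counts = Counter(pred[lo:hi])
        top = max(counts.values())
        if counts[pred[i]] == top:
            # The original label is (one of) the most frequent: keep it.
            smoothed.append(pred[i])
        else:
            # Otherwise take a most frequent label in the window.
            smoothed.append(max(counts, key=counts.get))
    return smoothed


noisy = [0, 0, 1, 0, 1, 1, 1, 2, 1, 2, 2]
print(smooth_predictions(noisy, window=3))
```

With a window of 3, the toy sequence becomes `[0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2]`, so the isolated flips disappear and each phase forms one contiguous run. Larger windows remove longer bursts of noise at the cost of blurring the true boundaries.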

MS Defense: Face Recognition for Mass Disaster Victims

MS Thesis Defense

Face Recognition using Gabor Jets for Images of Mass Disaster Victims

Kavita Dabke

11:00am Friday, 10 June 2011, ITE 325B

Mass disasters such as earthquakes, tsunamis, floods, landslides, blizzards and other natural calamities affect a large number of people in a short time. After such emergencies, the people affected need medical aid and are admitted to hospitals. In such conditions, it becomes difficult to locate one's family members and friends. Hospitals and medical centers take triage pictures of the people being admitted for their records. The content of these images could be very disturbing for some people to see, so such pictures cannot be posted on notification walls or websites for people to identify their missing family members or friends. This thesis addresses this problem by developing methods for searching triage image databases using query images provided by friends or family of missing people. The dataset for this thesis consists of mug shot images of people affected by a calamity, also called triage images. The test dataset consists of clean, or regular, mug shot images of people.

To automate the process of locating missing people, our thesis has a goal of developing a face recognition system based on Gabor Jets to match a clean image to the existing triage images. Here, a clean image means a mug shot image of a person where all features such as eyebrows, eyes, nose, lips, skin, ears, etc. are seen. The system aims at pulling up the exact match from the triage dataset into the top N matches filtered out based on a similarity measure. Face recognition has been studied for clean images, where all features are visible. We have developed a system to work on the domain of triage images by experimenting with existing Gabor Jets-based similarity measures and modifying the algorithm to best fit our needs.
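The top-N retrieval step can be sketched as follows, assuming each face has already been reduced to a list of Gabor jets (one vector of filter-response magnitudes per facial landmark). The normalized dot product used here is a commonly used magnitude-based jet similarity; the thesis may use a different or modified measure. The function names and toy numbers are hypothetical, and the Gabor filtering itself is not shown.

```python
import math


def jet_similarity(a, b):
    """Normalized dot product of two jets' magnitude vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


def face_similarity(face_a, face_b):
    """Average jet similarity over corresponding landmarks."""
    sims = [jet_similarity(a, b) for a, b in zip(face_a, face_b)]
    return sum(sims) / len(sims) if sims else 0.0


def top_matches(query, triage_db, n=3):
    """Return the n (image_id, score) pairs most similar to the query."""
    scored = [(image_id, face_similarity(query, face))
              for image_id, face in triage_db.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Toy jets: two landmarks, two filter responses each (made-up numbers)
query = [[1.0, 0.0], [0.5, 0.5]]
triage_db = {
    "t1": [[1.0, 0.1], [0.4, 0.6]],
    "t2": [[0.0, 1.0], [1.0, 0.0]],
    "t3": [[2.0, 0.0], [1.0, 1.0]],  # the query scaled by 2
}
print(top_matches(query, triage_db, n=2))
```

Because the measure is normalized, the scaled copy `t3` ranks first with a score of essentially 1.0, followed by `t1`. Averaging over landmarks is one simple design choice; occluded or injured regions in triage images might call for weighting or skipping landmarks instead.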

PhD proposal: Improving traffic flow forecasts for road networks with data assimilation

Ph.D. Dissertation Proposal

Improving traffic flow forecasts for road networks
with data assimilation

Shiming Yang

3:00pm Wednesday, 8 June 2011, ITE 325b

Macroscopic models for traffic flow in networks of roads are widely used in analyzing traffic phenomena and for the management and planning of transportation road systems. These models have various simplifying assumptions in order to be tractable. Moreover, we often have only partial and inaccurate knowledge of the model parameters. Consequently, there are modeling errors to be dealt with.

One approach to mitigating our partial knowledge and modeling uncertainties is to collect measurements of the real traffic system and use computational methods to assimilate them with the model in order to derive more accurate forecasts of the state of the system.

We propose to design, develop, and analyze methods for assimilating measurements from road networks to improve the accuracy of short-term forecasting of traffic flow in road networks. The proposed methods will overcome challenges due to the non-linearity of traffic flow behavior, high dimensionality of the modeled state space, and anisotropic non-Gaussian modeling and measurement error processes.
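To make the idea of assimilation concrete, the sketch below shows the simplest textbook case: a single scalar Kalman analysis step that blends a model forecast with one measurement according to their error variances. This linear, Gaussian case is deliberately much simpler than the non-linear, high-dimensional, non-Gaussian setting the proposal targets, and it is not the proposed method. The function name and the numbers are illustrative assumptions.

```python
def assimilate(forecast, forecast_var, measurement, measurement_var):
    """One scalar Kalman analysis step: blend a forecast with a measurement."""
    # The gain is the fraction of the forecast/measurement gap to correct.
    gain = forecast_var / (forecast_var + measurement_var)
    analysis = forecast + gain * (measurement - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Made-up example: the model forecasts a density of 40 vehicles/km (variance 16),
# while a road sensor reports 50 vehicles/km (variance 4).
state, var = assimilate(40.0, 16.0, 50.0, 4.0)
print(state, var)
```

The analysis lands closer to the sensor reading because the sensor is assumed to be the less uncertain source, and the analysis variance is smaller than either input variance.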


  • Dr. Kostas Kalpakis (chair)
  • Dr. Milton Halem
  • Dr. Yaacov Yesha
  • Dr. James Smith

MS defense: Gas Detection and Concentration Estimation via Mid-IR-based Gas Detection System Analysis Model

MSEE Thesis Defense

On Gas Detection and Concentration Estimation via
Mid-IR-based Gas Detection System Analysis Model

Yi Xin

2pm Monday, 6 June 2011, ITE 325

Owing to recent developments in laser technology and infrared spectroscopy, laser-based spectroscopy (LAS) has been used in a wide range of research and application fields. A particular application of interest is mid-IR laser-based gas detection systems for health and environment assessment. The NSF-ERC Mid-Infrared Technologies for Health and Environment (MIRTHE) project brings together engineers and researchers from different areas. As participants in MIRTHE, we study the performance of the integrated sensing system and possibilities for improving it.

We have improved a previously developed statistical analysis model for a generic mid-IR pulsed-laser gas detection system and used it to predict trace gas detection and concentration estimation performance, and their sensitivity to system parameters. Based on PNNL (Pacific Northwest National Laboratory) data and the Beer-Lambert law, we defined three main spectral peaks of a trace gas for detecting a target gas and evaluated 3-peak joint detection performance in terms of P_D (probability of detection) vs. P_FA (probability of false alarm). For concentration estimation we used the relationship between gas transmittance (beta), molar absorptivity (epsilon), concentration (c), the sample-mean measurement (x_N) from the photo-detector, and number of samples (N) as the basis. Using the standard confidence interval method, we evaluated estimation reliability, and then analyzed estimation errors.
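A minimal sketch of the concentration-estimation idea follows. It assumes the decadic form of the Beer-Lambert law, beta = 10^(-epsilon * c * l), that the sample mean x_N is a direct estimate of the transmittance beta, that the detector noise standard deviation `sigma` is known, and that the path length `path_len` (l) is given. None of these details are stated in the abstract, so the function and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def estimate_concentration(x_mean, n, sigma, epsilon, path_len, confidence=0.95):
    """Return (c_hat, c_low, c_high) from a sample-mean transmittance x_mean."""
    # Two-sided normal confidence interval on the mean of n samples
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sigma / math.sqrt(n)
    lower_t = x_mean - half_width
    upper_t = x_mean + half_width
    if lower_t <= 0:
        raise ValueError("interval reaches non-positive transmittance")

    def to_conc(transmittance):
        # Beer-Lambert (assumed decadic form): T = 10 ** (-epsilon * c * path_len)
        return -math.log10(transmittance) / (epsilon * path_len)

    # Transmittance falls as concentration rises, so the interval ends swap.
    return to_conc(x_mean), to_conc(upper_t), to_conc(lower_t)


# Made-up numbers in consistent units (epsilon in L/(mol*cm), path_len in cm)
c_hat, c_low, c_high = estimate_concentration(
    x_mean=0.1, n=100, sigma=0.01, epsilon=1.0, path_len=1.0)
print(c_hat, c_low, c_high)
```

Increasing N narrows the interval on the sample mean by a factor of 1/sqrt(N), which in turn tightens the bounds on the estimated concentration.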

Simulated gas-detection and concentration-estimation results are presented for 17 trace gases at 1 ppm and 1 ppb concentrations.


  • Dr. Joel M. Morris (chair)
  • Dr. Chuck LaBerge
  • Dr. Gymama Slaughter

PhD Proposal: Generating Linked Data by inferring the semantics of tables

Ph.D. Preliminary Examination

Generating Linked Data by inferring the semantics of tables

Varish Mulwad

9:30am Wednesday 25 May, 2011, ITE 325b

A vast amount of information is encoded in tables on the web, spreadsheets and databases. Considerable work has been focused on exploiting unstructured free text; however, techniques that are effective for documents and free text do not work well with tables. In this research we present techniques to generate high quality linked data from tables by jointly inferring the semantics of column headers, table cell values (e.g., strings and numbers), and relations between columns, augmented with background knowledge from open data sources such as the Linked Open Data cloud. We represent a table's meaning by mapping columns to classes from an appropriate ontology, linking cell values to literal constants or entities in the linked data cloud (existing or new), and discovering and identifying relations between columns. The interpreted meaning is represented as linked RDF assertions. An initial evaluation of our preliminary baseline system demonstrates the feasibility of tackling the problem. Based on this work and its evaluation, we are further developing our framework grounded in the theory of graphical models and probabilistic reasoning.
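The final representation step can be sketched as below: given a table whose column classes and column relations have already been decided, emit subject-predicate-object triples. The inference of those mappings, which is the core of the research, is not shown; here they are written by hand. The `http://example.org/` namespaces, the function names and the toy row are placeholders.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def entity_uri(value, base):
    """Mint a URI for a cell value (placeholder scheme, not real entity linking)."""
    return base + value.strip().replace(" ", "_")


def table_to_triples(rows, key_column, column_classes, column_relations,
                     base="http://example.org/resource/"):
    """Turn table rows into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = entity_uri(row[key_column], base)
        triples.append((subject, RDF_TYPE, column_classes[key_column]))
        for column, predicate in column_relations.items():
            value = row[column]
            if column in column_classes:
                # Columns mapped to a class hold entities.
                obj = entity_uri(value, base)
                triples.append((obj, RDF_TYPE, column_classes[column]))
            else:
                # Unmapped columns are kept as literal values.
                obj = value
            triples.append((subject, predicate, obj))
    return triples


rows = [{"City": "Baltimore", "State": "Maryland", "Nickname": "Charm City"}]
classes = {"City": "http://example.org/ontology/City",
           "State": "http://example.org/ontology/State"}
relations = {"State": "http://example.org/ontology/locatedIn",
             "Nickname": "http://example.org/ontology/nickname"}
for triple in table_to_triples(rows, "City", classes, relations):
    print(triple)
```

For the single toy row this yields four triples: two type assertions, one entity-to-entity relation and one literal-valued property.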

Committee members:

  • Dr. Tim Finin (chair)
  • Dr. Anupam Joshi
  • Dr. Tim Oates
  • Dr. Yun Peng
  • Dr. L V Subramaniam (IBM Research India)
  • Dr. Indrajit Bhattacharya (Indian Institute of Science)

Context-Aware Middleware for Activity Recognition, MS defense, Radhika Dharurkar, 10:30am 5/19

MS Thesis Defense

Context-Aware Middleware for Activity Recognition

Radhika Dharurkar

10:30am Thursday, 19 May 2011, ITE 325B

Smartphones and other mobile devices have a simple notion of context largely restricted to temporal and spatial coordinates. Service providers and enterprise administrators can deploy systems incorporating activity and relations context to enhance the user experience, but this raises considerable collaboration, trust and privacy issues between different service providers. Our work is an initial step toward enabling devices themselves to represent, acquire and use a richer notion of context that includes functional and social aspects such as co-located social organizations, nearby devices and people, typical and inferred activities, and the roles people fill in them.

We describe a system that learns to recognize richer contexts using sensor data from a person's Android phone along with annotations on her calendar and general background knowledge. Geo-social locations include the concepts of 'home' and 'school' and can be extended to others like 'work' or 'a restaurant'.

Our framework combines data from the phone's sensors (GPS, Wi-Fi, Bluetooth, acceleration, proximity, etc.) with data mined from applications (e.g., calendar) to produce features that can be used in a machine learning system. Training data from several university students and staff was collected using a system that periodically prompted the user for her true geo-social location and activity. The resulting classifier models were used to predict the individual user's context from new sensor data. The data from a set of users was combined to create a generic model.

We report on an evaluation of the individual and generic models in the university setting for predicting context. Finally, we discuss how our extended notion of context can be applied to many interesting applications for smartphone users.


  • Dr. Tim Finin (chair)
  • Dr. Anupam Joshi
  • Dr. Yelena Yesha
  • Dr. Laura Zavala

Equation Modeling in Resting State Motor Network in Healthy Subjects, MS defense, Tejaswini Kavallappa

MSEE Thesis Defense

Reliability of Structural Equation Modeling in Examining Resting State Motor Network in Healthy Subjects

Tejaswini Kavallappa

3pm Monday, 16 May 2011, ITE 325

Resting state connectivity studies are of growing significance and interest in the current neuroimaging literature due to their potential in explaining various underlying brain mechanisms and, therefore, their utility in clinical applications. While functional connectivity has been extensively examined in the human brain, effective connectivity is a burgeoning field in functional neuroimaging studies, and there is an increased interest in quantifying effective connectivity that takes into account the directional influences of various brain regions active in a particular functional network. Studies have shown the presence of multiple functional networks in the resting state, which have been shown to be consistent across subjects and between sessions. However, this is not the case with resting state effective connectivity.

In this thesis we evaluate effective connectivity of the resting state motor network in normal subjects using structural equation modeling (SEM), a linear statistical analysis method. It has been shown that signals related to cardiac pulsatility and respiration effects can confound functional MRI results. Thus, we have investigated the effect of various filtering strategies on the reliability of effective connectivity measurements. Our thesis examined resting state data under five conditions, an unfiltered baseline and four methods of physiological filtering:

  • preprocessed data without any filtering,
  • removal of prospectively recorded cardiac and respiratory fluctuations using RETROICOR,
  • removal of global average signal from all the brain voxels time series,
  • regressing out average signal of the white matter (WM) and cerebrospinal fluid (CSF), and
  • temporal filtering to remove frequencies pertinent to cardiac and respiratory sources.

The resulting effect of each of these methods on the estimation of resting state motor network effective connectivity was examined in this thesis.
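As an illustration of one of the strategies listed above, removal of the global average signal, the sketch below regresses the global mean time series out of each voxel's time series and keeps the residuals. It uses plain least squares on toy numbers; real pipelines operate on full 4D volumes with dedicated neuroimaging tools. The function name and data are hypothetical and this is not the thesis's processing code.

```python
def regress_out_global(voxels):
    """Remove the global mean signal from each voxel time series.

    voxels: list of equal-length time series. Returns demeaned residuals
    after least-squares regression on the centered global signal.
    """
    n_t = len(voxels[0])
    global_signal = [sum(v[t] for v in voxels) / len(voxels) for t in range(n_t)]
    g_mean = sum(global_signal) / n_t
    g_centered = [g - g_mean for g in global_signal]
    g_power = sum(g * g for g in g_centered)
    cleaned = []
    for series in voxels:
        s_mean = sum(series) / n_t
        # Least-squares slope of this voxel on the centered global signal
        beta = (sum(g * (s - s_mean) for g, s in zip(g_centered, series)) / g_power
                if g_power else 0.0)
        cleaned.append([(s - s_mean) - beta * g
                        for s, g in zip(series, g_centered)])
    return cleaned


# Three toy voxels, four time points (made-up values)
voxels = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, 0.0, 1.0, 0.0]]
for residual in regress_out_global(voxels):
    print([round(r, 3) for r in residual])
```

A known side effect of this kind of regression is that the residuals sum to zero across voxels at every time point, which is one reason its influence on connectivity estimates deserves the kind of comparison described above.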


  • Dr. Joel M. Morris (chair)
  • Dr. Rao P. Gullapalli (co-advisor)
  • Dr. Tulay Adali
  • Dr. Alan B. McMillan

Group Recognition in Social Networking Systems, MS Defense by Nagapradeep Chinnam

MS Thesis Defense

Group Recognition in Social Networking Systems

Nagapradeep Chinnam

1:30pm Tuesday, 17 May 2011, ITE 325

Recent years have seen an exponential growth in the use of social networking systems, enabling their users to easily share information with their connections. A typical Facebook user, as an example, might have 300-400 connections, which include relatives, friends, business associates and casual acquaintances. Sharing information with such a large and diverse set of people without violating social norms or privacy can be challenging. Allowing users to define groups and restrict information sharing by group reduces the problem but introduces new ones: managing groups and their members, relations and information sharing policies. This thesis addresses the problem of maintaining group membership.

We describe a system that learns to classify a user's new connections into one or more existing groups based on the connection's attributes and relations. We demonstrate the approach using data collected from real Facebook users. The two major tasks are identifying the relevant features for the classification and selecting the learning mechanism that best suits the task. Another significant challenge is posed by hierarchical and overlapping groups. We show that our system classifies new connections into these groups with high accuracy even with only 10-20% of labeled data.


  • Dr. Tim Finin (chair)
  • Dr. Anupam Joshi
  • Dr. Tim Oates

Community Detection in Twitter, MS defense by Mohit Kewalramani, 1pm Mon 5/16

MS Thesis Defense

Community Detection in Twitter

Mohit Kewalramani

1:00pm Monday, 16 May 2011, ITE 346

Twitter has evolved into a source of social, political and real-time information in addition to being a means of mass communication and marketing. Monitoring and analyzing information on Twitter can lead to invaluable insights, which might otherwise be hard to get using conventional media resources. An important task in analyzing highly networked information sources like Twitter is to identify the communities that are formed. A community on Twitter can be defined as a set of users that are more similar to other members than to non-members.

We present a technique to devise a similarity metric between any two users on Twitter based on the similarity of their content, links and metadata. The link structure on Twitter can be characterized using the Twitter notions of followers and being followed, and the @Mentions, @Reply and @RT tags in tweets. Content similarity is characterized by the words in the tweets combined with the hashtags they are annotated with. Metadata similarity includes similarity based on other sources of user information such as location, age and gender. We then use this similarity metric to cluster users into communities using spectral and bottom-up agglomerative hierarchical clustering. We evaluate the performance of clustering using different similarity measures on different types of datasets. We also present a heuristic for finding communities in Twitter that takes advantage of Twitter's network characteristics.
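A combined similarity metric and bottom-up clustering of this kind might be sketched as follows. The use of Jaccard overlap for each component, the weights, the average-linkage rule, the stopping threshold and the toy users are all assumptions for illustration; the thesis's actual measures and its spectral clustering variant are not shown.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_similarity(u, v, weights=(0.5, 0.3, 0.2)):
    """Weighted mix of content, link and metadata similarity (assumed weights)."""
    w_content, w_links, w_meta = weights
    return (w_content * jaccard(u["terms"], v["terms"])
            + w_links * jaccard(u["links"], v["links"])
            + w_meta * jaccard(u["meta"], v["meta"]))


def agglomerate(users, threshold=0.3):
    """Average-linkage bottom-up clustering; stop when no pair reaches threshold."""
    clusters = [[name] for name in users]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(user_similarity(users[a], users[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        score, i, j = max((linkage(clusters[i], clusters[j]), i, j)
                          for i in range(len(clusters))
                          for j in range(i + 1, len(clusters)))
        if score < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical users: tweet terms and hashtags, mentioned accounts, metadata
users = {
    "ann": {"terms": {"#cricket", "worldcup", "india"}, "links": {"@icc", "@bcci"}, "meta": {"loc:mumbai"}},
    "raj": {"terms": {"#cricket", "worldcup", "final"}, "links": {"@icc"}, "meta": {"loc:mumbai"}},
    "sue": {"terms": {"#superbowl", "packers"}, "links": {"@nfl"}, "meta": {"loc:chicago"}},
    "tom": {"terms": {"#superbowl", "steelers", "packers"}, "links": {"@nfl", "@espn"}, "meta": {"loc:chicago"}},
}
print(agglomerate(users))
```

On this toy data the two cricket followers and the two football followers end up in separate communities, since cross-group similarity is zero and falls below the threshold.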


  • Dr. Tim Finin (chair)
  • Dr. Anupam Joshi
  • Dr. Tim Oates

MS defense: Lohr on Semantic Light, 2:15 Thu 5/12

MS Thesis Defense

Semantic Light: Building Blocks

Charles Lohr

2:15pm Thursday, 12 May 2011, ITE 346

The concept of Semantic Light is simply that lighting systems can be aware of what they are lighting. This offers a number of potential advantages over conventional lighting in quality and efficiency. Semantic Light requires fine grained control of the output of many lights and requires sensors to take in information about what is being lit. It uses this information to control the output lighting in great detail. By running various algorithms, Semantic Light can provide information to the user and has a number of applications including augmented reality.

Traditional lighting that is currently in wide use has limited control of quality and quantity of the light produced. Few lights for large-scale use are intended to control their output in any kind of detailed manner. Most area lighting only has a switch that must be manually turned on or off. While there are many commercial systems that allow for more fine grained control, they are typically limited to remote control, motion control and extra manual controls. These systems can be wasteful, or they may provide inappropriate amounts of light, or they may be on when no one is using them.

While other Semantic Lighting systems may focus on "green" or power-saving aspects, we concentrate instead on innovative roles Semantic Light could play as well as on the technology to make it possible to fill those roles. By emphasizing new utility and maximizing our speed to prototype, we have made several tradeoffs that will cause our system to be less efficient than it could be, even less efficient than traditional lighting systems. The ideas and concepts covered, however, could be adapted to different underlying technologies to produce a product that could provide considerable power savings over conventional lighting.

It is important to think of the many concepts covered as primary building blocks, rather than a complete commercial system. A number of refinements and extensions will be needed to produce a commercially viable product. We demonstrate all of the needed building blocks in a concise, prototyped system.


  • Mark Olano
  • Yelena Yesha
  • Zary Segall (advisor)