PhD Defense: Clustering and Visualization Techniques for Aggregate Trajectory Analysis

Ph.D. Dissertation Defense

Clustering and Visualization Techniques
for Aggregate Trajectory Analysis

David Trimm

1:00pm Thursday, 15 March 2012, ITE 365

Analyzing large trajectory sets enables deeper insights into multiple real-world problems. For example, animal migration data, multi-agent analysis, and virtual entertainment can all benefit from deriving conclusions from large sets of trajectory data. However, the analysis is complicated by several factors when using traditional analytic techniques. For example, directly visualizing the trajectory set results in a multitude of lines that cannot be easily understood. Statistical analysis methods and non-direct visualization techniques (e.g., parallel coordinates) produce conclusions that are non-intuitive and difficult to understand. By combining two complementary processes, clustering and visualization, a new approach to analyzing large trajectory sets is developed. First, clustering techniques are developed and refined to group related trajectories together. From these similar sets, a trajectory composition visualization is created and implemented that clearly depicts the cluster characteristics, including application-specific attributes. The effectiveness of the approach is demonstrated on two separate and unique data sets, resulting in actionable conclusions. The first application, multi-agent analysis, represents a rich, spatial data set that, when analyzed using this approach, reveals ways to improve the underlying artificial intelligence algorithms. Student course-grade history analysis, the second application, requires tailoring the approach for a non-spatial data set. However, the results enable a clear understanding of which courses are most critical in a student's career and which student groups require assistance to succeed. In summary, this research contributes to methods for trajectory clustering, techniques for large-scale visualization of trajectory data, and processes for analyzing student data.
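As a rough illustration of the clustering half of such a pipeline (not the specific techniques developed in the dissertation), the sketch below groups 2-D trajectories with a symmetric Hausdorff distance and average-linkage agglomerative clustering; the distance choice, toy trajectories, and cut-off threshold are assumptions made only for this example.

```python
# Hypothetical sketch: group 2-D trajectories by shape similarity.
# Not the dissertation's algorithm; distance metric and threshold are
# illustrative assumptions only.
import numpy as np
from scipy.spatial.distance import directed_hausdorff, squareform
from scipy.cluster.hierarchy import linkage, fcluster

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two (n, 2) point arrays."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

def cluster_trajectories(trajectories, max_dist):
    """Average-linkage agglomerative clustering over pairwise distances."""
    n = len(trajectories)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = hausdorff(trajectories[i], trajectories[j])
            dist[i, j] = dist[j, i] = d
    labels = fcluster(linkage(squareform(dist), method="average"),
                      t=max_dist, criterion="distance")
    return labels  # one cluster id per trajectory

# toy data: two nearby straight paths and one diverging path
t = np.linspace(0.0, 1.0, 20)
trajs = [np.c_[t, t], np.c_[t, t + 0.1], np.c_[t, 1.0 - t]]
print(cluster_trajectories(trajs, max_dist=0.5))  # first two paths group together
```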

Committee

  • Dr. Penny Rheingans (chair)
  • Dr. Marie desJardins
  • Dr. Anupam Joshi
  • Dr. Marc Olano
  • Dr. Sreedevi Sampath

talk: Self-sustainable Cyber-physical System Design

Self-sustainable Cyber-physical System Design

Dr. Nilanjan Banerjee
University of Arkansas Fayetteville

1:00pm Tuesday 13 March 2012, ITE 325b UMBC

Renewable energy can enable diverse self-sustainable cyber-physical systems with applications ranging from healthcare to off-grid home energy management. However, there are several challenges that need to be addressed before such systems can be realized. For instance, how do we balance the small and often variable energy budgets imposed by renewables with system functionality? How can we design sensitive physical sensors and efficient harvesting circuits for mW energy sources such as sound and indoor light? For systems such as off-grid homes that interact with humans, how do we balance demand and supply while being cognizant of usability needs?

In this talk, I will present techniques that address these challenges. Specifically, I will propose a Hierarchical Power Management paradigm that combines platforms with varied energy needs to balance energy consumption and functionality, the design of an efficient harvester for sound scavenging, and sensitive ECG sensors. I will also present a measurement study that reveals the energy management challenges faced by off-grid home residents. Finally, I will conclude with the design of a solar replayer platform that allows immense flexibility in evaluating solar panel driven systems, and works for a wide range of panels.

Nilanjan Banerjee is an Assistant Professor in the Department of Computer Science and Computer Engineering at the University of Arkansas Fayetteville. He graduated with an M.S. and a Ph.D. from the University of Massachusetts at Amherst in 2009 and a B.Tech. (Hons.) from IIT Kharagpur in 2004. He has won the Yahoo! Outstanding Dissertation Award at UMass, a best undergraduate thesis award at IIT Kharagpur, and an Outstanding Researcher award at the University of Arkansas. He is a 2011 NSF CAREER awardee and has won three other NSF awards (including an NSF I-Corps grant). His research interests span renewable energy driven systems, healthcare systems, and mobile systems.

Host: Anupam Joshi
See http://www.csee.umbc.edu/talks for more information

talk: Correlation Aware Optimizations for Analytic Databases

Correlation Aware Optimizations for Analytic Databases

Hideaki Kimura, Brown University

1:00pm Friday 9 March 2012, ITE 325b, UMBC

Recent years have shown that the analysis of large data sets is crucially important in a wide range of business, governmental, and scientific applications. For example, research projects in astronomy need to analyze petabytes of image data taken from telescopes. Providing a fast and scalable analytical data management system for such users has become increasingly important.

The major bottlenecks for analytics on such big data are disk and network I/O. Because the data is too large to fit in RAM, each query causes substantial disk I/O. Traditional database systems provide indexes to speed up disk reads, but many analytic queries do not benefit from indexes because data is scattered over a large number of disk blocks and disk seeks are prohibitively expensive. Furthermore, such huge data sets need to be partitioned and distributed over hundreds or thousands of nodes. When a query requires data from more than one partition at once, such as a query involving a JOIN operation, the data management system must transmit a large amount of data over the network. For example, the Shuffle phase in Map-Reduce systems copies file blocks over the network and causes a significant bottleneck in many cases.

Our approach to tackling these challenges in big data analytics is to exploit correlations. I will describe our correlation-aware indexing, replication, and data placement which make big data analytics faster and more scalable.

Finally, if time allows, I will also introduce another on-going project to develop a scalable transactional processing system on modern hardware in collaboration with Hewlett-Packard Laboratories.

Hideaki Kimura is a doctoral candidate in the Computer Science Department at Brown University. His main research interests are in data management systems. His dissertation research with Prof. Stan Zdonik is on correlation-based optimizations for large analytic databases. He also worked on transaction processing systems exploiting modern hardware at HP Labs.

Host: Anupam Joshi
See http://www.csee.umbc.edu/talks for more information

talk: Energy Efficient and High Performance Architectures for DSP and Communication Applications

EE Graduate Seminar

Energy Efficient and High Performance Architectures
for DSP and Communication Applications

Tinoosh Mohsenin, PhD
Assistant Professor of Computer Engineering

CSEE Dept/UMBC

11:30am-12:45pm, 9 March 2012, ITE 231

Many emerging and future communication applications require a significant amount of high-throughput data processing and must operate with decreasing power budgets. This need for greater energy efficiency and improved performance of electronic devices demands co-optimization of algorithms, architectures, and implementations. This talk presents several design projects that illustrate such cross-domain optimization.

The design of System-on-Chip (SoC) blocks becomes increasingly sophisticated with emerging communication standards that have large real-time computational requirements. Two such algorithms, Low Density Parity Check (LDPC) decoding and Compressive Sensing (CS), have received significant attention. LDPC decoding is an error correction technique which has shown superior error correction performance and has been adopted by several recent communication standards. Compressive sensing is a revolutionary technique which reduces the amount of data collected during acquisition and allows sparse signals and images to be recovered from far fewer samples than traditional Nyquist sampling requires. While both LDPC decoding and compressive sensing have several advantages, they rely on computationally intensive algorithms which typically suffer from high power consumption and low clock rates. This talk presents novel algorithms and architectures to address these challenges.
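To give a concrete, hedged sense of the compressive sensing idea mentioned above (this is not the speaker's hardware algorithm), the sketch below recovers a sparse signal from far fewer random measurements than its length using orthogonal matching pursuit; the signal size, sparsity level, and measurement matrix are arbitrary assumptions for illustration.

```python
# Illustrative only: sparse recovery from few random measurements via
# orthogonal matching pursuit (OMP). Sizes and sparsity are assumptions.
import numpy as np

def omp(A, y, k):
    """Recover a k-sparse x from y = A @ x by greedy support selection."""
    residual, support = y.copy(), []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        # pick the column most correlated with the current residual
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 64, 5                          # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x_true                                # m << n samples
x_hat = omp(A, y, k)
print("max abs recovery error:", np.abs(x_hat - x_true).max())
```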

As future communication systems demand increasing flexibility and performance within a limited power budget, multi-core and many-core chip architectures have become a promising solution. The design and implementation of a many-core platform capable of performing DSP applications is presented. The low-power, low-area core processors are connected through a hierarchical network structure. The network protocol includes contention resolution for high data traffic between cores. The result is a platform with higher performance and lower power consumption than a traditional DSP, with the ease of programmability lacking in an ASIC. Early post-place-and-route results from a standard-cell design give a processor area of 0.078 mm² per core in TSMC's 65 nm technology.

Dr. Mohsenin received the B.S. degree in electrical engineering from the Sharif University of Technology, Iran, and the M.S. and PhD degrees in electrical and computer engineering from Rice University and the University of California, Davis, in 2004 and 2010, respectively. In 2011, she joined the Department of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County, where she is currently an Assistant Professor. Dr. Mohsenin's research interests lie in the areas of high-performance and energy-efficient programmable and special-purpose processors. She is the director of the Energy Efficient High Performance Computing (EEPC) Lab, where she leads projects in architecture, hardware, software tools, and applications for VLSI computation with an emphasis on DSP workloads. Dr. Mohsenin has been a consultant to early-stage technology companies and currently serves as a member of the Technical Program Committees of the IEEE Biomedical Circuits & Systems Conference (BioCAS), the Life Science Systems and Applications Workshop (LiSSA), and IEEE Women in Circuits and Systems (WiCAS).

Host: Prof. Joel M. Morris

Niels Kasch PhD Defense: Mining Commonsense Knowledge from the Web

Ph.D. Dissertation Defense

Mining Commonsense Knowledge from the Web:
Towards Inducing Script-like Structures From Large-scale Text Sources

Niels Kasch

10:00am Friday, March 9th, 2012, ITE 325B

Knowing the sequences of events in situations such as eating at a restaurant is an example of the commonsense knowledge needed for a broad range of cognitive tasks (e.g., language understanding). This thesis outlines an approach to mine information about sequential, everyday situations in a topic-driven fashion to produce declarative, script-like representations (cf. Schank's scripts). Given a topic such as eating at a restaurant, we produce graphs of temporally ordered events involved in the activity referenced by the topic. Our work utilizes large-scale data sources (e.g., the Web) to avoid the data sparseness issues of narrow corpora.

We describe steps that address the scale and noisiness of the Web to make it accessible for script extraction. Boilerplate elements (e.g., navigation bars and advertising) on web pages skew distributional statistics of words and obstruct information retrieval tasks. To make the web usable as a corpus, we introduce a machine learning technique to separate boilerplate elements from content in arbitrary web pages.

A key element for commonsense knowledge extraction is the generation of a topic-specific corpus that facilitates script extraction in a topic-driven manner. We introduce Concept Modeling for Scripts as an efficient method to induce concepts containing script elements (e.g., events, people, and objects) from topic-specific corpora. Our experiments and user studies conducted on the 2011 ICWSM Spinn3r dataset show that our method outperforms state-of-the-art topic-modeling approaches such as Latent Dirichlet Allocation (LDA) on this task when applied to unbalanced (topic-specific) corpora.

Concept Modeling serves as a starting point for automated methods to discover events relevant to a script. We demonstrate event detection methods in topic-specific corpora based on (1) learned dependency paths indicative of individual event structures, (2) semantic cohesiveness of event pairs, and (3) surface structures indicative of golden sentences containing sequential information. Events extracted for a given topic can be arranged in a graph. The detection methods exploit graph analysis, identifying strongly connected components to prune the event set so that related and central events predominate in the structure. User studies demonstrate that (1) the Web is suitable for mining script-like knowledge and (2) the resulting graph structures portray events strongly related to a given topic.
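The pruning step can be pictured with a small, invented example (the event names and edges below are hypothetical and this is not the dissertation's implementation): keep only the events that fall in the largest strongly connected component of the directed event graph.

```python
# Hypothetical illustration of pruning an event graph by keeping the
# largest strongly connected component. Events and edges are invented.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("enter restaurant", "order food"),
    ("order food", "eat food"),
    ("eat food", "pay bill"),
    ("pay bill", "enter restaurant"),    # cycle among core script events
    ("read review", "enter restaurant"),  # peripheral, one-way connection
])

largest_scc = max(nx.strongly_connected_components(G), key=len)
pruned = G.subgraph(largest_scc).copy()
print(sorted(pruned.nodes()))  # only the mutually connected, central events remain
```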

Script-like structures, by definition, impose a temporal ordering on the events contained within the structure. This work also presents a novel method to induce ordering information from topic-specific corpora based on a counting framework that judges the presence and strength of a temporal happens-before relation. The framework is extensible to several counting methods, where a counting method provides co-occurrence and ordering statistics. We present, among others, a novel naive counting method that uses a simple sentence-position assumption for temporal order. Comparisons to existing temporal resources show that our naive method, in conjunction with connected-components analysis, induces temporal relationships with accuracy similar to that of more sophisticated methods, yet with a smaller computational footprint.
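A minimal sketch of the sentence-position assumption, under toy assumptions (the corpus, event phrases, and string-matching rule below are invented and far cruder than the dissertation's counting framework): for each document, count the pair (a, b) as evidence for "a happens before b" whenever a's first mention appears in an earlier sentence than b's.

```python
# Toy sketch of a naive happens-before counter using sentence position.
# The corpus, event phrases, and matching rule are illustrative assumptions.
from collections import Counter
from itertools import combinations

mentions = {                       # hypothetical surface forms per event
    "order food": ("ordered food", "order food"),
    "eat food": ("ate food", "eat food"),
    "pay bill": ("paid the bill", "pay the bill"),
}
corpus = [
    "We ordered food quickly. Then we ate food. Finally we paid the bill.",
    "They ordered food. They ate food and paid the bill.",
]

def first_position(forms, sentences):
    """Index of the first sentence containing any surface form, else None."""
    for i, s in enumerate(sentences):
        if any(f in s for f in forms):
            return i
    return None

before = Counter()
for doc in corpus:
    sentences = doc.lower().split(". ")
    pos = {e: first_position(f, sentences) for e, f in mentions.items()}
    for a, b in combinations(mentions, 2):
        if pos[a] is not None and pos[b] is not None and pos[a] != pos[b]:
            before[(a, b) if pos[a] < pos[b] else (b, a)] += 1

print(before.most_common())  # counts of observed happens-before evidence
```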

Committee

  • Dr. Tim Oates (chair)
  • Dr. Ronnie W. Smith
  • Dr. Matt Schmill
  • Dr. Tim Finin
  • Dr. Charles Nicholas

talk: Quantum Knots and Quantum Braids

Quantum Knots and Quantum Braids

Dr. Samuel J. Lomonaco
University of Maryland, Baltimore County

Noon-1:00pm Friday, March 9, 2012, MP 401

In this talk, we show how to reconstruct knot theory in such a way that it is intimately related to quantum physics. In particular, we give a blueprint for creating a quantum system that has the dynamic behavior of a closed knotted piece of rope moving in 3-space. Within this framework, knot invariants become physically measurable quantum observables, knot moves become unitary transformations, and knot dynamics are determined by Schroedinger's equation. The same approach can also be applied to the theory of braids. Toward the end of the talk, we briefly look at possible applications to superfluid vortices and to topological quantum computing in optical lattices.

Professor Lomonaco received his PhD in Mathematics from Princeton University. He has been a full professor of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County (UMBC) since 1985, serving as Founding Chair of the CS Department from 1985 to 1991.

talk: Interactive visual computing for knowledge discovery in science, engineering and training

Interactive visual computing for knowledge discovery
in science, engineering and training

Dr. Jian Chen
University of Southern Mississippi

1:00pm Wednesday 7 March 2012, ITE 325b UMBC

Advances in simulations and lab experiments are producing huge datasets at unprecedented rates, and deriving meanings from these data will have far-reaching impacts on our lives in many areas of science, engineering, and medicine. Visualization and interactive computing provide great tools for exploiting these data in scientific discovery and engineering innovations. A limiting factor in the scientific use of visualization tools is the lack of guiding principles to identify and assess visualization methods that are helpful in scientific tasks. In this talk, I present research designed to advance knowledge discovery through the design and evaluation of interactive visualizations. Experiments on image illumination and density are described that successfully address this limitation in brain imaging for medical diagnoses. I also present the theoretical foundations that have led to the various choices in visualization design. In the second part of the talk, I argue that most existing tools designed for scientific discovery fail to address the dynamic nature of the discovery workflow. I present a new visualization tool, VisBubbles, that integrates programming, visualization, and interaction in one environment to create fluid workflows in which new hypotheses can be tested efficiently. VisBubbles augments interactive computing and analysis of time-varying motion data of bat flights by enabling dynamic displays, thus facilitating scientists' quest for new knowledge. I present the design methods we have followed in our long-term collaboration with biologists and engineering scientists on motion analysis. Finally, I present future work I envision in interactive visualization that will be critical in developing future visualization tools for science, engineering, and training.

Jian Chen is an assistant professor in the School of Computing at the University of Southern Mississippi. She is the founder and director of the Interactive Visual Computing Lab. Her research is in the broad area of interaction and visualization, with current focuses on the emerging field of scientific visualization theory and workflow analysis. She has published numerous articles in top journals and international conferences. Her panel on combining human-centered computing and scientific visualization received an honorable mention at the 2007 IEEE Visualization Conference. She was a postdoc at Brown University with Drs. David H. Laidlaw (CS) and Sharon Swartz (BioMed) from 2006 to 2009. She has a Ph.D. degree in Computer Science from Virginia Tech and Master's degrees in both Computer Science and Mechanical Engineering. Her research has been funded by DHS and NSF.

Host: Penny Rheingans

See http://www.csee.umbc.edu/talks for more information

talk: Using Static Analysis to Diagnose Misconfigured Open Source Systems Software

Using Static Analysis to Diagnose
Misconfigured Open Source Systems Software

Ariel Rabkin, UC Berkeley

1:00pm Monday 5 March 2012, ITE 325b UMBC

Ten years ago, few software developers worked on distributed systems. Today, developers often run code on clusters, relying on large open-source software stacks to manage resources. These systems are challenging to configure and debug. Fortunately, developments in program analysis have given us new tools for managing the complexity of modern software. This talk will show how static analysis can help users configure their systems. I present a technique that builds an explicit table mapping a program's possible error messages to the options that might cause them. As a result, users can get immediate feedback on how to resolve configuration errors.
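The flavor of such a message-to-option table can be suggested with a tiny invented example (this is not the speaker's tool, and the patterns and option names are hypothetical): match a logged error message against known patterns and report the configuration options that could be responsible.

```python
# Invented illustration of an error-message-to-option lookup table;
# the patterns and option names below are hypothetical, not from any real tool.
import re

MESSAGE_TO_OPTIONS = [
    (re.compile(r"connection refused to (\S+):(\d+)"),
     ["namenode.address", "namenode.port"]),
    (re.compile(r"permission denied: (\S+)"),
     ["staging.dir", "local.user"]),
]

def diagnose(error_line):
    """Return the config options that might cause the given error message."""
    for pattern, options in MESSAGE_TO_OPTIONS:
        if pattern.search(error_line):
            return options
    return []

print(diagnose("ERROR connection refused to node7:9000"))
# -> ['namenode.address', 'namenode.port']
```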

Ari Rabkin is a PhD student in Computer Science at UC Berkeley working in the AMP Lab. His current research interest is the software engineering and administration challenges of big-data systems. He is particularly interested in applying program analysis techniques to tasks like log analysis and configuration debugging. His broader interests center on systems and security, including improving system usability by making systems easier to understand, the connections between computer science research and technology policy, and developing program analysis techniques that work acceptably well on large, complex, messy software systems.

Host: Anupam Joshi
See http://www.csee.umbc.edu/talks for more information

talk: Building and Testing Distributed Systems

Building and Testing Distributed Systems

Dr. Charles Killian
Purdue University, Computer Science

1:00pm Friday, 2 March 2012, ITE325 UMBC

Building distributed systems is particularly difficult because of the asynchronous, heterogeneous, and failure-prone environment in which these systems must run. This asynchrony makes verifying the correctness of system implementations even more challenging. Tools for building distributed systems must often strike a compromise between reducing programmer effort and increasing system efficiency. In my research, we strive to introduce a limited amount of structure into implementations to enable a wide range of analysis and development assistance. Most prominently, we have built the Mace language and runtime, which translates a concise, expressive distributed system specification into a C++ implementation. The Mace specification importantly exposes three key pieces of structure: atomic events, explicit state, and explicit messaging.

With a few additional contextual annotations, we show how we can support intra-node parallel event processing of these atomic events while still preserving sequential event consistency, even using variably available computing resources distributed across a cluster. By leveraging these three structural elements, we have further built tools such as a model checker capable of detecting liveness violations in systems code, a performance tester, and an automated malicious protocol tester. Recent research has also explored applications of these key structures in legacy software, producing a log analysis tool that can detect performance problems and a malicious fault injector that can discover successful performance attacks. Mace has been in development since 2004 and has been used to build a wide variety of Internet-ready distributed systems, both by myself and by researchers at places such as Cornell University, Microsoft Research (Redmond, Silicon Valley, and Beijing), HP Labs, UCLA, EPFL, and UCSD. This talk will give an overview of my research, presenting the execution model and its checker, support for event parallelization, and our more recent testing tools.
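As a purely conceptual toy, and not Mace syntax, the sketch below shows the flavor of those three structural elements in Python: explicit per-node state, sends made explicit through a network object, and message handling as atomic events that run to completion one at a time.

```python
# Conceptual toy (not Mace): atomic event handlers over explicit state,
# with all outgoing messages made explicit through a network object.
from collections import deque

class Network:
    def __init__(self):
        self.queue = deque()          # messages in flight
    def send(self, sender, message):
        self.queue.append((sender, message))

class Node:
    def __init__(self, name, network):
        self.name = name              # explicit state lives in instance fields
        self.count = 0
        self.network = network

    def handle(self, message):
        """One atomic event: runs to completion before the next is handled."""
        if message == "ping":
            self.count += 1                        # explicit state change
            self.network.send(self.name, "pong")   # explicit messaging
        elif message == "pong":
            self.count += 1

net = Network()
node = Node("a", net)
node.handle("ping")                  # deliver one event atomically
print(node.count, list(net.queue))   # -> 1 [('a', 'pong')]
```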

Charles Killian is an Assistant Professor in the Department of Computer Science at Purdue University. He received an NSF CAREER award in 2011, as well as an HP Open Innovation award. In 2008 he completed his Ph.D. in Computer Science at the University of California, San Diego under the supervision of Amin Vahdat. Before transferring to UCSD in August 2004, he completed his Masters in Computer Science at Duke University with Amin Vahdat. His systems and networking research focuses on building and testing distributed systems, and bridges this research with software engineering, security, data mining, and programming languages. Since 2004 he has implemented the Mace programming language and runtime, built numerous distributed systems, and designed MaceMC, the first model checker capable of finding liveness violations in unmodified systems code, which won a best paper award at NSDI 2007. Chip has built many additional tools and enhancements since then, including performance testing, work on parallel event processing, automated attack discovery, and data mining logs to discover performance problems.

talk: Spectrum Wars: LightSquared vs. GPS, 11:30am Fri 3/2

EE Graduate Seminar

Spectrum Wars: LightSquared vs. GPS

Professor Chuck LaBerge
Professor of the Practice, CSEE Dept/UMBC

11:30am-12:45pm Friday, 2 March 2012, ITE 231

The radio-frequency spectrum is a limited resource. Within the US, commercial use of the spectrum is administered by the Federal Communications Commission (FCC), while government use of the spectrum is administered by the National Telecommunications and Information Administration. Currently, the regulatory community is locked in a battle about spectrum utilization in the vicinity of 1.5 GHz. This struggle pits millions of users of GPS technology for position and time information against technical innovators desiring to bring 4G wireless communications to millions of users in underserved populations. So who wins the spectrum wars?

The talk will outline the technologies involved and provide a timeline of the regulatory actions to date. There are some innovative things going on here, and some simple analysis will show why there are points of contention. A final resolution cannot be provided at this time, because the issue is currently under open discussion at the FCC. And, as might be expected, there are financial and political ramifications as well.

This talk will provide an interesting insight into how the 'real world' works.

Dr. LaBerge is Professor of the Practice of Electrical and Computer Engineering in the CSEE Department at UMBC, where he teaches a wide variety of courses ranging from Introductory Circuits to Error Correcting Codes. From 1975 to 2008, he was employed by Bendix, which became AlliedSignal, which became Honeywell through a series of corporate mergers. He retired in July 2008 as the Senior Fellow for Communications, Navigation, and Surveillance in Honeywell's Aerospace Research and Technology Center.

Dr. LaBerge has worked on precision landing systems and a wide variety of aeronautical radios and applications. He's recognized as an expert in issues involving interference to aeronautical systems. His technical, writing, and editorial contributions have received numerous citations from regulatory bodies, and he was the winner of the Best Paper of Conference at the 2000 IEEE/AIAA Digital Avionics Systems Conference.

Dr. LaBerge is a Senior Member of the IEEE, a member of Tau Beta Pi, and an inductee in the Order of the Engineer. He received his BES-EE and MSE-EE degrees, both with honors, from The Johns Hopkins University, and his PhD in Electrical Engineering from UMBC. His three kids are older than his students. He's been married to his patient wife for almost 38 years.

Host: Prof. Joel M. Morris
