From dralansherman@starpower.net Thu May  1 01:24:08 2008
Date: Thu, 1 May 2008 00:34:04 -0400
From: Dr. Alan T. Sherman <dralansherman@starpower.net>
To: CSEE ALL <csee@csee.umbc.edu>
Subject: [Csee-faculty-lecturer] CSEE Research Review - Poster Abstracts


CSEE Research Review - Poster Abstracts

Friday, May 2, 2008

 
Department of Computer Science and Electrical Engineering

University of Maryland, Baltimore County (UMBC)

 
BS Students

 
1. Stephen Sullivan, Ebiquity

Polvox: Identifying Political Affiliations within the Blogosphere

 
The Polvox project aims to develop tools and compare techniques for
predicting political affiliations as well as finding memes and
communities in the blogosphere through the application of semantic
analysis. The focus of our research, a part of the larger Polvox
effort, has been to use machine learning to identify the political
leanings of blogs. For the purposes of our research effort we are
only looking at Democratic versus Republican bias. In future work, we
plan to extend our research to explore other characteristics such as
political issues, candidates, geographical locations, and races. Our
system has several components: a Greasemonkey script that people use
to tag a site as Democratic or Republican, a component to parse and
index blogs, and a classifier. We are investigating several kinds of
classifier features and combinations of them, such as bag-of-words,
n-grams, and hyperlinks. To train the classifiers we are using
human-labeled data marked as either Democratic-leaning or
Republican-leaning. We will report on
the results obtained and discuss which features seem to be the most
discriminative in terms of identifying political blogs.
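
As a rough illustration of two of the feature types named above, the
sketch below extracts bag-of-words and n-gram counts from a piece of
text. The function names and the whitespace tokenizer are assumptions
made for this example; the abstract does not describe the project's
actual feature extraction.

```python
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system
    # would likely strip punctuation and markup as well.
    return text.lower().split()

def bag_of_words(text):
    # Unordered word counts for one document.
    return Counter(tokenize(text))

def ngrams(text, n=2):
    # Counts of each run of n consecutive tokens.
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

words = bag_of_words("the vote the count")
bigrams = ngrams("a b a b")
```

Either counter could then be fed to a classifier as a feature vector.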

 
MS Students

 
2. Jonathan Bronson (Advisor: Rheingans), VANGOGH Lab

Statistically Weighted Visualization Hierarchies

 
We are beginning to see an overload in the amount of information
packed into a given visualization. In many cases, it is no longer
possible to look at a single level of detail and obtain from it the
answers we are looking for. This problem is especially relevant to
datasets of high dimensionality. Not only does it become difficult to
home in on a particular dimension of possible interest, but it is
even more difficult to find and understand the relationships between
dimensions. In computer graphics, detail that varies over orders of
magnitude has traditionally been addressed by image hierarchies known
as mipmaps. These hierarchies are extremely fast and provide a
seamless transition from one level of detail to the next.
Unfortunately, this approach does not carry over to textures full of
scientific data: it introduces a series of errors which not only
misrepresent and corrupt the underlying data as visible to the
viewer, but also hide interesting features which warrant further
investigation. We propose an alternative hierarchical approach, using
statistical analysis to generate more representative macroscopic
views of extremely highly detailed data fields.
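
The following sketch illustrates the problem on a one-dimensional
toy signal: plain averaging (as in a mipmap level) smooths away a
spike, while keeping a few summary statistics per block preserves
evidence of it. The choice of statistics (mean, min, max, standard
deviation) is an assumption for illustration, not the method of the
poster.

```python
from statistics import mean, pstdev

def downsample(values, block, summarize):
    # Build one coarser level by summarizing fixed-size blocks.
    return [summarize(values[i:i + block]) for i in range(0, len(values), block)]

def block_stats(block):
    # Hypothetical per-block summary that keeps the spread, not just the mean.
    return {"mean": mean(block), "min": min(block),
            "max": max(block), "std": pstdev(block)}

data = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]
mip_level = downsample(data, 4, mean)          # averaging only
stat_level = downsample(data, 4, block_stats)  # statistics per block
```

In the averaged level the spike of 9.0 appears only as a slightly
raised mean, whereas the statistical level still records it as the
block maximum.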

 
3. Richard T. Carback III (Advisor: Alan Sherman), Cyber Defense Lab

Scantegrity: Post-Election Voter Verifiable Optical-Scan Voting


Scantegrity is a security enhancement for optical scan voting
systems. It is part of an emerging class of post-election
"end-to-end" (E2E) independent election verification systems that
permit each voter to verify that her ballot was correctly recorded
and counted. On the Scantegrity ballot, each candidate position is
paired with a confirmation code that is shown to the voter after she
marks her ballot. Election officials confirm receipt of the ballot by
posting the confirmation code that is adjacent to the marked
position. Scantegrity is the first voting system to offer strong
post-election independent verification without changing the way
voters mark optical scan ballots, and it complies with legislative
proposals requiring "unencrypted" paper audit records.

 
4. Sheetal Gupta, Ebiquity

Query Distribution Estimation and Predictive Caching in Mobile Ad Hoc
Networks


The problem of data management has been studied widely in the field
of mobile ad-hoc networks and pervasive computing.  The issue
addressed is that finding the data required by a device depends on a
chance encounter with the source of data. Most existing research has
focused on acquiring the required data by specifying the user or
application intentions. These approaches take the semantics of data
into account while caching data onto mobile devices from the wired
sources. We propose a scheme by which mobile devices proactively
increase the availability of data by pushing and caching the most
popular data in the network. It involves a local distributed
technique for estimating global query distribution in the network.
The mobile devices have a finite sized cache to store the pushed data
and use their estimation of queries for prioritizing the data to
cache. We implement this technique in the GloMoSim network simulator
and show that our scheme improves both data availability and
response latency.
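
A minimal sketch of the idea, under stated assumptions: each device
counts the queries it sees, averages its counts with those of any
device it meets, and keeps in its finite cache the items it
currently believes are most queried. The Device class, the pairwise
averaging rule, and the item names are hypothetical; the abstract
does not specify the estimation technique.

```python
from collections import Counter

class Device:
    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.estimate = Counter()  # item -> estimated query count
        self.cache = set()

    def observe_query(self, item):
        # Local evidence about what is being asked for.
        self.estimate[item] += 1

    def exchange(self, other):
        # On an encounter, both devices adopt the average of their estimates.
        merged = Counter()
        for item in set(self.estimate) | set(other.estimate):
            merged[item] = (self.estimate[item] + other.estimate[item]) / 2
        self.estimate = Counter(merged)
        other.estimate = Counter(merged)

    def offer(self, items):
        # Keep only the items with the highest estimated popularity.
        candidates = self.cache | set(items)
        ranked = sorted(candidates, key=lambda it: (-self.estimate[it], it))
        self.cache = set(ranked[:self.cache_size])

a, b = Device(cache_size=2), Device(cache_size=2)
for item in ["weather", "weather", "weather", "news"]:
    a.observe_query(item)
for item in ["maps", "maps"]:
    b.observe_query(item)
a.exchange(b)
b.offer(["weather", "news", "maps"])
```

After the exchange, device b caches "weather" even though it never
saw a query for it locally.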

 
PhD Students

 
5. Jesus J. Caban (Advisor: Rheingans), VANGOGH Lab

Generating and Visualizing Statistical Volume Models

 
Large digital repositories of volumetric data continue to raise
questions about how differences, relationships, variability, and data
uncertainty are best discovered and visualized.  Understanding the
structural and statistical properties of a collection of 3D volumes
is a difficult task due to the large amount of data involved. 
Comparing specific members of the group or visualizing each member
independently does not provide the means required to effectively
learn the statistical properties of a given population.

     We introduce statistical volumes and present a framework to
generate and visualize statistical volumetric models.  Our technique
loads a population of volumetric data, aligns them into a common
coordinate system, and uses a volumetric decomposition to generate a
hierarchical representation of the statistical volume.  The
hierarchical model effectively captures the statistical properties of
the input data by creating a set of probability density functions for
each voxel or region of interest.  Visualization techniques are then
used to show statistical properties, to illustrate structural
attributes, to generate new instances of the group, and to
effectively display characteristic regions of the collection under
consideration.
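
As a simplified illustration of a per-voxel statistical model, the
sketch below computes a mean and standard deviation for each voxel
across a population of already aligned volumes and then draws a new
instance from those distributions. Treating each voxel as an
independent Gaussian and storing volumes as flat lists are
assumptions for this example; the hierarchical decomposition
described above is not modeled.

```python
import random
from statistics import mean, pstdev

def voxel_statistics(volumes):
    # volumes: list of equally sized flat lists, assumed already aligned.
    # Returns one (mean, standard deviation) pair per voxel.
    return [(mean(vals), pstdev(vals)) for vals in zip(*volumes)]

def sample_instance(stats, rng):
    # Draw a new volume, voxel by voxel, from the per-voxel Gaussians.
    return [rng.gauss(mu, sigma) for mu, sigma in stats]

population = [[0.0, 10.0], [2.0, 10.0], [4.0, 10.0]]
stats = voxel_statistics(population)
new_volume = sample_instance(stats, random.Random(0))
```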

 
6. Lushan Han (Advisor: Tim Finin), Ebiquity Lab

Predicting Appropriate Semantic Web Terms from Words

 
The Semantic Web language RDF was designed to unambiguously define
and use ontologies to encode data and knowledge on the Web. Many
people find it difficult, however, to write complex RDF statements
and queries because doing so requires familiarity with the
appropriate ontologies and the terms they define. We describe a
system that suggests appropriate RDF terms given semantically related
English words and general domain and context information. We use the
Swoogle Semantic Web search engine to provide RDF term and namespace
statistics, the WordNet lexical ontology to find semantically
related words, and a naïve Bayes classifier to suggest terms. A
customized graph data structure of related namespaces is constructed
from Swoogle's database to speed up the classifier model learning and
prediction time.
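
The classification step can be sketched as a small naive Bayes model
that ranks RDF terms given English words. The training pairs below
are invented for illustration, and the sketch uses neither Swoogle
statistics nor WordNet; it only shows the general shape of the
technique.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    # examples: list of (words, rdf_term) pairs.
    term_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, term in examples:
        term_counts[term] += 1
        for w in words:
            word_counts[term][w] += 1
            vocab.add(w)
    return term_counts, word_counts, vocab

def suggest(model, words, top_k=3):
    # Rank terms by log P(term) + sum of log P(word | term),
    # with add-one smoothing over the vocabulary.
    term_counts, word_counts, vocab = model
    total = sum(term_counts.values())
    scores = {}
    for term, n in term_counts.items():
        denom = sum(word_counts[term].values()) + len(vocab)
        score = math.log(n / total)
        for w in words:
            if w in vocab:
                score += math.log((word_counts[term][w] + 1) / denom)
        scores[term] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical training data, not taken from Swoogle.
examples = [
    (["person", "human", "individual"], "foaf:Person"),
    (["person", "people"], "foaf:Person"),
    (["name", "called"], "foaf:name"),
    (["author", "writer", "creator"], "dc:creator"),
]
model = train(examples)
person_terms = suggest(model, ["person"])
writer_terms = suggest(model, ["writer"])
```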

 
7. Akshay Java (Advisor: Tim Finin)

Approximating the Community Structure of the Long Tail

 
Communities are central to online social media systems and detecting
their structure and membership is critical for many applications. The
large size of the underlying graphs makes community detection
algorithms very expensive. We describe an approach to reducing the
cost by estimating the community structure from only a small fraction
of the graph. Our approach is based on the assumption that large,
scale-free networks are often very sparse. Such networks consist of a
small but high-degree set of core nodes and a very large number of
sparsely connected peripheral nodes (Borgatti &
Everett 2000). The insight behind our technique is that the community
structure of the overall graph is very well represented in the core.
The community membership of the long tail can be approximated by
first using the subgraph of the small core region and then analyzing
the connections from the long tail to the core. A set of vertices can
constitute a community if they

 
8. Palanivel Kodeswaran, Ebiquity

Utilizing Semantic Policies for Managing BGP Route Dissemination

 
Policies in BGP are implemented as routing configurations that
determine how route information is shared among neighbors to control
traffic flows across networks. This process is generally
template-driven, device-centric, limited in its expressiveness,
time-consuming, and error-prone, which can lead to configurations
where policies are violated or where there are unintended
consequences that are difficult to detect and resolve. In this work,
we propose an alternate mechanism for policy-based networking that
relies on additional semantic information associated with routes,
expressed in an OWL ontology. Policies are expressed using SWRL to
provide fine-grained control whereby the routers can reason over
their routes and determine how they need to be exchanged. In this
paper, we focus on security-related BGP policies and show how our
framework can be used to implement them. Additional contextual
information such as affiliations and route restrictions is
incorporated into our policy specifications, which can then be
reasoned over to infer the correct configurations that need to be
applied, resulting in a process which is easy to deploy, manage, and
verify for consistency.

 
9. John Krautheim (Advisors: Dhananjay Phatak, Alan T. Sherman),
Cyber Defense Lab

Identifying Trusted Virtual Machines

 
Software operating on a physical computer derives its identity from
the underlying hardware components. When the same software is
operating within a virtualized environment, the identity loses its
binding to the hardware due to the interaction of the virtual machine
monitor. We show that once software has been virtualized, it loses
its unique identity, which exposes it to a reincarnation attack that
allows licensed software and digital rights managed content
protections to be subverted. We propose to develop a mechanism to
uniquely identify
instances of software running within virtualized environments.

     A typical mechanism to identify a platform configuration is to
utilize a Trusted Platform Module (TPM) to provide a unique identity.
A virtual machine (VM) operating on that same platform no longer has
the ability to uniquely identify itself, because the virtual machine
monitor, or hypervisor, adds a layer of uncertainty to the trust
layer of the computing platform. The hypervisor operates below the
virtual machine at the highest privilege in the system; therefore, it
has the ability to subvert the normal protection mechanisms of
typical operating systems and application software running within the
virtual machine.

     This project proposes to develop an architecture and protocol
for enabling the identity normally bound to the hardware to be
extended to the virtual machine. Recent advances in VM technology,
including Intel's Virtualization Technology (VT) and AMD's Pacifica,
have moved virtualization functions into hardware. Additionally,
Intel Trusted Execution Technology (TXT) provides mechanisms for
verifiably reporting platform identity and configuration. By
leveraging Intel VT and TXT, the identity of the hardware can be
lifted to the VM presentation layer through a virtualized trusted
platform module. The identity can then be used to determine the trust
level of the virtual machine through remote attestation of the
platform configuration to a policy decision point or third party
authenticator.

 
10. Wenjia Li, Ebiquity

Gossip-Based Outlier Detection for Mobile Ad Hoc Networks

 
It is well understood that Mobile Ad Hoc Networks (MANETs) are
extremely susceptible to a variety of attacks. Many security schemes
have been proposed that depend on identifying nodes that are
exhibiting malicious behavior such as packet dropping, packet
modification, and packet misrouting.  We argue that in general, this
problem can be viewed as an instance of detecting nodes whose
behavior is an outlier when compared to others. In this paper, we
propose a gossip-based outlier detection algorithm for MANETs. The
algorithm leads to a common outlier view amongst distributed nodes
with a limited communication overhead. Simulation results demonstrate
that the proposed algorithm is efficient and accurate.
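
To illustrate how gossip can lead to a common outlier view, the
sketch below has each node average its locally observed packet-drop
rates with a randomly chosen peer until the views agree, then flags
any node whose rate is far above the median. The pairwise averaging
rule, the median-based test, the threshold, and the sample rates are
all assumptions for this example and are not taken from the poster.

```python
import random
from statistics import median

def gossip_round(views, rng):
    # Each node averages its whole view with one random peer.
    nodes = sorted(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        merged = {k: (views[node][k] + views[peer][k]) / 2 for k in views[node]}
        views[node] = dict(merged)
        views[peer] = dict(merged)

def outliers(view, threshold):
    # Flag nodes whose observed rate exceeds the median by more than threshold.
    med = median(view.values())
    return {n for n, rate in view.items() if rate - med > threshold}

# Hypothetical drop rates: views[observer][observed]. Node D misreports itself.
views = {
    "A": {"A": 0.02, "B": 0.05, "C": 0.03, "D": 0.70},
    "B": {"A": 0.04, "B": 0.01, "C": 0.06, "D": 0.60},
    "C": {"A": 0.03, "B": 0.04, "C": 0.02, "D": 0.80},
    "D": {"A": 0.05, "B": 0.02, "C": 0.05, "D": 0.10},
}
rng = random.Random(0)
for _ in range(50):
    gossip_round(views, rng)
flagged = {observer: outliers(view, 0.3) for observer, view in views.items()}
```

Each exchange involves only two nodes, which is what keeps the
communication overhead of this style of protocol limited.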

 
11. Justin Martineau (Advisor: Tim Finin), Ebiquity

Blog Link Classification


Blog links raise three key questions: Why did the author make the
link, what exactly is he pointing at, and what does he feel about it?
In response to these questions we introduce a link model with three
fundamental descriptive dimensions where each dimension is designed
to answer one question. We believe the answers to these questions can
be utilized to improve search engine results for blogs. While proving
this is outside the scope of this paper, we do prove that knowing the
rhetorical role of a link helps determine what the author was
pointing at and how he feels about it.

 
12. Don Miner (Advisor: Marie desJardins), MAPLE

Learning Abstract Rules for Swarm Systems

 
Rule abstraction is an intuitive new tool that we propose for
implementing swarm systems. The methods presented in this poster
encourage a new paradigm for designing swarm applications: engineers
can interact with a swarm at the abstract (swarm) level instead of
the individual (agent) level. This is made possible by modeling and
learning how particular swarm-level properties arise from low-level
agent behaviors. We have developed a procedure for building abstract
rules and discuss how they can be used. We also provide experimental
results showing that abstract rules can be learned by observation.

     The contribution of this work is the method of using rule
abstraction and rule hierarchies to intuitively control groups of
agents at the swarm level. We discuss how the connections between
abstract rules and low-level rules can be defined and learned. Also,
since rule abstraction is a feature of our Swarm Application
Framework, we give background on this development platform. Finally,
we describe a sample application that demonstrates the use of abstract
rules.

 
13. Michael Oehler (Advisor: Dhananjay S. Phatak), Cyber Defense Lab

Secret Key Authentication Using a Context Free Representation for
Secure VoIP Communication


This research defines a context free representation, one that is
independent of a spoken language, to authenticate a Diffie-Hellman
negotiated secret. This context free approach authenticates the
negotiated key by presenting an image in the VoIP user agent; the
callers simply describe what they see. If they agree, the key is
authenticated and the secure media session continues. The strength of
the approach lies in the vocal recognition of the callers and their
ability to confer about the image displayed by their system. The necessary
degree of visual recognition is achieved by using basic shapes, color
and count. People, regardless of language, age, and culture will have
little difficulty identifying these images and can communicate them
with little effort. We believe that this approach reverses the
current trend in security to divest users from the underlying
cryptographic principles supporting secure systems by abstracting
these principles to a comprehensible form.  This research
demonstrates that the human factor can play a pivotal role in
establishing a secure link and that a single system can be employed
by people speaking many different languages. In this sense, the
approach improves VoIP security, and does so without a significant
infrastructure for authentication. Our approach descends from the
English-specific approach found in ZRTP, and could be incorporated
into ZRTP. Integration into other VoIP key agreement systems is also
possible. We have named this approach the Short Authentication
SymbolS VisuallY (SASSY).
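
One way such an image could be derived is sketched below: both user
agents hash the negotiated secret and map the digest to a shape, a
color, and a count, which the callers then describe to each other.
The symbol tables, the hash construction, and the function name are
hypothetical; the abstract does not give the actual SASSY encoding.

```python
import hashlib

# Assumed symbol alphabets; the poster's real set is not specified.
SHAPES = ["circle", "square", "triangle", "star"]
COLORS = ["red", "green", "blue", "yellow"]
MAX_COUNT = 4

def visual_sas(shared_secret: bytes):
    # Both endpoints compute the same digest from the same secret,
    # so they display the same image only if their keys match.
    digest = hashlib.sha256(b"visual-sas" + shared_secret).digest()
    shape = SHAPES[digest[0] % len(SHAPES)]
    color = COLORS[digest[1] % len(COLORS)]
    count = digest[2] % MAX_COUNT + 1
    return count, color, shape

caller_view = visual_sas(b"example negotiated secret")
callee_view = visual_sas(b"example negotiated secret")
```

With these small tables there are only 64 distinct images, so a
practical scheme would need a larger symbol space to make a chance
match unlikely.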

 
14. Randy Schauer, Ebiquity

A Probabilistic Approach to Distributed System Management

 
The management of large-scale distributed systems is a critical
consideration when focusing on system reliability. As the number of
commodity components within clusters continues to grow, it becomes
increasingly difficult to track the multitude of parameters required
regularly to ensure optimal performance from the system. In this
paper, we discuss a distributed multi-agent system that utilizes
statistical inference to provide the most effective means of managing
these parameters. This solution uses Markov Logic Networks as the
inference technique to validate configurations and operating
environments. We showcase two preliminary examples, permission
validation and temperature monitoring, of how this approach resolves
differences between various compute nodes.

 
15. Zareen Saba Syed (Advisor: Tim Finin), Ebiquity

Wikipedia as an Ontology for Describing Documents

 
Identifying topics and concepts associated with a set of documents is
a task common to many applications. It can help in the annotation and
categorization of documents and be used to model a person's current
interests for improving search results, business intelligence or
selecting appropriate advertisements.  One approach is to associate a
document with a set of topics selected from a fixed ontology or
vocabulary of terms. We have investigated using Wikipedia's articles
and associated pages as a topic ontology for this purpose. The
benefits of this approach are that the ontology terms are developed
through a social process, maintained and kept current by the
Wikipedia community, represent a consensus view, and have meaning
that can be understood simply by reading the associated Wikipedia
page.  We use Wikipedia articles and the category and article link
graphs to predict concepts common to a set of documents. We describe
several algorithms that we implemented and evaluated to aggregate and
refine results, including the use of spreading activation to select
the most appropriate terms.  While the Wikipedia category graph can
be used to predict generalized concepts, the article links graph
helps by predicting more specific concepts and concepts not in the
category hierarchy. Our experiments show that it is possible to
suggest new category concepts identified as a union of pages from the
page link graph. Such predicted concepts can be used to define new
categories or sub-categories within Wikipedia.
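
Spreading activation can be sketched as follows: activation starts
at the articles matched to the documents and flows along graph links
for a fixed number of pulses, decaying at each step, so that
categories reachable from many seed articles score highest. The toy
graph, the decay factor, and the equal split among neighbors are
assumptions for this example and do not reproduce the algorithms
evaluated in the poster.

```python
def spread_activation(graph, seeds, decay=0.5, pulses=2):
    # graph: node -> list of linked nodes; seeds: node -> initial activation.
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(pulses):
        next_frontier = {}
        for node, value in frontier.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue
            # Each node passes a decayed, equal share to every neighbor.
            share = value * decay / len(neighbors)
            for nb in neighbors:
                next_frontier[nb] = next_frontier.get(nb, 0.0) + share
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier
    return activation

# Hypothetical article-to-category links, not real Wikipedia data.
graph = {
    "article:merge_sort": ["cat:sorting"],
    "article:quicksort": ["cat:sorting"],
    "article:binary_search": ["cat:search"],
    "cat:sorting": ["cat:algorithms"],
    "cat:search": ["cat:algorithms"],
}
seeds = {"article:merge_sort": 1.0, "article:quicksort": 1.0,
         "article:binary_search": 1.0}
activation = spread_activation(graph, seeds)
```

Here the category shared by two seed articles ends up with the
highest activation among the non-seed nodes.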

 