talk: Enhancing System Security & Privacy with Program Analysis, 12p Tue 3/31, UMBC

Enhancing System Security and Privacy with Program Analysis

Yinzhi Cao

Columbia University

12:00-1:00pm Tuesday, 31 March 2015, ITE 325b, UMBC

Cyber security and privacy have drawn the attention of the general public these days. Melissa Hathaway, who advised both President Obama and President Bush, estimated in a report that governments and consumers lose $125 billion annually to cyber-attacks, including losses in tax revenue. In this talk, from the perspective of program analysis, I will discuss the security and privacy of two important computer systems: the Web browser and the Android system. In the first part, I will introduce how to prevent and detect drive-by download attacks, which penetrate the boundary of a browser principal. In particular, I will present JShield, a vulnerability-based detection engine that is more robust to obfuscated drive-by download attacks when compared to various anti-virus software and the most recent research papers. In the second part, I will introduce EdgeMiner, the first automatic tool that creates summaries of the Android framework in the form of callback and registration pairs. With these summaries, existing static analysis systems can correctly construct a control flow graph that includes the hidden control flow dependencies introduced by callback methods.
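To illustrate how callback and registration pairs can expose hidden control flow, here is a minimal sketch. It is not EdgeMiner's implementation: it works on a simplified call graph rather than a full control flow graph, and the `SUMMARIES` table, the `add_callback_edges` helper, and the method-name strings are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Hypothetical summary table: registration method -> callback that the
# framework later invokes on the application's behalf.
SUMMARIES = {
    "View.setOnClickListener": "OnClickListener.onClick",
    "Thread.start": "Thread.run",
}


def add_callback_edges(call_graph, summaries):
    """Return a copy of call_graph with implicit callback edges added.

    call_graph maps a caller name to the set of methods it calls directly.
    Whenever a caller invokes a registration method listed in summaries,
    an extra edge from that caller to the paired callback is added.
    """
    augmented = defaultdict(set)
    for caller, callees in call_graph.items():
        for callee in callees:
            augmented[caller].add(callee)
            callback = summaries.get(callee)
            if callback is not None:
                augmented[caller].add(callback)
    return dict(augmented)


graph = {"MainActivity.onCreate": {"View.setOnClickListener", "Log.d"}}
augmented_graph = add_callback_edges(graph, SUMMARIES)
```

Without the summary table, an analysis of `MainActivity.onCreate` in this toy graph would never reach `OnClickListener.onClick`, because no application code calls it directly.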

Yinzhi Cao is a postdoctoral scientist at Columbia University. He earned his PhD in computer science at Northwestern University. Before that, he obtained his B.E. degree in electronics engineering at Tsinghua University in China. His research mainly focuses on the security and privacy of the web, smartphones, and machine learning. He has published more than ten papers at various security conferences, such as Oakland, NDSS, ACSAC and DSN. His JShield system has been adopted by Huawei, the world’s largest telecommunication company. In the past, he served as a program committee member for IEEE CNS’14 and as web chair for AsiaCCS SESP’13. Previously, he also conducted research at SRI International and UC Santa Barbara as a summer intern.

talk: Blind Hashing; securing passwords against offline attack, 11a Fri 3/27 MP101 UMBC


UMBC Cyber Defense Lab

Blind Hashing; a new way to secure
passwords against offline attack

Jeremy Spilman

Founder/CTO of TapLink

11-12 Friday 27 March 2015, M/P 101, UMBC

Industry best practice is to secure passwords using a tunable hashing algorithm: pick the right hashing algorithm, tune its cost factors so it runs slowly and makes optimal use of your hardware, and it’s possible to protect very strong passwords from being cracked. However, when average password strength and login latency requirements face off against botnets and GPU-powered dictionary attacks, the vast majority of passwords are easily cracked. Blind hashing entangles password hashes with a massive pool of random data, so large it cannot be stolen over the network. A simple protocol allows any number of sites to share a centralized petabyte-scale data pool, amortizing the cost for defenders while protecting low-entropy passwords with minimal run-time cost. Blind hashing can also be used as a general-purpose PBKDF to protect against brute-force attacks, and it provides the opportunity to add server-based access policies and revocability to the key derivation process. Following his talk, Jeremy will be happy to discuss potential research opportunities with the company for students interested in developing new implementations of blind hashing for password-based authentication and encryption services.
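The idea of entangling a password hash with a large data pool can be sketched as follows. This is an illustration of the general concept only, not TapLink's protocol or algorithm: the in-memory pool, the block size, the lookup count, the PBKDF2 iteration count, and the function names are all assumptions chosen for the example, and a real pool would be far too large to hold in memory or copy over a network.

```python
import hashlib
import hmac
import os

BLOCK_SIZE = 64    # bytes read per pool lookup (illustrative value)
NUM_LOOKUPS = 16   # pool blocks mixed into each hash (illustrative value)


def make_pool(size_bytes: int) -> bytes:
    """Create a small random pool standing in for the huge shared data pool."""
    return os.urandom(size_bytes)


def blind_hash(password: str, salt: bytes, pool: bytes) -> bytes:
    """Hash a password, then mix in pool blocks selected by that hash."""
    # Step 1: an ordinary salted, iterated hash of the password.
    seed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)

    # Step 2: use the seed to pick pseudo-random blocks from the pool and
    # fold them into the final digest. Without the pool contents, the final
    # digest cannot be recomputed, so a stolen hash database alone does not
    # let an attacker test password guesses offline.
    num_blocks = len(pool) // BLOCK_SIZE
    mixer = hmac.new(seed, digestmod=hashlib.sha256)
    for i in range(NUM_LOOKUPS):
        digest = hmac.new(seed, i.to_bytes(4, "big"), hashlib.sha256).digest()
        index = int.from_bytes(digest[:8], "big") % num_blocks
        mixer.update(pool[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE])
    return mixer.digest()


demo_pool = make_pool(1 << 16)
demo_salt = os.urandom(16)
stored = blind_hash("correct horse", demo_salt, demo_pool)
```

In this sketch a login check would recompute `blind_hash` with the same salt and pool and compare the result to `stored` using `hmac.compare_digest`.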

Jeremy Spilman is the Founder and CTO of TapLink, a startup company that is developing systems using its patented Blind Hashing technique, which can completely protect passwords against offline attack, even if the password database is stolen. He was a double major in Computer Science and Economics at Brandeis University.

talk: Fei Liu (CMU) Summarizing Information in Big Data, 12p Fri 3/27

Summarizing Information in Big Data: Algorithms and Applications

Dr. Fei Liu

School of Computer Science
Carnegie Mellon University

12:00p Friday, 27 March 2015, ITE 325b

Information floods the lives of modern people, and we find it overwhelming. Summarization systems that identify salient pieces of information and present them concisely can help. In this talk, I will discuss both algorithmic and application perspectives of summarization. Algorithm-wise, I will describe keyword extraction, sentence extraction, and summary generation, including a range of techniques from information extraction to semantic representation of data sources; application-wise, I focus on summarizing human conversations, social media content, and news articles. The data sources span low-quality speech recognizer outputs and social media chats to high-quality content produced by professional writers. A special focus of my work is exploring multiple information sources. In addition to better integration across sources, this allows abstraction to shared research challenges for broader impact. Finally, I try to identify the missing links in cross-genre summarization studies and discuss future research directions.

Dr. Fei Liu is a postdoctoral fellow at Carnegie Mellon University and a member of Noah’s ARK. Fei’s research interests are in the areas of natural language processing, machine learning, and data mining, with special emphasis on automatic summarization and social media. From 2011 to 2013, Fei worked as a Senior Research Scientist at Bosch Research, Palo Alto, California, one of the largest German companies providing intelligent car systems and home appliances. Fei received her Ph.D. in Computer Science from the University of Texas at Dallas in 2011, supported by the Erik Jonsson Distinguished Research Fellowship. Prior to that, she obtained her Bachelor’s and Master’s degrees in Computer Science from Fudan University, Shanghai, China. Fei has published over twenty peer-reviewed articles, and she serves as a referee for leading journals and conferences.

Hosts: Nilanjan Banerjee and Mohamed Younis

talk: Large-Scale Measurement of Vulnerabilities and Design of Usable New Systems, Noon 3/23, ITE325b


Computer Science and Electrical Engineering
University of Maryland, Baltimore County

Towards Large-Scale Measurement of Vulnerabilities
and Design of Usable New Systems

Prof. Chuan Yue
University of Colorado Colorado Springs

12:00-1:00 Monday, 23 March 2015, ITE325b, UMBC

Security and privacy vulnerabilities are pervasive in computer and network systems. In my research group, we aim to accurately measure and analyze the vulnerabilities of Web, Cloud, and Mobile systems on a large scale; we also aim to design usable new systems that provide better security and privacy protection to millions of users. In this talk, I will first present our research on analyzing the vulnerabilities of popular Web browsers’ built-in password managers and some third-party browser-and-cloud-based password managers. Next, I will present a framework for automatic detection of information leakage vulnerabilities in JavaScript-based browser extensions, including password managers. I will explain why it is very challenging to accurately and automatically analyze JavaScript-based browser extensions, justify why our combined static and dynamic approach is practical and appropriate, and further discuss how we may increase the capabilities of this framework to automatically measure and analyze JavaScript-related security and privacy vulnerabilities on a large scale. Finally, I will discuss some of our current and future projects on security and privacy research and education; for example, one project measures users’ susceptibility to sophisticated and highly insidious phishing attacks.

Chuan Yue is an Assistant Professor of Computer Science at the University of Colorado Colorado Springs. His current research focuses on Web, Cloud, and Mobile Systems Security and Privacy. He received his B.E. and M.E. degrees in Computer Science from Xidian University, China, in 1996 and 1999, respectively, and his Ph.D. in Computer Science from the College of William and Mary in 2010. He worked as a Member of Technical Staff at Bell Labs China, Lucent Technologies for four years from 1999 to 2003, mainly on the design and development of the Web-based Distributed Service Management System for Intelligent Network.

For more information and directions: http://bit.ly/UMBCtalks.

Rick Forno discusses cyber warfare in The Diplomatic Courier


CSEE’s Dr. Rick Forno discussed cyber warfare in Ash Hunt’s latest policy paper ‘Cyber Quantifiable Restrictions: The Requirements to Generate Agreed Restrictions on the Use of Cyber Capabilities’ appearing in The Diplomatic Courier. Among other things, Hunt attempts to show that agreed restrictions should not blanket the use of cyber capabilities, but rather the unacceptable use of a range of capabilities that could be used to harm human life.

Recently, it has become apparent that “we’re in a [cyber] arms race” in a largely unregulated domain: the cyber wild west. With the increased diffusion of technology, nations have begun amassing offensive cyber capabilities: utilizing zero-day exploits, distributed denial of service (DDoS) attacks, and weaponized malware technology. Already, “the U.S. has poured billions of dollars into an electronic arsenal,” whilst the “stockpile of exploits runs into the thousands, aimed at every conceivable device.” This exponential growth of cyber arms is particularly dangerous considering the lack of rules and conventions governing the fifth arena of warfare. Dr. Richard Forno from the University of Maryland concedes, “there is no international agreement over what level of cyber warfare is acceptable.” He further recognizes that national systems such as power grids, water treatment plants and medical facilities “do not have adequate protection from hackers.” Clearly, “principles and agreements on cyber warfare must designate sensitive infrastructure as red lines.” It is necessary to afford our critical organizations the same level of protection from cyber hostility as we do from the multitude of other tangible threats.

Source: The Diplomatic Courier Volume 9, Issue 1, January/February 2015

talk: Topic Modeling with Structured Priors for Text-Driven Science


Topic Modeling with Structured Priors for Text-Driven Science

Michael Paul, JHU

12:00pm – 1:00pm, Monday, 2 March 2015, ITE 325

Many scientific disciplines are being revolutionized by the explosion of public data on the web and social media, particularly in health and social sciences. For instance, by analyzing social media messages, we can instantly measure public opinion, understand population behaviors, and monitor events such as disease outbreaks and natural disasters. Taking advantage of these data sources requires tools that can make sense of massive amounts of unstructured and unlabeled text. Topic models, statistical models that describe low-dimensional representations of data, can uncover interesting latent structure in large text datasets and are popular tools for automatically identifying prominent themes in text. However, to be useful in scientific analyses, topic models must learn interpretable patterns that accurately correspond to real-world concepts of interest.

In this talk, I will introduce Sprite, a family of topic models that can encode additional structures such as hierarchies, factorizations, and correlations, and can incorporate supervision and domain knowledge. Sprite extends standard topic models by formulating the Bayesian priors over parameters as functions of underlying components, which can be constrained in various ways to induce different structures. This creates a unifying representation that generalizes several existing topic models, while creating a powerful framework for building new models. I will describe a few specific instantiations of Sprite and show how these models can be used in various scientific applications, including extracting self-reported information about drugs from web forums, analyzing healthcare quality in online reviews, and summarizing public opinion in social media on issues such as gun control.
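As a rough illustration of what "priors as functions of underlying components" can mean, the sketch below computes Dirichlet prior parameters for one topic as an exponentiated weighted sum of shared component vectors. The log-linear form, the function name `structured_prior`, and the toy numbers are assumptions made for this example; the abstract does not specify the exact parameterization used in Sprite.

```python
import math


def structured_prior(bias, components, weights):
    """Return positive prior parameters over the vocabulary for one topic.

    bias:       per-word baseline scores (length V)
    components: list of component vectors, each of length V
    weights:    one weight per component for this topic; constraining these
                weights (e.g. forcing most to zero) is how structure such as
                a hierarchy or factorization could be induced
    """
    prior = []
    for v in range(len(bias)):
        score = bias[v] + sum(w * comp[v] for w, comp in zip(weights, components))
        # exp keeps every parameter positive, as a Dirichlet prior requires
        prior.append(math.exp(score))
    return prior


# Toy example: two components over a two-word vocabulary; this topic uses
# only the first component, so only the first word's prior mass is raised.
example_prior = structured_prior(
    bias=[0.0, 0.0],
    components=[[1.0, 0.0], [0.0, 1.0]],
    weights=[1.0, 0.0],
)
```

Different topics sharing the same components but using different weights would then have related, rather than independent, priors.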

Michael Paul is a PhD candidate in Computer Science at Johns Hopkins University. He earned an M.S.E. in CS from Johns Hopkins University in 2012 and a B.S. in CS from the University of Illinois at Urbana-Champaign in 2009. He has received PhD fellowships from Microsoft Research, the National Science Foundation, and the Johns Hopkins University Whiting School of Engineering. His research focuses on exploratory machine learning and natural language processing for the web and social media, with applications to computational epidemiology and public health informatics.

– more information and directions: http://bit.ly/UMBCtalks

Two technical talks by Amazon senior staff, 4-6:30pm Tue 3/3

Senior Amazon staff members will give two technical talks next week on Tuesday, March 3, in the UC Ballroom on topics of great practical interest and utility.

  • Lydia Fitzpatrick, Senior Technical Program Manager for Amazon Mobile Business, will give a talk on “Web Performance Optimization” from 4:00pm to 5:00pm.
  • Leo Zhadanovsky, Senior Solutions Architect for Amazon Web Services, will present an “Introduction to Amazon Web Services (AWS)” from 5:30pm to 6:30pm. The talk will introduce cloud computing and discuss the various Networking, Compute, Database, Storage, Application, Deployment and Management services that AWS offers. It will demonstrate how to launch a full three-tier LAMP stack in minutes, as well as how to set up a simple web server on AWS. The presentation will also discuss several use cases, demonstrating how customers such as Enterprises, Startups, and Government Agencies are using AWS to power their computing needs.

The talks will be preceded and followed by an open networking opportunity with Amazon Human Resources representatives. Amazon is interested in students for internships and full-time positions who are majoring in Information Systems, Business Technology Administration, Computer Engineering, Computer Science, and Cybersecurity.

PhD proposal: User Identification in Wireless Networks

Ph.D. Dissertation Proposal

User Identification in Wireless Networks

Christopher Swartz

9:00-11:00pm Friday, 27 February 2015, ITE 325B

Wireless communication using the 802.11 specifications is almost ubiquitous in daily life through an increasing variety of platforms. Traditional identification and authentication mechanisms employed for wireless communication commonly mimic physically connected devices and do not account for the broadcast nature of the medium. Both stationary and mobile devices that users interact with are regularly authenticated using a passphrase, pre-shared key, or an authentication server. Current research requires unfettered access to the user’s platform or information that is not normally volunteered.

We propose a mechanism to verify and validate the identity of 802.11 device users by applying machine learning algorithms. Existing work substantiates the application of machine learning for device identification using Commercial Off-The-Shelf (COTS) hardware and algorithms. This research seeks to refine and investigate features relevant to identifying users. The approach is segmented into three main areas: a data ingest platform, processing, and classification.

Initial research showed that we can properly classify target devices with high precision, recall, and ROC using a sufficiently large real-world data set and a limited set of features. The primary contribution of this work is exploring the development of user identification through data observation. The objective is a combination of identifying new features, creating an online system, and limiting user interaction. We will create a prototype system and test the effectiveness and accuracy of its ability to properly identify users.

Committee: Drs. Joshi (Chair/Advisor), Nicholas, Younis, Finin, Pearce, Banerjee

PhD proposal: Scalable Storage System for Big Scientific Data

Ph.D. Dissertation Proposal

MLVFS: A Scalable Storage System For Managing Big Scientific Data

Navid Golpayegani

3:00-5:00pm Tuesday 24 February 2015, ITE 346

Managing petabytes or exabytes of data with hundreds of millions to billions of files is a necessary first step towards an effective big data computing and collaboration environment for distributed systems. Current file system designs have focused on providing better and faster data distribution. Managing the directory structure for data discovery becomes an essential element of the scalability problems for big data systems. Recent designs are addressing the challenge of exponential growth of files. Still largely unexplored is research on the organizational aspect of managing big data systems with hundreds of millions of files. Most file systems organize data into static directory structures, making data discovery hard and slow when dealing with large data sets.

This thesis will propose a unique Multiview Lightweight Virtual File System (MLVFS) design to primarily deal with the data organizational management problem in big data file systems. MLVFS is capable of the dynamic generation of directory structures to create multiple views of the same data set. With multiple views, the storage system is capable of organizing available data sets by differing criteria such as location or date without the need to replicate data or use symbolic links. In addition, MLVFS addresses scalability issues associated with the growth of the stored files by removing the internal metadata system and replacing it with generally available external metadata information (e.g., database servers, project compute servers, and remote repositories). This thesis, moreover, proposes to add plug-in capabilities not normally found in file systems that make this system highly flexible in terms of specifying sources of metadata information, dynamic file format streaming, and other file handling features.
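The notion of generating several directory views of one data set from external metadata can be sketched in a few lines. This is an illustration only, not MLVFS code: the record fields (`path`, `location`, `date`), the sample values, and the `build_view` helper are hypothetical stand-ins for whatever an external metadata source would supply.

```python
def build_view(records, keys):
    """Group physical file paths under virtual directories.

    records: metadata rows, each a dict with a physical "path" plus
             descriptive fields (assumed to come from an external source)
    keys:    the fields, in order, that define the virtual directory levels
    Returns a dict mapping a virtual directory name to the physical paths
    it exposes. No file is copied or linked; only the mapping changes.
    """
    view = {}
    for rec in records:
        virtual_dir = "/".join(str(rec[k]) for k in keys)
        view.setdefault(virtual_dir, []).append(rec["path"])
    return view


# Hypothetical metadata rows for three stored files.
records = [
    {"path": "/store/a1.hdf", "location": "arctic", "date": "2015-02-01"},
    {"path": "/store/b7.hdf", "location": "arctic", "date": "2015-02-02"},
    {"path": "/store/c3.hdf", "location": "sahara", "date": "2015-02-01"},
]

by_location = build_view(records, ["location"])
by_date_then_location = build_view(records, ["date", "location"])
```

Both views are computed on demand from the same three records, which is the property the proposal describes as organizing data by differing criteria without replication or symbolic links.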

The performance of MLVFS will be tested in both simulated and real-world environments. MLVFS will be installed on the BlueWave cluster at UMBC for simulated load testing to measure the performance for various loads. Simultaneously, a stable version of MLVFS will run in real-world production environments such as those of the NASA MODIS instrument processing system (MODAPS). The MODAPS system will be used to show examples of real-world use cases for MLVFS. Additionally, other systems will be explored for real-world use of MLVFS, such as at NIST for research into Biomedical Image Stitching.

Committee: Drs. Milton Halem (Chair, Advisor), Yelena Yesha, Charles Nicholas, John Dorband, Daniel Duffy

talk: Understanding Social Spammers, Noon Tue 2/24, ITE325

Understanding Social Spammers: A Data Mining Perspective
Xia “Ben” Hu

Computer Science and Engineering
Arizona State University

12:00-1:00 Tuesday, 24 February 2015

With the growing popularity of social media, social spamming has become rampant on all platforms. Many (fake) accounts, known as social spammers, are employed to overwhelm legitimate users with unwanted information. Social spammers are unique due to their coordinated efforts to launch attacks such as distributing ads to generate sales, disseminating pornography and viruses, executing phishing attacks, or simply sabotaging a system’s reputation. In this talk, I will introduce a novel and systematic analysis of social spammers from a data mining perspective to tackle the challenges raised by social media data for spammer detection. Specifically, I will formally define the problem of social spammer detection and discuss the unique properties of social media data that make this problem challenging. By analyzing the two most important types of information, network and content information, I will introduce a unified framework by collectively using heterogeneous information in social media. To tackle the labeling bottleneck in social media, I will show how we can take advantage of the existing information about spam in email, SMS, and on the web for spammer detection in microblogging. I will also present a solution for efficient online processing to handle fast-evolving social spammers.

Xia Hu is a Ph.D. candidate in Computer Science and Engineering at Arizona State University, supervised by Professor Huan Liu. His research interests include data mining, machine learning, and social network analysis. As a result of his research work, he has published nearly 40 papers in several major academic venues, including WWW, SIGIR, KDD, WSDM, IJCAI, AAAI, CIKM, and SDM. One of his papers was selected for the Best Paper Shortlist in WSDM’13. He is the recipient of the IEEE “Atluri Award” Scholarship, the 2014 ASU President’s Award for Innovation, and the Faculty Emeriti Fellowship. He has served on program committees for several major conferences such as WWW, IJCAI, SDM and ICWSM, and reviewed for multiple journals, including IEEE TKDE, ACM TOIS and Neurocomputing. His research attracts a wide range of external government and industry sponsors, including NSF, ONR, AFOSR, Yahoo!, and Microsoft.

– more information and directions: http://bit.ly/UMBCtalks