PhD Defense: The Lightweight Virtual File System

Dissertation Defense

The Lightweight Virtual File System

Navid Golpayegani

10:00-12:00 Thursday, 20 July 2017, ITE 325, UMBC

 

A data center today is responsible for safely managing big data volumes and balancing the complex needs of data producers and consumers. This balance often involves reconciling the consumers’ need for easy access and rapid retrieval, in the ways they desire, with the producers’ need for long-term availability, reliability, and expandability. The long-term, continuous support of data storage adds another layer of complexity for the file system. As storage architectures and big data volumes evolve, existing file systems focus primarily on performance, while less attention is paid to the long-term servicing needs of their clients described above.

I have developed the Lightweight Virtual File System (LVFS) to address these problems through the unique conceptual approach of separating the most common tasks involved in a file system, namely storing data, locating data, and organizing data. Standard file systems are developed as single monolithic systems performing all three tasks. LVFS replaces this monolithic design with an architecture that enables the dynamic combination of different algorithms for each of those tasks. Using this approach, LVFS is capable of constructing a storage system that allows for ready availability, reliability, expandability, and long-term support while simultaneously assuring the performance of a stable system customizable to meet the needs of data consumers.
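The separation of storing, locating, and organizing can be illustrated with a minimal Python sketch. This is not LVFS's actual design or API: every class and method name below (`MemoryStore`, `PrefixLocator`, `ByExtensionOrganizer`, `VirtualFileSystem`) is hypothetical and chosen only to show how three independently swappable components might be combined.

```python
from typing import Dict, List, Protocol


class Store(Protocol):
    """Stores and retrieves raw bytes under an opaque key (hypothetical interface)."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class Locator(Protocol):
    """Maps a logical file name to the key under which it is stored."""
    def locate(self, name: str) -> str: ...


class Organizer(Protocol):
    """Decides the virtual path under which a file is presented to consumers."""
    def virtual_path(self, name: str) -> str: ...


class MemoryStore:
    """Toy storage backend: keeps everything in a dictionary."""
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class PrefixLocator:
    """Toy locator: stores every file under a fixed prefix."""
    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def locate(self, name: str) -> str:
        return f"{self.prefix}/{name}"


class ByExtensionOrganizer:
    """Toy organizer: groups files into virtual directories by extension."""
    def virtual_path(self, name: str) -> str:
        ext = name.rsplit(".", 1)[-1] if "." in name else "other"
        return f"/{ext}/{name}"


class VirtualFileSystem:
    """Combines one component per task; any of the three can be swapped out."""
    def __init__(self, store: Store, locator: Locator, organizer: Organizer) -> None:
        self.store = store
        self.locator = locator
        self.organizer = organizer
        self._names: List[str] = []

    def write(self, name: str, data: bytes) -> None:
        self.store.put(self.locator.locate(name), data)
        if name not in self._names:
            self._names.append(name)

    def read(self, name: str) -> bytes:
        return self.store.get(self.locator.locate(name))

    def listing(self) -> List[str]:
        return sorted(self.organizer.virtual_path(n) for n in self._names)


fs = VirtualFileSystem(MemoryStore(), PrefixLocator("archive"), ByExtensionOrganizer())
fs.write("granule.hdf", b"abc")
print(fs.read("granule.hdf"), fs.listing())
```

In this sketch, replacing `MemoryStore` with a different backend, or `ByExtensionOrganizer` with a date-based organizer, would not require changes to the other two components.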

After successful development and testing to allow for merging decades-old storage architectures with new and incompatible ones, such as the HGST Active Archive System, NASA Goddard Space Flight Center’s Terrestrial Information Systems Laboratory adopted LVFS for their production environment to create a single, integrated storage system without any software modifications. UMBC’s Center for Hybrid Multicore Productivity Research deployed an instance on the IBM iDataPlex ‘BlueWave’ cluster to utilize Seagate’s Active Drive systems as a storage and on-disk compute platform. With LVFS, we show that we were able to perform MapReduce computation directly on the drive with performance comparable to Hadoop running on BlueWave. We also show a significant reduction in data leaving the active drive during computation, thereby significantly increasing throughput.

Committee Members: Drs. Milton Halem (Advisor), Yelena Yesha, John Dorband, Charles Nicholas, Curt Tilmes

PhD defense: Deep Representation of Lyrical Style and Semantics for Music Recommendation

Dissertation Defense

Deep Representation of Lyrical Style and Semantics for Music Recommendation

Abhay L. Kashyap

11:00-1:00 Thursday, 20 July 2017, ITE 346

In the age of music streaming, effective recommendations are important for music discovery and a personalized user experience. Collaborative filtering based recommenders suffer from popularity bias and cold-start problems, which are commonly mitigated by content features. For music, research in content-based methods has mainly focused on the acoustic domain, while lyrical content has received little attention. Lyrics contain information about a song’s topic and sentiment that cannot be easily extracted from the audio. This is especially important for lyrics-centric genres like Rap, which was the most streamed genre in 2016. The goal of this dissertation is to explore and evaluate different lyrical content features that could be useful for content, context and emotion based models for music recommendation systems.

With Rap as the primary use case, this dissertation focuses on featurizing two main aspects of lyrics: their artistic style of composition and their semantic content. For lyrical style, a suite of high-level rhyme density features is extracted in addition to literary features like the use of figurative language, profanity and vocabulary strength. In contrast to these engineered features, Convolutional Neural Networks (CNN) are used to automatically learn rhyme patterns and other relevant features. For semantics, lyrics are represented using both traditional IR techniques and the more recent neural embedding methods.
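As one illustration of what a "traditional IR technique" for representing lyrics might look like, the sketch below builds TF-IDF vectors and compares them with cosine similarity. TF-IDF is used here as an assumed stand-in; the abstract does not say which IR representations the dissertation uses beyond mentioning LSA. The toy lyrics and the function names `tfidf_vectors` and `cosine` are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    numerator = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    denominator = (math.sqrt(sum(w * w for w in u.values()))
                   * math.sqrt(sum(w * w for w in v.values())))
    return numerator / denominator if denominator else 0.0


# Toy "lyrics", invented for illustration only.
lyrics = ["money cars money", "money cars fame", "love heart rain"]
vecs = tfidf_vectors(lyrics)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Songs that share vocabulary score above zero, while songs with no terms in common score exactly zero, which is one limitation that dense neural embeddings are meant to address.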

These lyrical features are evaluated for artist identification and compared with artist and song similarity measures from a real-world collaborative filtering based recommendation system from Last.fm. It is shown that both rhyme and literary features serve as strong indicators to characterize artists, with feature learning methods like CNNs achieving comparable results. For artist and song similarity, a strong relationship was observed between these features and the way users consume music, while neural embedding methods significantly outperformed LSA. Finally, this work is accompanied by a web application, Rapalytics.com, that is dedicated to visualizing all these lyrical features and has been featured in a number of media outlets, most notably Vox, attn: and Metro.

Committee: Drs. Tim Finin (chair), Anupam Joshi, Tim Oates, Cynthia Matuszek and Pranam Kolari (Walmart Labs)

PhD Proposal: Analysis of Irregular Event Sequences using Deep Learning, Reinforcement Learning & Visualization

Analysis of Irregular Event Sequences using Deep Learning, Reinforcement Learning, and Visualization

Filip Dabek

11:00-1:00 Thursday 13 July 2017, ITE 346, UMBC

History is nothing but a catalogued series of events organized into data. Amazon, the largest online retailer in the world, processes over 2,000 orders per minute. Orders come from customers on a recurring basis through subscriptions or as one-off spontaneous purchases, resulting in each customer exhibiting their own behavioral pattern in the way they place orders throughout the year. For a company such as Amazon, which generates over $130 billion of revenue each year, understanding and uncovering the hidden patterns and trends within this data is paramount in improving the efficiency of their infrastructure, ranging from the management of the inventory within their warehouses to the distribution of their labor force and the preparation of their online systems for the load of users. With the ever-increasing availability of big data, problems such as these are no longer limited to large corporations but are experienced across a wide range of domains and faced by analysts and researchers each and every day.

While many event analysis and time series tools have been developed for the purpose of analyzing such datasets, most approaches tend to target clean and evenly spaced data. When faced with noisy or irregular data, the usual recommendation is a pre-processing step that converts and transforms the data into a regular form. This transformation arguably interferes at a fundamental level with how the data is represented, and may irrevocably bias the way in which results are obtained. Therefore, operating on raw data, in its noisy natural form, is necessary to ensure that the insights gathered through analysis are accurate and valid.
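A small Python sketch can make the concern about regularization concrete. It is only an illustration under assumed inputs, not a method from the proposal: two invented event sequences become indistinguishable once binned into fixed-width intervals, even though their raw inter-event gaps differ. The function names and timestamps are hypothetical.

```python
def inter_event_gaps(timestamps):
    """Gaps between consecutive events, computed on the raw timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def bin_counts(timestamps, width):
    """Regularize a sequence by counting events in fixed-width bins."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) // width) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - start) // width)] += 1
    return counts


# Two invented sequences with different timing patterns.
events_a = [0, 1, 2, 30]
events_b = [0, 9, 9.5, 30]

print(bin_counts(events_a, 10), bin_counts(events_b, 10))
print(inter_event_gaps(events_a), inter_event_gaps(events_b))
```

With a bin width of 10, both sequences reduce to the same count vector, so any analysis run on the binned form cannot tell them apart; the raw gaps preserve the difference.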

In this dissertation, novel approaches are presented for analyzing irregular event sequences using a variety of techniques, including deep learning, reinforcement learning, and visualization. We show how common tasks in event analysis can be performed directly on an irregular event dataset without requiring a transformation that alters the natural representation of the process from which the data was captured. The three tasks that we showcase are: (i) summarization of large event datasets, (ii) modeling the processes that create events, and (iii) predicting future events that will occur.

Committee: Drs. Tim Oates (Chair), Jesus Caban, Penny Rheingans, Jian Chen, Tim Finin

PhD defense: ACCESS: Adaptive Contactless Capacitive Electrostatic Sensing System

Dissertation Defense

ACCESS: An Adaptive Contactless Capacitive Electrostatic Sensing System

Alexander Nelson

10:30-12:30 Thursday, 13 July 2017, ITE 325, UMBC

Technological miniaturization and low-power systems have precipitated an explosive growth in capability and adoption of wearable sensors. These kinds of sensors can be applied to many medical and rehabilitative applications, including as an assistive interface. The overarching theme of this thesis is the development of fabric capacitor sensor arrays as a holistic, wearable, touchless sensing solution. These fabric sensors are lightweight, flexible, and can therefore be integrated into items of everyday use. Further, the capacitive sensing hardware is low-power, unobtrusive, and maintainable.

Additionally, gesture recognition is expanded in this work to touchless capacitor sensor arrays through the ideation, development, and evaluation of an adaptive signal processing algorithm. The algorithm comprises a hierarchy of data reduction techniques that enable real-time processing on a low-power embedded microcontroller. Using a set of adaptive techniques, the system allows for recognition of gestures of different sizes and rotations as well as gestures with noisy or jittery motions. These adaptations accommodate a range of mobility that encompasses a large portion of people with upper-extremity mobility impairments.

The system is developed as an assistive device, with application to environmental control as a Smart-Home controller. The research is conducted with advisement from medical professionals and private consultants, and evaluated in clinical trials by individuals with upper-extremity mobility impairment.

Committee: Drs. Nilanjan Banerjee (Chair), Ryan Robucci (Co-Chair), Chintan Patel, Sandy McCombe-Waller (University of Maryland Medical School), Susan Fager (Madonna Rehabilitation Hospital)

Workshop on Solvers for Large, Sparse Linear Systems, July 17-18

Workshop on Solvers for Large, Sparse Linear Systems

Monday and Tuesday, 17-18 July 2017
Engineering Room 022, UMBC

UMBC will host a free, two-day workshop for faculty and students on solvers for large, sparse linear systems on Monday and Tuesday, July 17-18 in Engineering 022 at UMBC. Thanks to UMBC Prof. Matthias Gobbert for organizing and to University of Kassel Prof. Andreas Meister for presenting. If you plan on attending, please RSVP online.

The simulation of real-life applications is crucially important in a wide variety of scientific as well as industrial areas. The performance of the whole numerical method often depends decisively on the properties of the incorporated solver for linear systems of equations.

The course provides a comprehensive introduction to both classical and modern iterative solvers for a stable, efficient and reliable solution of linear systems and is designed for students from many disciplines, including Mathematics, Engineering, Physics, Computer Science, Computer Engineering and Electrical Engineering.

The course content covers

  • Introduction to basics from numerical linear algebra
  • Splitting methods
  • Multi-grid schemes
  • Krylov subspace methods like CG, GMRES, BiCG, CGS, BiCGSTAB
  • Preconditioning

The lectures will be accompanied by practical exercises in MATLAB.
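For readers unfamiliar with the topics listed above, here is a minimal sketch of two of them, the Jacobi splitting method and the method of conjugate gradients (CG), written in plain Python rather than the MATLAB used in the workshop exercises. It is an illustration on an assumed 2×2 symmetric positive definite, diagonally dominant system, not workshop material; dense lists are used for brevity, whereas practical solvers operate on sparse matrix formats.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def jacobi(A, b, iterations=100):
    """Jacobi splitting method: x_new[i] = (b[i] - sum_{j != i} A[i][j] x[j]) / A[i][i]."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Assumed example system; its exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
print(jacobi(A, b), conjugate_gradient(A, b))
```

Jacobi converges here because the matrix is strictly diagonally dominant, while CG relies on the matrix being symmetric positive definite; neither assumption holds for arbitrary systems, which is where methods such as GMRES and preconditioning come in.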

Monday, July 17, 2017

08:30-09:00 Coffee/tea
09:00-10:30 Lecture: Introduction to Splitting Methods
10:30-11:00 Coffee break
11:00-12:00 Lecture: Jacobi-, Gauss-Seidel Methods and Relaxation Techniques
12:00-13:30 Exercise on Splitting Methods
13:30-14:30 Lunch break (participants on their own)
14:30-15:30 Lecture: Method of Conjugate Gradients
15:30-16:00 Coffee break
16:00-17:30 Exercise on Method of Conjugate Gradients

Tuesday, July 18, 2017

08:30-09:00 Coffee/tea
09:00-10:30 Lecture: Principles of Multigrid Methods
10:30-11:00 Coffee break
11:00-12:30 Lecture: GMRES, BiCG, BiCGSTAB
12:30-13:30 Lunch break (participants on their own)
13:30-15:00 Exercise on Multigrid and Krylov Subspace Methods
15:00-15:30 Coffee break
15:30-16:30 Lecture: Preconditioning
16:30-17:00 Concluding Discussion

The workshop will be presented by Prof. Dr. Andreas Meister from the Institute for Mathematics, University of Kassel, Germany. He is an internationally renowned researcher in Numerical Analysis with a specialization that includes iterative solvers for linear systems of equations. These methods are modern and form the basis of the numerical kernels in modern software, such as COMSOL, MATLAB, PETSc, and many others. Prof. Dr. Meister taught classes at UMBC in Fall 2013, when he spent a sabbatical at UMBC as part of the partnership between UMBC and the University of Kassel in Germany.

This workshop is hosted by the UMBC High Performance Computing Facility. Light refreshments are graciously sponsored by the UMBC Division of Information Technology.

Cybersecurity Scholarships for UMBC students

Applications sought for major UMBC cybersecurity scholarships

NSF CyberCorps: Scholarship For Service (SFS)

Scholarships for careers in cybersecurity. Earn full tuition, fees, stipends ($22,500 – $34,000), and more ($2,000 books, up to $3,000 health benefits, $4,000 professional expenses). For BS, MS, MPS, or PhD in CS, CE, IS, Cyber or related fields. USA citizenship or permanent residency required. Contact Dr. Alan Sherman, who will send you an application.

In academic year 2017-2018, UMBC will support a total of about six additional SFS Scholars at the BS, MS, MPS, and PhD levels in CS and related fields. Each scholarship is potentially for up to the final two years (three years for PhD and combined BS/MS). Interested full-time degree students should contact  and visit the CISA scholarship page.

Each scholarship covers full tuition, fees, travel, books, and academic year stipend of $34,000 for MS/MPS/PhD, and $22,500 for BS. Applicants must be US citizens or permanent residents capable of obtaining a SECRET or TOP SECRET clearance. Each scholar must work for the federal, state, local, or tribal government (for pay) for one year for each year of award.

Awards made for 2017-2018 will be for one year only, with the potential of renewal if funding permits (we should know by August 31, 2017). The number of awards to be made will be determined by available funds, since there are differences in costs depending on level and in-state status (we have approximately $352,000 to award in 2017-2018).

All applications must be submitted in paper form with official transcripts and signed original letters on letterhead—no staples, folders, or binders.

Application Deadline: 12 noon, Friday, July 14, 2017. If positions remain open after the deadline, we will continue to accept applications until classes start.

See https://www.sfs.opm.gov/ and http://www.cisa.umbc.edu for more details.

UMBC Data Science Graduate Programs Start in Fall 2017

 

UMBC Data Science Graduate Programs

UMBC’s Data Science Master’s program prepares students from a wide range of disciplinary backgrounds for careers in data science. In the core courses, students will gain a thorough understanding of data science through classes that highlight machine learning, data analysis, data management, ethical and legal considerations, and more.

Students will develop an in-depth understanding of the basic computing principles behind data science, including, but not limited to, data ingestion, curation and cleaning and the 4Vs of data science (Volume, Variety, Velocity, Veracity), as well as the implicit 5th V, Value. By applying principles of data science to the analysis of problems within specific domains expressed through the program pathways, students will gain practical, real-world, industry-relevant experience.

The MPS in Data Science is an industry-recognized credential and the program prepares students with the technical and management skills that they need to succeed in the workplace.

Why Data Science?

  • Organizations have a growing need for employees who are experts in the management and interpretation of big data;
  • Our classes are taught by industry experts who combine their professional experience with theory to provide a rigorous classroom experience; and
  • Our small classes are taught with a mix of in-person and online instruction, providing students the best of an in-classroom experience while allowing for work-school life balance.

Why UMBC?

The Data Science graduate program at UMBC is designed to respond to the growing regional and national demand for professionals with data science knowledge, skills, and abilities. Bringing together faculty from a wide range of fields who have a deep understanding of the real-world applications of data analytics, UMBC’s Data Science program prepares students for the workplace through hands-on experience, rigorous academics, and access to a robust network of knowledgeable industry professionals. UMBC’s graduate programs in Data Science offer a wide variety of benefits:

  • Exceptional faculty. The Data Science curriculum brings together UMBC’s Departments of Computer Science & Electrical Engineering, Information Systems, Mathematics and Statistics, and several departments from the social sciences to provide students with a rigorous and thorough base of knowledge. Faculty have particular strengths in addressing critical social questions through the application of data science.
  • Rigorous research. UMBC is classified by the Carnegie Foundation as a Research University (High Research Activity).
  • National recognition. For six years running (2009-2014), UMBC was ranked #1 in the U.S. News and World Report’s list of “national up-and-coming” universities.
  • Convenient classes. Classes are conveniently offered in the evening on UMBC’s main campus, located just ten minutes from BWI Airport, with easy access to I-95 and the 695 Beltway.

For more information and to apply online, see the Data Science MPS site.

ABC features UMBC cybersecurity student scholars

Students at UMBC are learning how to hack into systems and prevent attacks. They study hardware, software and the tools in between.

 

Jamie Costello from ABC’s Baltimore affiliate WMAR has a short video feature, UMBC is on a mission to crack the code, on UMBC students who are studying and doing research on the cybersecurity of computing hardware, software and systems.

If you walk through your door and notice your home computer in pieces scattered throughout the house, call UMBC.

In the old days, parents wanted their children to grow up to become doctors and lawyers; now it’s about becoming cyber security experts.

A select group of students at UMBC knew this was for them. Some tore computers apart. Some knocked XBOX players off their game on purpose. And one student, while in high school and with the school’s blessing, hacked into the school’s security camera system.

Jobs are like gnats on a summer night; college graduates are swatting the offers away. And the pay is good, really good.

Students are learning how to hack into systems and then prevent such attacks. They are studying hardware, software and tools in between. The more we invent and tie into the internet, the more cyber security experts are needed.

Virtual Reality Design for Science student projects, 12-1:30 Wed. 5/10, ITE 201b

 

Everyone is invited to see presentations and demonstrations of six class projects done by the 17 students in CMSC 491/691, Virtual Reality Design for Science, taught by CSEE Professor Jian Chen this spring. The demonstrations and presentations will take place 12:00-1:30pm Wednesday, 10 May 2017 in the π² Immersive Hybrid Reality Lab located in room 201b in the ITE building. Join us in this new adventure to explore ideas and foster interaction and interdisciplinary science. Pizza will be provided.

  • Utilizing VR simulations to study the effect of food labeling on college students’ meal choices, by Elsie, Kristina, and Michael
  • Integrating spatial-and-non-spatial approaches for interactive quantum physics data analyses, by Henan, John, and Nick
  • Analyzing the benefits of immersion for environmental research, by Caroline, James, and Peter
  • CPR training effectiveness, by Joey, Justin, and Zach
  • Quantitative measurement of cosmological pollution visualization, by Kyle, Pratik, and Vineet
  • Memorable mobile-VR-based campus tour, by Abhinav and Vincent

Support for this new course was provided by an award from the UMBC Hrabowski Fund for Innovation to CSEE Professors Jian Chen, Marc Olano and Adam Bargteil.  The project-oriented class introduces students to the use of hybrid reality displays, 3D modeling, visualization and fabrication to conduct and analyze scientific research. The new course embraces the university’s goal of advancing interdisciplinary and multidisciplinary research activity.

The UMBC π² Immersive Hybrid Reality Lab is funded by a $360,000 NSF award, with additional support from Next Century Corporation. In the lab, users wear 3D glasses with sensors attached to them and operate handheld controls that allow them to sensorially immerse themselves in data, which appears on dozens of high-resolution screens that are precisely aligned to work together. Users control the data by manipulating it in the space around them. The user’s body is fairly stationary, but the brain thinks the body is moving within the virtual world. The lab brings together tools “that will allow humans and the computer to augment each other,” notes Dr. Chen.

talk: Data-Driven Applications in Smart Cities, 1pm Fri May 5

UMBC CSEE Seminar Series

Data-Driven Applications in Smart Cities—Data and Energy Management in Smart Grids

Zhichuan Huang
University of Maryland, Baltimore County

1:00-2:00pm, Friday, 5 May 2017, ITE 231

The White House announced the Smart Cities Initiative with a $160 million investment to address the emerging challenges of inevitable urbanization. Under the scope of this initiative, my work takes a data-driven approach to emerging problems in smart energy systems for connected communities, spanning sensing hardware design, streaming data collection, data analytics and privacy, system modeling and control, and application design and deployment. In this talk, I will focus on an example of data-driven solutions for data and energy management in smart grids. I will first show how to collect energy data from large-scale deployments of low-cost smart meters while minimizing the communication and storage overhead. Then I will show how we can conduct energy data analytics on the collected data and use the results for real-time energy management in a microgrid to minimize the operational cost. Finally, I will present the real-world impact of my research and some future work on CPS in smart cities.

 

Zhichuan Huang is a Ph.D. candidate in the Department of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County. He is interested in incorporating big data analytics in Cyber-Physical Systems (also known as the Internet of Things in some contexts) for data-driven applications in Smart Connected Communities. His current focus is on data-driven solutions for smart energy systems, spanning sensing hardware design, streaming data collection, data analytics and privacy, system modeling and control, and application design and deployment. His technical contributions have led to more than 20 papers, including 14 first-author papers in premier venues, e.g., IEEE BigData, ICCPS, IPSN, and RTSS, and a best paper runner-up award at BuildSys 2014.

Organizer: Tulay Adali

About the CSEE Seminar Series: The UMBC Department of Computer Science and Electrical Engineering presents technical talks on current significant research projects of broad interest to the Department and the research community. Each talk is free and open to the public. We welcome your feedback and suggestions for future talks.
