DAX 2022: a one-day data science conference at UMBC, Sat. June 4

A one-day, in-person conference on data science, analytics, and data exploration, with food, drinks, and networking with experts in the field

DAX 2022

A one-day data science conference at UMBC

Saturday, 4 June 2022


The DAX 2022 Conference will focus on data science, analytics, and general data exploration. Engineers, data scientists, analytic developers, system architects, and business leaders are encouraged to share their experiences and present a topic that would be of interest to the local data community. Expected attendees include engineers, thought leaders, business leaders, and professionals from local government, government defense and intelligence agencies, start-up companies, large data analytic and data science companies, and local universities.

For more information and to register, see the DAX 2022 site. Special registration rate for students!

talk: Iterative Preconditioning for Accelerating Machine Learning Problems, 12-1 4/27

ArtIAMAS Seminar Series
Co-organized by UMBC, UMCP, and Army Research Lab

Iterative Preconditioning for
Accelerating Machine Learning Problems

Nikhil Chopra
Mechanical Engineering, UMCP

12-1 ET Wed. 27 April 2022, WebEx

We study a new approach to accelerating machine learning problems in this talk. The system comprises multiple agents, each with a set of local data points and an associated local cost function. The agents are connected to a server, and there is no inter-agent communication. The agents’ goal is to learn a parameter vector that optimizes the aggregate of their local costs without revealing their local data points. We propose an iterative preconditioning technique to mitigate the deleterious effects of the cost function’s conditioning on the convergence rate of distributed gradient-descent. Unlike the conventional preconditioning techniques, the pre-conditioner matrix in our proposed technique updates iteratively to facilitate implementation on the distributed network. In the particular case when the minimizer of the aggregate cost is unique, our algorithm converges superlinearly. We demonstrate our algorithm’s superior performance in machine learning, distributed estimation, and beamforming problems, thereby demonstrating the proposed algorithm’s efficiency for distributively solving nonconvex optimization problems.
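
To make the idea in the abstract concrete, here is a minimal Python sketch of server-based gradient descent with an iteratively updated preconditioner. It is not the speaker's implementation. It assumes each agent's local cost is a least-squares term, uses a simple Richardson-style update that drives the preconditioner K towards the inverse of the aggregate Hessian, and takes the step size from the Hessian trace. The names Agent, solve, and the small matrix helpers are hypothetical and exist only for this illustration.

```python
# Sketch only: least-squares local costs and a Richardson-style update are assumptions.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B, scale=1.0):
    # Element-wise A + scale * B for two matrices of the same shape.
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

class Agent:
    """Holds private rows (A, b); its local cost is 0.5 * ||A x - b||^2."""

    def __init__(self, A, b):
        self.A, self.b = A, b
        self.H = matmul(transpose(A), A)  # local Hessian A^T A

    def gradient(self, x):
        residual = [p - y for p, y in zip(matvec(self.A, x), self.b)]
        return matvec(transpose(self.A), residual)

    def hessian_times(self, K):
        return matmul(self.H, K)

    def hessian_trace(self):
        return sum(self.H[i][i] for i in range(len(self.H)))

def solve(agents, dim, steps=50):
    """Server loop: agents send gradients and Hessian products, never raw rows."""
    x = [0.0] * dim
    K = [[0.0] * dim for _ in range(dim)]
    identity = [[float(i == j) for j in range(dim)] for i in range(dim)]
    # 1 / trace(H) is at most 1 / lambda_max(H), so the K update is stable.
    alpha = 1.0 / sum(agent.hessian_trace() for agent in agents)
    for _ in range(steps):
        HK = [[0.0] * dim for _ in range(dim)]
        g = [0.0] * dim
        for agent in agents:
            HK = add(HK, agent.hessian_times(K))
            g = [gi + ai for gi, ai in zip(g, agent.gradient(x))]
        # Preconditioner update: K <- K - alpha * (H K - I), so K approaches H^{-1}.
        K = add(K, add(HK, identity, scale=-1.0), scale=-alpha)
        # Preconditioned gradient step using the refreshed K.
        x = [xi - di for xi, di in zip(x, matvec(K, g))]
    return x
```

As K approaches the inverse Hessian, each step behaves more like a Newton step, which is the intuition behind the faster-than-linear convergence mentioned in the abstract. Calling solve on a list of Agent objects with dim set to the parameter dimension returns the estimated parameter vector.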

Dr. Nikhil Chopra is a Professor in the Department of Mechanical Engineering at the University of Maryland, College Park. He received a Bachelor of Technology (Honors) degree in Mechanical Engineering from the Indian Institute of Technology, Kharagpur, India, in 2001, an M.S. degree in General Engineering in 2003, and a Ph.D. degree in Systems and Entrepreneurial Engineering in 2006 from the University of Illinois at Urbana-Champaign. His current research interests are in the areas of nonlinear control, robotics, and machine learning. He is the co-author of the book Passivity-Based Control and Estimation in Networked Robotics. He is currently an Associate Editor of Automatica and was previously an Associate Editor of IEEE Transactions on Control of Network Systems and IEEE Transactions on Automatic Control.

talk: Machine Learning: New Methodology for Physical & Social Sciences, 1pm ET 3/24

24 hour LIDAR backscatter profiles and PBLH points generated from image machine learning system

The Infusion of Machine Learning as a New Methodology for the Physical and Social Sciences

Dr. Jennifer Sleeman
CSEE, UMBC

1:00-2:00 pm ET, Wednesday, March 24
Online via WebEx


Machine learning has made improvements in many areas of computing. Recently attention has been given to infusing social science methodology with machine learning. In addition, the physical sciences have begun to embrace machine learning to augment their physical parameterization and to discover new features in their computations. I will describe my work that relates to these new emerging areas of research. I will first describe our machine learning research efforts related to understanding the changing role of climate and its effects on society. I will describe how this methodology was also applied to understanding cyber-related exploits. As part of this work, I developed an expertise in generative modeling, which led to a patent in generative and translation-based methods applied to imagery. These ideas were fundamental to a contribution in machine learning using quantum annealing. Quantum computing holds promise for deep learning to reach model convergence faster than classical computers. I will describe work related to developing a new hybrid method that overcame qubit limitations for image generation. 

In addition, I will describe my current work related to machine learning for the Physical Sciences. As part of a multi-disciplinary team from UMBC and other universities, my current work explores ways to augment and replace existing physical parameterizations with neural network based models. I have led a research effort to calculate the planetary boundary layer’s height (PBLH) used for ceilometer-based backscatter profiles and satellite-borne lidar instruments. This work addresses the largest uncertainty in climate change, namely the role of aerosols (dust, carbon, sulfates, sea salt, etc.). We employ a novel method that includes a deep segmentation neural network that uses near-time continuous profiles forming an image to determine boundary layer heights. This method overcomes limitations in wavelet approaches which are unable to identify the PBLH under certain conditions. I will also give a preview of two efforts related to Long Short Term Memory (LSTM) neural networks related to learning PBLH changes over time. These research efforts result from collaborations with two students in the UMBC CSEE department and are being published and presented at the AAAI 2021 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physics Sciences. 


Dr. Jennifer Sleeman is a Research Assistant Professor in Computer Science at the University of Maryland, Baltimore County (UMBC). Her research interests include generative models, natural language processing, semantic representation, image generation, and deep learning. Dr. Sleeman received the prestigious recognition of being a 2019 EECS Rising Star. She was also recognized in 2017 as one of the best Data Scientists in the Washington, DC region by DCFemTech. She defended her Ph.D. thesis, Dynamic Data Assimilation for Topic Modeling (DDATM), in 2017 under Tim Finin and Milton Halem. Her thesis-related work was awarded a Microsoft “AI for Earth” resource grant in 2017 and 2018 and also won the best paper award in the Semantic Web for Social Good Workshop presented at the International Semantic Web Conference in 2018. She was an invited guest panelist at the AI for Social Good AAAI Fall Symposium in 2019 and was also an invited keynote speaker at the Sixth IEEE International Conference on Data Science and Engineering (ICDSE 2020), where she presented her ideas related to AI for Social Good and Science. She is an active research scientist in generative deep learning methods, for which she holds a patent. She has over 12 years of machine learning experience and over 22 years of software engineering experience, in both academic and government/industry settings. She is currently funded by NASA and NOAA (PI). She also teaches Introduction to Artificial Intelligence at UMBC and currently mentors two Master’s students.

Visiting Prof. Ed Raff’s forthcoming book: Inside Deep Learning





Congratulations to Dr. Edward Raff on his forthcoming book Inside Deep Learning, being published by Manning. The first three chapters are now available free online via Manning’s Early Access Program, with more to come. Dr. Raff is a Chief Scientist at Booz Allen Hamilton and both an alumnus of and visiting assistant professor in the UMBC CSEE department.

He describes the target audience for his book as “the middle between ‘give me a tool’ and ‘CS/Stats/ML Ph.D. graduate book’ that gives utility and understanding.” He gives thanks to his UMBC students in his Computer Science and Data Science classes who have been “guinea pigs for this book/course material.”

Here’s how the publisher describes the book: “Inside Deep Learning is a fast-paced beginners guide to solving common technical problems with deep learning. Written for everyday developers, there are no complex mathematical proofs or unnecessary academic theory. You’ll learn how deep learning works through plain language, annotated code, and equations as you work through dozens of instantly useful PyTorch examples. As you go, you’ll build a French-English translator that works on the same principles as professional machine translation and discover cutting-edge techniques just emerging from the latest research. Best of all, every deep learning solution in this book can run in less than fifteen minutes using free GPU hardware!”

Ed Raff received a Ph.D. in Computer Science in 2018 with a dissertation on “Malware Detection and Cyber Security via Compression.” He is currently a Chief Scientist at Booz Allen Hamilton. He has done research on deep learning, malware detection, reproducibility in machine learning, detecting fairness and bias in machine learning models and data analytics, and high-performance computing. He has also been a visiting Assistant Professor at UMBC since 2018 and taught in both the Computer Science and Data Science programs. Dr. Raff has over 40 peer-reviewed publications, three best paper awards, and has presented at many major conferences.

talk: Medical Informatics – Promise and Barriers Towards Precise Medicine, 10am ET Mon 11/23, Webex

Dr. Mira Marcus-Kalish of Tel Aviv University

CARTA Distinguished Lecture Series

Medical Informatics – Promise and Barriers Towards Precise Medicine


Dr. Mira Marcus-Kalish
Director, International Research Affairs

Tel Aviv University


10:00 am-12:00 pm ET, Monday, 23 Nov. 2020

Online via Webex


The challenges of the pandemic have forced us to view the broad picture of the human being and their surroundings as one functioning system across countries and continents. We need to relate both to the Micro (including in-body, physical, and mental conditions) and the Macro (such as environmental, cultural, and economic factors), providing a comprehensive understanding of how the human body functions in its surroundings and moving towards a precise, personalized definition of a “disease signature,” especially these days. A systematic literature review on the “disease signature” term revealed no clear definition. In many articles, the “disease signature” phrase appears as a single biomarker (often genetic), mainly related to neurology or oncology (Stemmer, A. et al., 2019, Journal of Molecular Neuroscience, 67(4)). The major goal is the unity of nature, science, and technology, from the nanoscale towards converging knowledge and tools, at a confluence of disciplines, as was envisioned by the NSF in 2001 (NBIC) and further at the joint EU-US WTEC effort “Convergence of Knowledge, Technology and Society,” Roco et al., Springer 2013.

The COVID-19 global health emergency increased the need for early, precise diagnosis and treatment while facing major physical and mental threats and stress, such as Post Traumatic Stress Disorder (PTSD). These understandings reemphasized the need to join all forces and to converge, verify, and embed all knowledge, expertise, and new advanced technologies across the various disciplines. Furthermore, they reinforced the need to verify data originating from various sources while bridging all cultural, conceptual, curation, and technology barriers, preserving privacy and ethics regulations, and ensuring reliable advanced analysis tools. All of the above provide profound insight into how the human body and brain function in their surroundings, as well as a reliable “Disease Signature,” followed by suitable therapeutic treatment.

The question to be asked: Are we able to collect big enough data, sufficiently distributed and representative, while bridging all barriers and applying accurate analysis tools, to ensure reliable, replicable, reproducible outcomes towards precise, personalized medicine? The Brain Medical Informatics Platform (MIP), developed by the EU Human Brain Flagship Project as part of the EBRAINS platform, is a key feasibility study along these lines. It involves broad clinical data collections from 30 hospitals, converging knowledge and data, embedding new technologies for data privacy, preservation, and curation, as well as sophisticated analysis tools. The goal of the MIP and EBRAINS framework is to identify “BRAIN Disease Signatures” towards reliable medical treatment. A 3C (Categorize, Classify, Cluster) Methodology, developed in our lab, is one of the tools available on the MIP. It incorporates expert medical knowledge and experience into the analysis process of disease manifestation and potential biomarkers towards reliable insights. The 3C approach was applied to the ADNI (Alzheimer’s Disease Neuroimaging Initiative) cohort, discovering associations with new subtypes, which were later verified using clinical data from the Rome Gemelli hospital labs. Other case studies include Parkinson’s disease genetic and biomarker research (Tal Kozlovski et al., 2019, Frontiers in Neurology, Movement Disorders) as well as PTSD research (Ben-Zion et al., 2020, Translational Psychiatry), both in collaboration with the Tel Aviv Medical Center.

Providing “Healthy Aging” to the elderly is a perfect example encompassing all of this, as the elderly have become one of the vulnerable groups at risk. The loneliness and isolation forced by the current pandemic result in severe conditions, including stress disorders and PTSD. Thus, an international “Healthy Aging” initiative was established at TAU, promoting broad interdisciplinary research that combines knowledge, data analysis, and advanced technologies from most areas of science, including economics, art, social sciences, mental and physical health, lifestyle, and engineering. All of this aims to ensure the best-fitted, reliable treatment and a balanced quality of life for the elderly in general, and these days in particular.


Dr. Mira Marcus-Kalish is the Director of International Research Collaborations at Tel Aviv University. Her main areas of research are mathematical modeling, converging technologies, and data mining. Dr. Kalish holds a Ph.D. in Operations Research from the Technion, Israel Institute of Technology, where she developed one of the first computerized systems for electrocardiogram (ECG) diagnosis. Her postdoctoral training was at Harvard University, the MBCRR (Molecular Biology Computer Research and Resource) laboratory, and at the Dana Farber Cancer Institute. She was awarded her B.Sc. in Statistics and Biology from the Hebrew University of Jerusalem.

talk: Exploding Blockchain Myths, 5:30pm Tue 10/13


UMBC Data Science Meetup Talk

Exploding Blockchain Myths

Maria Vachino and Dr. James P. Howard

5:30-7:00pm Tuesday, 13 October 2020


In this talk, Maria Vachino from Easy Dynamics and Dr. James P. Howard from APL will provide an overview of what blockchain is and isn’t, focusing on non-cryptocurrency use cases. They will explain the results of their research for the DHS S&T Cybersecurity Directorate and provide insight into the value (or lack thereof) of the technology.

References:
https://ieeexplore.ieee.org/document/8965252/
http://jitm.ubalt.edu/XXX-3/article3.pdf

Maria Vachino is the Director of Digital Identity at Easy Dynamics where she is focused on Identity Credential & Access Management (ICAM) technologies, policies, & standards, Cybersecurity, and IT modernization for the US Federal Government. She started investigating applications for blockchain technology in 2015 as the Technical and Government Engagement Lead for the DHS S&T Cyber Security Directorate’s Identity Management Research & Development Program while a member of the Senior Professional Staff at the Johns Hopkins Applied Physics Lab. Maria has a BS in Computer Science from UMBC and an MS in Cybersecurity.

Dr. James P. Howard, II (UMBC Ph.D. ’14) is a scientist at the Johns Hopkins Applied Physics Laboratory. Previously, he was a consultant to numerous government agencies, including the Securities and Exchange Commission, the Executive Office of the President, and the United States Department of Homeland Security, and worked for the Board of Governors of the Federal Reserve System as an internal consultant on scientific computing. He is a passionate educator, teaching mathematics and statistics at the University of Maryland Global Campus since 2010 and has taught public management at Central Michigan University, Penn State, and the University of Baltimore. His most recent work has modeled the spread of infectious respiratory diseases and Ebolavirus, predicted global disruptive events, researched using blockchain for government services, and created devices for rescuing victims of building collapse. He is the author of two books.

UMBC Data Science Meetup: Data Analytics Challenges in Healthcare


Best Practices for Handling Data Analytics Challenges in Healthcare


Aaron Wilkowitz
Customer Engineer, Healthcare & Life Sciences, Google

5:30 – 7:00 pm EDT, Tuesday, 15 September 2020
free and online; register here to get the link


Aaron specializes in Healthcare & Federal and has worked with numerous private companies and federal agencies on achieving better healthcare outcomes and minimizing fraud through smarter data. Previously, Aaron worked at APT, a predictive analytics firm, helping Fortune 200 companies make better data-driven decisions.

Agenda
5:30 – 5:35 Welcome
5:35 – 6:30 Aaron Wilkowitz Talk
6:30 – 6:45 Q&A

Webex talk: John Mitchell: Will Blockchain Change Everything? Fri 3/27 1-2pm

Lockheed Martin Distinguished Speaker Series

Will Blockchain Change Everything?

Dr. John Mitchell
Mary and Gordon Crary Family Professor
Departments of Computer Science & Electrical Engineering
Stanford University

1:00-2:00pm EDT, Friday, 27 March 2020
Webex meeting hosted by Anupam Joshi
https://umbc.webex.com/meet/joshi

Far from serving only as a foundation for cryptocurrency, blockchain technology provides a general framework for trusted distributed ledgers. Over the past few years, their popularity has grown tremendously, as shown by the number of companies and efforts associated with the Linux Foundation’s Hyperledger project, for example. From a technical standpoint, a blockchain combines a storage layer, networking protocols, a consensus layer, and a programmable transaction layer, leveraging cryptographic operations. The distributed state machine paradigm provides atomicity and transaction rollback, while consensus supports distributed availability as well as certain forms of fair access. From an applications perspective, blockchains appeal to distributed networks of independent agents, as arise in supply chain, credentialing, and decentralized financial services. The talk will look at the potential for radical change as well as specific technical challenges associated with verifiable consensus protocols and trustworthy smart contracts.
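
As a rough illustration of the cryptographic linkage that underlies a ledger's storage layer, here is a minimal Python sketch of a hash-chained list of blocks. It is an assumption-laden toy, not a description of Hyperledger or of any system discussed in the talk: it has no networking, no consensus, and no smart contracts, and the function names block_hash, append_block, and is_valid are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, prev_hash, transactions):
    # Hash a canonical JSON encoding so the same content always gives the same digest.
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "tx": transactions}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, transactions):
    # Each new block commits to the hash of the block before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "tx": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })

def is_valid(chain):
    # Recompute every hash and check every back-link; any edit breaks the chain.
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["tx"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block stores the hash of its predecessor, changing an earlier transaction invalidates that block's stored hash, which is what makes tampering detectable. Deciding which chain the participants accept is the job of the consensus layer, which this sketch leaves out entirely.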

John Mitchell is the Mary and Gordon Crary Family Professor in the School of Engineering, Professor of Computer Science, co-director of the Stanford Computer Security Lab, and Professor (by courtesy) of Education. He was Vice Provost at Stanford University from 2012 to 2018. Mitchell’s research focuses on programming languages, computer and network security, privacy, and education. He has published over 200 research papers, served as editor of eleven journals, including Editor-in-Chief of the Journal of Computer Security, and written two books. He has led research projects funded by numerous organizations and served as advisor and consultant to successful small and large companies. His first research project in online learning started in 2009 when he and six undergraduate students built Stanford CourseWare, an innovative platform that served as the foundation for initial flipped classroom experiments at Stanford and helped inspire the first massive open online courses (MOOCs) from Stanford. Professor Mitchell currently serves as Chair of the Stanford Department of Computer Science.

❌ Canceled: UMBC Data Science Meetup: Rapid Data Exploration with Apache Drill ❌

❌ Canceled: UMBC Data Science Meetup:
Rapid Data Exploration with Apache Drill

5:30-7:00 pm 11 March 2020, UC 310, UMBC

Join Charles Givre for a hands-on introduction to data exploration with Apache Drill. Becoming a data-driven business means using all the data you have available, but a common problem in many organizations is that data is not optimally arranged for ad-hoc analysis. Through a combination of lecture and hands-on exercises, you’ll gain the ability to access previously inaccessible data sources and analyze them with ease. You’ll learn how to use Drill to query and analyze structured data, connect multiple data sources to Drill, and perform cross-silo queries. Study after study shows that data scientists and analysts spend between 50% and 90% of their time preparing their data for analysis. Using Drill, you can dramatically reduce the time it takes to go from raw data to insight. This workshop will show you how.

UMBC University Center, Room 310
March 11, 2020, from 5:30 pm to 7:00 pm
(5:30 – 6:00 pm) Social
(6:00 – 6:50 pm) Workshop: Rapid Data Exploration with Apache Drill
(6:50 – 7:00 pm) Question and Answer Session

Register on the Meetup page.

Note that we formally end our Q&A session at 7 pm (so that graduate students can catch their classes starting at 7:10 pm), but in our previous events we’ve seen that one-on-one and group discussions with the speaker(s) continue even after the Q&A session ends.

Speaker: Mr. Charles Givre works as a manager at JP Morgan Chase. Prior to joining Deutsche Bank, Mr. Givre worked as a Senior Lead Data Scientist for Booz Allen Hamilton for seven years, where he worked at the intersection of cybersecurity and data science. Mr. Givre has taught data science classes at BlackHat, the O’Reilly Security Conference, and the Center for Research in Applied Cryptography and Cyber Security at Bar Ilan University. One of Mr. Givre’s research interests is increasing the productivity of data science and analytic teams, and towards that end, he has been working extensively to promote the use of Apache Drill in security applications and is a committer and PMC Chair for the Drill project. Mr. Givre teaches online classes for O’Reilly about Drill and Security Data Science and is a coauthor of the O’Reilly book Learning Apache Drill. Mr. Givre holds a Master’s degree in Middle Eastern Studies from Brandeis University, as well as a Bachelor of Science in Computer Science and a Bachelor of Music, both from the University of Arizona. He blogs at thedataist.com and tweets @cgivre.

Complimentary food, such as pizza and chips, and non-alcoholic beverages will be provided.

Visitor parking spaces are located at Administration Drive Garage upper level, Commons Garage first level, Walker Avenue Garage upper level, Lot 9 and Lot 7 on Walker Avenue. Visitors do not need to pay for parking after 4:00 pm.

Join the UMBC Data Science Meetup group and register for this event here.

UMBC Data Science Meetup: Enabling Value-Based Health Care Using Modern Analytics Tools

Enabling Value-Based Health Care
Using Modern Analytics Tools

Daniel Pichardo and Dr. Xue Yang

UMBC will have its second Data Science meetup on February 27 at UMBC University Center, Room 310. Attendance is free; register here. Visitors can park for free (after 4:00 pm) at the parking lots marked with black arrows in the event photo. The program is as follows:

(5:30 – 6:00 pm) Social
(6:00 – 6:50 pm) Talk & Demo: Enabling Value-Based Health Care Using Modern Analytics Tools
(6:50 – 7:15 pm) Question and Answer Session

Speakers: Daniel Pichardo and Dr. Xue Yang

Speaker Bios: Danny Pichardo is a Senior Data Scientist at NewWave. He previously worked as a Statistician at the American Urological Association. He holds a B.Sc. in Statistics from UMBC. His interests and experience include prediction modeling and causal inference using real-world health data.

Xue Yang is a data scientist with a solid medical background. At NewWave, Xue works on data analysis and AI/ML model exploration that support a data exchange/feedback platform and AI/ML projects. Before joining NewWave, she built AI/ML models using medical insurance claims data at CareFirst. Xue has an M.P.S. in Data Science from UMBC, a Ph.D. in Genetics and an M.D. from China, and was a postdoc at Johns Hopkins Medical School at the Institute of Genetic Medicine.

Abstract: Value-based healthcare is a healthcare delivery model in which providers are paid based on patient health outcomes. Value-based healthcare requires measuring clinical outcomes and spotting population trends while incentivizing health care providers for the delivery of better healthcare.

NewWave is building next-generation data platforms to improve health outcomes and reduce waste by transforming the wealth of data CMS currently collects, which allows the Center for Medicare and Medicaid Innovation to fulfill its objectives of delivering value-based health care for citizens. We leverage cloud-based tools such as Snowflake, Looker, and Databricks to give health care providers a flexible platform to explore their patients’ data, and to enable data scientists to perform efficient data analyses, model development, and reporting. We will demo how these tools can seamlessly work together, enabling every step of the data science process. Please join us on this journey of transformation as we attempt to modernize and innovate in healthcare.

Parking: Visitor parking spaces are located at Administration Drive Garage upper level, Commons Garage first level, Walker Avenue Garage upper level, Lot 9 and Lot 7 on Walker Avenue. Visitors do not need to pay for parking after 4:00 pm. Register here.
