MS Thesis Defense

Android Malware Detection and Classification
using Machine Learning Techniques

Satyajit Padalkar

10:30am Wednesday, 30 July 2014, ITE 325b

Android is a popular mobile operating system, and there exist multiple marketplaces for Android applications. Most of these marketplaces allow applications to be signed using self-signed certificates. Due to this practice, there is very limited control over the kind of applications being distributed. In addition, advances in Android root kits are making it increasingly easy to repackage existing Android applications with malicious code. Conventional signature-based techniques fail to detect such malware, so detecting and classifying Android malware is a very difficult problem. We present a method to detect and classify such malware by performing a dynamic analysis of system call sequences. We use machine learning techniques to build multiple models using distributions of syscalls as features. Using these models, we predict whether a given application is malicious or benign, and we also try to assign a given application to a specific known malware family. We also explore deep learning methods such as stacked denoising autoencoders (SdA) and their effectiveness. We experimentally evaluate our methods using a real dataset of 600 applications from 38 malware families and 25 popular benign applications from various areas. We find that the deep learning algorithm (SdA) is the most accurate at detecting malware, with the fewest false positives, while AdaBoost performs better at classifying malware into families.
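
As an illustration of what "distributions of syscalls as features" could look like, the sketch below turns one recorded system call trace into a normalized frequency vector. It is a minimal sketch under stated assumptions, not the method used in the thesis: the vocabulary SYSCALLS, the function name syscall_distribution, and the sample trace are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical fixed vocabulary of syscall names (an assumption for this sketch);
# in practice it would be derived from the collected traces.
SYSCALLS = ["read", "write", "open", "close", "ioctl", "sendto", "recvfrom"]


def syscall_distribution(trace, vocabulary=SYSCALLS):
    """Return the relative frequency of each vocabulary syscall in the trace.

    Calls outside the vocabulary are ignored. An empty trace (or one with
    no known calls) yields an all-zero vector.
    """
    counts = Counter(name for name in trace if name in vocabulary)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[name] / total for name in vocabulary]


# Made-up trace for one application run (illustrative data only).
trace = ["open", "read", "read", "write", "close", "ioctl", "read", "sendto"]
features = syscall_distribution(trace)
```

A vector like this, computed once per application, could then be passed to any standard classifier, for example a boosted ensemble for family classification.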

Committee: Drs. Anupam Joshi (chair), Tim Finin and Charles Nicholas