A Platform for Android Application Analysis
Update (June 13, 2016): Final Version of DroidSafe released
Today, we updated our public repo so that it represents the final version of our project source code. We have also released our modifications to Soot for our object-sensitive points-to analysis as a separate repo.
This final version of the code does not exactly reproduce the results in our NDSS 2015 paper. If you need the version of our code that was used for the NDSS 2015 paper, please email the group.
Note that the DroidSafe project is no longer active, and this code is no longer actively maintained. However, we encourage users to email the group with any questions.
The DroidSafe project developed novel program analysis techniques to diagnose and remove malicious code from Android mobile applications. The DroidSafe project was developed by MIT’s Center for Resilient Software, the Kestrel Institute, and Global InfoTek, Inc. The core of our system is a static information-flow analysis that operates on either Java bytecode for an application or an application’s APK. The DroidSafe team co-designed a semantic model of Android runtime behaviors and a static information-flow analysis to achieve acceptable precision, accuracy, and scalability for real-world Android applications.
Goals and Disclaimer
The goal of the DroidSafe project was to achieve a practical balance of scalability, precision, accuracy, and comprehensiveness, given the applications that DARPA wishes to analyze. Android is a huge framework, and many Android applications are huge, stressing modern static analysis techniques. The DARPA program under which DroidSafe was developed placed sensible restrictions on programs to be analyzed:
- Up to 50K LOC (including libraries),
- Android 4.4.1 only,
- No apps that require Google Play, must run on stock AOSP,
- Limited use of reflection,
- No dynamic code loading,
- Use a limited set of Android API classes, and
- No native code.
This set of applications is important to DARPA because they can place restrictions on their third-party application developers, and for these apps, DARPA wanted an advanced automated analysis system.
The DroidSafe analysis is a product of these app restrictions. DroidSafe was designed to scale for small / medium-sized (complete) programs and really push accuracy and precision.
Please do not expect DroidSafe to complete, or to give accurate results, on large apps from the Google Play store. It was not designed for this, and no static, whole-program information-flow analysis can achieve all the goals of scalability, precision, accuracy, and comprehensiveness at this time.
The DroidSafe system includes:
- Comprehensive, accurate, and precise Android runtime semantics model. The model was seeded with the Java code from the Android Open Source Project’s (AOSP) implementation of Android 4.4.1. The DroidSafe team then automatically and manually added semantics to this model to account for native code semantics and runtime code semantics not included in the AOSP Java code. The model includes a manually-verified core that accounts for over 98% of API calls in Android applications. The model provides a single language solution for Android static analysis.
- A comprehensive set of sensitive source method calls defined on the Android API version 4.4.1.
- A comprehensive set of sink method calls that can exfiltrate information beyond application boundaries defined on Android API version 4.4.1.
- Scalable and precise global static analysis optimized for the information flow problem on Android. This includes a deeply object-sensitive global points-to analysis with a custom solver, and a global call-site sensitive, object-sensitive, field-sensitive, and flow-insensitive taint analysis.
- A plugin for the Eclipse IDE designed to help a trusted human analyst rapidly triage an unknown Android application. The plugin, called the DroidSafe Navigator, presents our information-flow analysis and points-to analysis results overlaid on the source code for an application. The DroidSafe Navigator also includes features to guide an analyst to sensitive portions of an application based on API usage and implementation idioms.
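To give a feel for what "flow-insensitive" means in the taint analysis listed above, here is a minimal sketch in Java. It is not DroidSafe's implementation: the tiny three-statement language, the class `FlowInsensitiveTaintSketch`, and every method name are hypothetical and exist only for illustration. The sketch also ignores object, field, and call-site sensitivity entirely. It treats the program as an unordered set of statements and propagates taint to a fixed point.

```java
import java.util.List;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Toy flow-insensitive taint propagation. All names are hypothetical and
// are NOT part of the DroidSafe code base.
final class FlowInsensitiveTaintSketch {

    // One statement of a tiny language: "v = source()", "dst = src", "sink(arg)".
    static final class Stmt {
        enum Kind { SOURCE, ASSIGN, SINK }

        final Kind kind;
        final String target;  // variable written (SOURCE, ASSIGN) or sink name (SINK)
        final String operand; // variable read (ASSIGN, SINK); null for SOURCE

        private Stmt(Kind kind, String target, String operand) {
            this.kind = kind;
            this.target = target;
            this.operand = operand;
        }

        static Stmt source(String var) { return new Stmt(Kind.SOURCE, var, null); }
        static Stmt assign(String dst, String src) { return new Stmt(Kind.ASSIGN, dst, src); }
        static Stmt sink(String sinkName, String arg) { return new Stmt(Kind.SINK, sinkName, arg); }
    }

    // Returns the names of sinks that may receive tainted data.
    // Statement order is ignored: we iterate until no new variable is tainted.
    static Set<String> taintedSinks(List<Stmt> program) {
        Set<String> tainted = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Stmt s : program) {
                if (s.kind == Stmt.Kind.SOURCE) {
                    changed |= tainted.add(s.target);
                } else if (s.kind == Stmt.Kind.ASSIGN && tainted.contains(s.operand)) {
                    changed |= tainted.add(s.target);
                }
            }
        }
        Set<String> hits = new TreeSet<>();
        for (Stmt s : program) {
            if (s.kind == Stmt.Kind.SINK && tainted.contains(s.operand)) {
                hits.add(s.target);
            }
        }
        return hits;
    }
}
```

Because order is ignored, a program listed as `sink("network", x); x = y; y = source()` still reports the `network` sink. This over-approximation is the usual price of flow insensitivity: a flow is never missed because of statement ordering, but some reported flows may not be feasible at run time.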
Our recent publication below demonstrates that the DroidSafe information-flow analysis system achieves unprecedented precision and accuracy for Android information-flow analysis (as measured on a standard, previously published set of benchmark applications). Furthermore, DroidSafe detects all malicious information-flow leaks inserted into 24 real-world Android applications by three independent, hostile Red-Team organizations. The previous state-of-the-art analysis, in contrast, detects fewer than 10% of these malicious flows.
- Information Flow Analysis of Android Applications in DroidSafe. Michael I. Gordon, Deokhwan Kim, Jeff Perkins, Limei Gilham, Nguyen Nguyen, and Martin Rinard. NDSS 2015, San Diego, CA, February, 2015.
We include a detailed introduction to running our analysis and inspecting an application in our Eclipse plugin here. We recommend running DroidSafe on a machine with at least 64 GB of memory.
DroidSafe Analysis Source
DroidSafe is built on top of the Soot Java program analysis framework. We have made extensive modifications to Soot, and include those modifications as a jar in our GitHub repository. We will soon make available the source code for our Soot modifications. The modifications include a new object-sensitive points-to analysis built on top of Soot’s SPARK framework.
DroidSafe also incorporates the Java String Analyzer to resolve strings in Android programs. We have made modifications to JSA for DroidSafe and include a jar file in our repository.
If you would like to inspect or extend the DroidSafe analysis, the root of the source code is here. We are in the process of improving the code documentation.
DroidSafe Android Semantic Model (Android Device Implementation)
Our source repository includes the Java source code for our semantic model of the Android API and runtime. The model is appropriate for flow-insensitive analyses of Android applications focused on data-flow and allocation effects in the API / Runtime. All semantics are represented in Java. However, there are precision / accuracy increasing transformations that run as part of our analysis. The Java implementation includes annotations on methods and fields that denote sensitive sources and sinks. The source code is rooted here and follows the package structure of the Android API.
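As a rough illustration of how annotations can mark sources and sinks in a Java model, consider the sketch below. The annotation types `SensitiveSource` and `SensitiveSink`, the class `ModeledLocationApi`, and the `kind` strings are all hypothetical. They are not the annotation names used in the DroidSafe repository, and the reflective scan is only a convenient way to show that such markers are machine-readable. A static analysis would read them from source or bytecode instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker for API members that produce sensitive data.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.FIELD})
@interface SensitiveSource { String kind(); }

// Hypothetical marker for API members that can move data out of the app.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.FIELD})
@interface SensitiveSink { String kind(); }

// Hypothetical stand-in for a modeled API class; the bodies are stubs.
final class ModeledLocationApi {
    @SensitiveSource(kind = "LOCATION")
    static double getLatitude() { return 0.0; }

    @SensitiveSink(kind = "NETWORK")
    static void send(String payload) { }

    // Unannotated members are neither sources nor sinks.
    static int helper() { return 0; }
}

final class AnnotationScanSketch {
    // Lists annotated methods as "source:KIND:name" or "sink:KIND:name".
    // The result is sorted because getDeclaredMethods() has no fixed order.
    static List<String> describe(Class<?> model) {
        List<String> out = new ArrayList<>();
        for (Method m : model.getDeclaredMethods()) {
            SensitiveSource src = m.getAnnotation(SensitiveSource.class);
            SensitiveSink snk = m.getAnnotation(SensitiveSink.class);
            if (src != null) out.add("source:" + src.kind() + ":" + m.getName());
            if (snk != null) out.add("sink:" + snk.kind() + ":" + m.getName());
        }
        Collections.sort(out);
        return out;
    }
}
```

Calling `AnnotationScanSketch.describe(ModeledLocationApi.class)` lists `send` as a sink and `getLatitude` as a source, and skips `helper`. Keeping the markers next to the modeled method, rather than in a separate list, means the source and sink sets stay attached to the code they describe.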
The DroidSafe team contributed 40 micro-applications to the DroidBench Android Information Flow benchmark suite. The malicious Android applications from APAC cannot be released at this time due to our contract with DARPA.
Authors and Contributors
@mgordon is the DroidSafe project leader at MIT.
Support or Contact