MtdScout: Complementing the Identification of Insecure Methods in Android Apps via Source-to-Bytecode Signature Generation and Tree-based Layered Search

Abstract

Modern Android apps consist of both host app code and third-party libraries. Traditional static analysis tools conduct taint analysis for API misuses on the entire app code, while third-party library (TPL) detection tools focus solely on library code. Both approaches, however, are prone to inherent false negatives: taint analysis tools may neglect third-party libraries or hit timeouts and errors during whole-app analysis, while TPL detection tools are not designed to pinpoint specific vulnerable methods. These challenges underscore the need for enhanced identification of insecure methods in Android apps, particularly for app markets addressing open-source security incidents.

In this paper, we aim to recover the false negatives of both TPL detection and taint analysis by directly identifying clones of insecure methods, regardless of whether they reside in the host app code or in a shrunk library. We propose MtdScout, a novel cross-layer, method-level clone detection tool for Android apps. MtdScout generates bytecode signatures for flawed source methods using compiler-style interpretation and abstraction, and efficiently matches them against target app bytecode using signature-mapped search trees. Our experiment on ground-truth apps shows that MtdScout achieves the highest accuracy among three tested clone detection tools, with a precision of 92.5% and a recall of 87.2%. A large-scale experiment with 23.9K apps from Google Play demonstrates MtdScout's effectiveness in complementing both LibScout and CryptoGuard: it identifies numerous false negatives that they miss due to app shrinking, method-only cloning, and the inherent timeouts and failures of expensive taint analysis. Additionally, our experiment uncovers four security findings that highlight the disparities between MtdScout's method-level clone detection and package-level library detection.
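
To make the idea of matching signatures through a layered search tree more concrete, the following is a minimal toy sketch. It is not MtdScout's implementation: the three signature layers (method descriptor, set of invoked APIs, abstracted opcode sequence), the names make_signature and LayeredSearchTree, and the exact-match policy are all assumptions made purely for illustration. The example signature models a HostnameVerifier.verify method that always returns true, a commonly cited insecure pattern.

```python
# Toy illustration only: the layers and names below are hypothetical and are
# not taken from the MtdScout paper.

def make_signature(descriptor, api_calls, opcodes):
    """Build a layered signature: coarse layers first, fine layers last."""
    return (
        descriptor,                      # layer 1: parameter/return types
        tuple(sorted(set(api_calls))),   # layer 2: order-insensitive API set
        tuple(opcodes),                  # layer 3: abstracted opcode sequence
    )


class Node:
    """One tree node; children are keyed by the next layer's value."""

    def __init__(self):
        self.children = {}
        self.labels = []


class LayeredSearchTree:
    """Maps layered signatures of known insecure methods to labels."""

    def __init__(self):
        self.root = Node()

    def add(self, signature, label):
        node = self.root
        for layer in signature:
            node = node.children.setdefault(layer, Node())
        node.labels.append(label)

    def match(self, signature):
        """Descend layer by layer; stop as soon as one layer has no match."""
        node = self.root
        for layer in signature:
            node = node.children.get(layer)
            if node is None:
                return []
        return list(node.labels)


# Descriptor of HostnameVerifier.verify(String, SSLSession) returning boolean.
DESC = "(Ljava/lang/String;Ljavax/net/ssl/SSLSession;)Z"

tree = LayeredSearchTree()
tree.add(make_signature(DESC, [], ["const", "return"]),
         "always-true HostnameVerifier.verify")

# A target method with the same layers is reported as a clone.
hit = tree.match(make_signature(DESC, [], ["const", "return"]))

# A target method that calls an API differs at layer 2 and is pruned early.
miss = tree.match(make_signature(
    DESC, ["Ljavax/net/ssl/SSLSession;->getPeerHost()"], ["invoke", "return"]))

print(hit, miss)
```

Ordering the layers from coarse to fine means that most target methods are rejected after one or two dictionary lookups, which is the general motivation for a layered search; the actual signature contents and matching rules used by MtdScout are described in the paper.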

Bibtex

@inproceedings{zhang2024mtdscout,
    author = {Zhang, Zicheng and Ma, Haoyu and Wu, Daoyuan and Gao, Debin and Yi, Xiao and Chen, Yufan and Wu, Yan and Jiang, Lingxiao},
    title = {{MtdScout}: Complementing the Identification of Insecure Methods in Android Apps via Source-to-Bytecode Signature Generation and Tree-based Layered Search},
    year = {2024},
    address = {Vienna, Austria},
    booktitle = {Proceedings of the 9th IEEE European Symposium on Security and Privacy},
    series = {EuroS\&P '24}
}

Related Works

  • BlockScope: Detecting and Investigating Propagated Vulnerabilities in Forked Blockchain Projects
  • When Program Analysis Meets Bytecode Search: Targeted and Efficient Inter-procedural Analysis of Modern Android Apps in BackDroid
  • An Empirical Study of Potentially Malicious Third-Party Libraries in Android Apps