Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale

Our malware detection platform, MassVet, detected over 100,000 potentially harmful apps (PHA). Among them, 10,000 PHAs are missed by VirusTotal. MassVet can also detect app piracy in a large scale, over millions of apps, in seconds. For more details, please refer to our paper. Or you can use MassVet to analyze Android apps at Android Malware Detection Platform.


Media Coverage (incomplete list)


An app market’s vetting process is expected to be scalable and effective. However, today’s vetting mechanisms are slow and less capable of catching new threats. In our research, we found that a more powerful solution can be found by exploiting the way Android malware is constructed and disseminated, which is typically through repackaging legitimate apps with similar malicious components. As a result, such attack payloads often stand out from those of the same repackaging origin and also show up in the apps not supposed to relate to each other.

Based upon this observation, we developed a new technique, called MassVet, for vetting apps at a massive scale, without knowing what malware looks like and how it behaves. Unlike existing detection mechanisms, which often utilize heavyweight program analysis techniques, our approach simply compares a submitted app with all those already on a market, focusing on the difference between those sharing a similar UI structure (indicating a possible repackaging relation), and the commonality among those seemingly unrelated. Once public libraries and other legitimate code reuse are removed, such diff/common program components become highly suspicious. In our research, we built this “DiffCom” analysis on top of an efficient similarity comparison algorithm, which maps the salient features of an app’s UI structure or a method’s control-flow graph to a value for a fast comparison. We implemented MassVet over a stream processing engine and evaluated it over 1.2 million apps from 33 app markets around the world, the scale of Google Play. Our study shows that the technique can vet an app within 10 seconds at a low false detection rate. Also, it outperformed all 54 scanners in VirusTotal (NOD32, Symantec, McAfee, etc.) in terms of detection coverage, capturing over a hundred thousand malicious apps, including over 20 likely zero-day malware and those installed millions of times. A close look at these apps brings to light intriguing new observations: e.g., Google’s detection strategy and malware authors’ countermoves that cause the mysterious disappearance and reappearance of some Google Play apps.