Virtualization Detection vs. Blue Pill Detection
The message of the first part was that we don’t believe it’s possible to implement effective kernel protection on any general purpose OS based on a monolithic kernel design.
The second part, the one about virtualization, had several messages...
- The main point was that detecting virtualization is not the same as detecting virtualization-based malware. As hardware virtualization technology gets more and more widespread, many machines will be running with virtualization mode enabled, whether blue pilled or not. In that case blue pill-like malware doesn’t need to pretend that virtualization is not enabled, as it’s actually expected that virtualization is being used for some legitimate purposes. In that case using a "blue pill detector" that is in fact just a generic virtualization detector is completely pointless.
Obviously in such scenarios blue pill-like malware must support nested hypervisors. And this is what we have implemented in our New Blue Pill. We can run tens of blue pills inside each other and they all work! You can try it yourself, but you should disable the COM port debug output to run more than twenty nested pills. We still fail at running Virtual PC 2007 as a nested hypervisor (when its guest switches to protected mode), but we hope to have this fixed in the coming weeks (please note that VPC’s hypervisor doesn’t block blue pill from loading – see our slides for more info).
In other words, if somebody announces to the world that they can fight virtualization-based malware using generic virtualization detectors, it’s as if they said that they can detect e.g. a botnet agent just by detecting that an executable is using networking!
- We have also decided to discuss how blue pill could potentially cheat those generic virtualization detectors, even though we don’t believe it would be necessary in the coming years, as everything will be virtualized anyway (see previous point). But we still decided to look into some of the SVM detection methods. First, we found out that many methods that people described as a way to detect virtualization do not work in the simple form in which they were described. We took a closer look e.g. at the TLB profiling methods that were suggested by several researchers as a reliable method for virtualization detection. However, all the papers describing this method missed the fact that some of the caches are not fully associative, so one needs to make a special effort (which means additional complexity) to e.g. fill the whole L2 TLB. Obviously we provided all the necessary details of how to write those detectors properly (we even posted one such detector).
In other words, we believe that it will always be possible to detect virtualization mode using various tricks and hacks, but: 1) those hacks could be forced to be very complex and 2) in case virtualization is being used on the target computer for some legitimate purposes, all those methods fail anyway (see point 1).
- Some people might argue that maybe then we should build these virtualization detectors into all the legitimate hypervisors (e.g. the Virtual PC hypervisor), so that they at least know whether they are being run on a native machine or inside a blue pill. However, this approach contradicts the rules we use to build secure and effective hypervisors. These rules say that hypervisors should be as small as possible and that no 3rd party code should be allowed there.
Now imagine that an A/V company tries to insert their virtualization detectors (which BTW would have to be updated from time to time to support e.g. new processor models) into hypervisors – if that ever happened, it would be a failure of our industry. We need other ways to address this threat, based on documented, robust and simple mechanisms. Security should not be built on bugs, hacks and tricks!
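To make the TLB associativity point from the second item more concrete, here is a minimal sketch of why a profiler that assumes a fully associative TLB can fail to fill it. Everything in it is an assumption made for illustration: the TLB geometry (512 entries, 4 ways), the 4 KiB page size, and the indexing model (set index taken from the low bits of the virtual page number) are hypothetical, and real processors may index differently. It only models which pages can be resident at once; it does not measure any timing and is not the detector mentioned above.

```python
from collections import Counter

# Hypothetical geometry, chosen only for illustration; real values vary per CPU.
PAGE_SIZE = 4096      # assumed 4 KiB pages
TLB_ENTRIES = 512     # assumed number of L2 TLB entries
TLB_WAYS = 4          # assumed associativity
NUM_SETS = TLB_ENTRIES // TLB_WAYS


def tlb_set_index(vaddr):
    """Set a page maps to, assuming the index is the low bits of the page number."""
    return (vaddr // PAGE_SIZE) % NUM_SETS


def resident_entries(addresses):
    """Upper bound on how many of the touched pages can be cached at the same time."""
    pages = {a // PAGE_SIZE for a in addresses}          # one entry per distinct page
    per_set = Counter(p % NUM_SETS for p in pages)       # pages competing for each set
    return sum(min(count, TLB_WAYS) for count in per_set.values())


def strided_pages(base, stride_pages, count=TLB_ENTRIES):
    """Addresses of `count` pages, `stride_pages` pages apart, starting at `base`."""
    return [base + i * stride_pages * PAGE_SIZE for i in range(count)]


base = 0x10000000
# Touching TLB_ENTRIES pages is not enough if they all compete for one set.
same_set = strided_pages(base, stride_pages=NUM_SETS)
# Consecutive pages spread evenly over all sets under this indexing model.
all_sets = strided_pages(base, stride_pages=1)

print(resident_entries(same_set))  # 4
print(resident_entries(all_sets))  # 512
```

Under this model both access patterns touch 512 distinct pages, but only the second one can actually occupy the whole buffer. A profiler has to know the real geometry and indexing to pick its pages, which is the additional complexity referred to above.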
We posted the full source code of our New Blue Pill here. We believe that it will help other researchers to analyze this threat and hopefully we will find a good solution soon, before this ever becomes widespread.
Happy bluepilling!
On a side note: now I can also explain (if this is not clear already) how we were planning to beat our challengers. We would simply ask them to install Virtual Server 2005 R2 on all the test machines and we would install our New Blue Pill on just a few of them. Then their wonderful detectors would simply detect that all the machines have SVM mode enabled, but that would be completely useless information. Yes, we still believe we would need a couple of months to get our proof-of-concept to the level where we would be confident that we would win anyway (e.g. if they used memory scanning for some “signature”).
BTW, you might be wondering why I introduced the “no CPU peak for more than 1s” requirement? I will leave finding an answer as an exercise in psychology to my dear readers ;)
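A toy model of this scenario shows why the detector output carries no information. The machine count, the number of infected machines, and the `generic_virtualization_detector` function are all made up for illustration; the detector is reduced to the single property that matters here, namely that it reports whether SVM mode is in use.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    svm_enabled: bool   # SVM in use, by a legitimate hypervisor or by a blue pill
    blue_pilled: bool   # ground truth the detector is supposed to find


def generic_virtualization_detector(machine):
    """Hypothetical detector: it can only tell that SVM mode is in use."""
    return machine.svm_enabled


# Ten test machines, all running a legitimate hypervisor; two are also blue pilled.
machines = [
    Machine(f"test-{i}", svm_enabled=True, blue_pilled=(i < 2))
    for i in range(10)
]

flagged = [m for m in machines if generic_virtualization_detector(m)]
infected = [m for m in machines if m.blue_pilled]

print(len(flagged), len(infected))  # 10 2
```

Every machine is flagged, so the result is the same for infected and clean machines and cannot be used to tell them apart.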