More Thoughts on CPU backdoors
Let me stress that what Loic writes about in the paper are only hypothetical backdoors, i.e. no actual backdoors have been found on any real CPU (ever, AFAIK!). What he does is he considers how Intel or AMD could implement a backdoor, and then he simulate this process by using QEMU and implementing those backdoors inside QEMU.
Loic also focuses on local privilege escalation backdoors only. You should however not underestimate a good local privilege escalation — such things could be used to break out of any virtual machine, like VMWare, or potentially even out of a software VMs like e.g. Java VM.
The backdoors Loic considers are somewhat similar in principle to the simple pseudo-code one-liner backdoor I used in my previous post about hardware backdoors, only more complicated in the actual implementation, as he took care about a few important details, that I naturally didn't concern. (BTW, the main message of my previous post about was how cool technology this VT-d is, being able to prevent PCI-based backdoors, and not about how doomed we are because of Intel- or AMD-induced potential backdoors).
Some people believe that processor backdoors do not exist in reality, because if they did, the competing CPU makers would be able to find them in each others' products, and later would likely cause a "leak" to the public about such backdoors (think: Black PR). Here people make an assumption that AMD or Intel is technically capable of reversing each others processors, which seems to be a natural consequence of them being able to produce them.
I don't think I fully agree with such an assumption though. Just the fact that you are capable of designing and producing a CPU, doesn't mean you can also reverse engineer it. Just the fact that Adobe can write a few hundred megabyte application, doesn't mean they are automatically capable of also reverse engineering similar applications of that size. Even if we assumed that it is technically feasible to use some electron microscope to scan and map all the electronic elements from the processor, there is still a problem of interpreting of how all those hundreds of millions of transistors actually work.
Anyway, a few more thoughts about properties of a hypothetical backdoors that Intel or AMD might use (be using).
First, I think that in such a backdoor scenario everything besides the "trigger" would be encrypted. The trigger is something that you must execute first, in order to activate the backdoor (e.g. the CMP instruction with particular, i.e. magic, values of some registers, say EAX, EBX, ECX, EDX). Only then the backdoor gets activated and e.g. the processor auto-magically escalates into Ring 0. Loic considers this in more detail in his paper. So, my point is that all the attacker's code that executes afterwards, think of it as of a shellcode for the backdoor, that is specific for the OS, is fetched by the processor in an encrypted form and decrypted only internally inside the CPU. That should be trivial to implement, while at the same time should complicate any potential forensic analysis afterwards — it would be highly non-trivial to understand what the backdoor actually have done.
Another crucial thing for a processor backdoor, I think, should be some sort of an anti-reply attack protection. Normally, if a smart admin had been recording all the network traffic, and also all the executables that ever got executed on the host, chances are that he or she would catch the triggering code and the shellcode (which might be encrypted, but still). So, no matter how subtle the trigger is, it is still quite possible that a curious admin will eventually find out that some tetris.exe somehow managed to breakout of a hardware VM and did something strange, e.g. installed a rootkit in a hypervisor (or some Java code somehow was able to send over all our DOCX files from our home directory).
Eventually the curious admin will find out that strange CPU instruction (the trigger) after which all the strange things had happened. Now, if the admin was able to take this code and replicate it, post it to Daily Dave, then, assuming his message would pass through the Moderator (Hi Dave), he would effectively compromise the processor vendor's reputation.
An anti-replay mechanism could ideally be some sort of a challenge-response protocol used in a trigger. So, instead having you always to put 0xdeadbeaf, 0xbabecafe, and 0x41414141 into EAX, EBX and EDX and execute some magic instruction (say CMP), you would have to put a magic that is a result of some crypto operation, taking current date and magic key as input:
Magic = MAGIC (Date, IntelSecretKey).
The obvious problem is how the processor can obtain current date? It would have to talk to the south-bridge at best, which is 1) nontrivial, and 2) observable on a bus, and 3) spoof'able.
A much better idea would be to equip a processor with some sort of an eeprom memory, say big enough to hold one 64-bit or maybe 128-bit value. Each processor would get a different value flashed there when leaving the factory. Now, in order to trigger the backdoor, the processor vendor (or backdoor operator, think: NSA) would have to do the following:
1) First execute some code that would read this unique value stored in eeprom for the particular target processor, and send this back to them,
2) Now, they could generate the actual magic for the trigger:
Magic = MAGIC (UniqeValueInEeprom, IntelSecretKey)
3) ...and send the actual code to execute the backdoor and shellcode, with the correct trigger embedded, based on the magic value.
Now, the point is that the processor will automatically increment the unique number stored in the eeprom, so the same backdoor-exploiting code would not work twice for the same processor (while at the same time it would be easy for NSA to send another exploit, as they know what the next value in the eeprom should be). Also, such a customized exploit would not work on any other CPU, as the assumption was that each CPU gets a different value at the factory, so again it would not be possible to replicate the attack and proved that the particular code has ever done something wrong.
So, the moment I learn that processors have built-in eeprom memory, I will start thinking seriously there are backdoors out there :)
One thing that bothers me with all those divagations about hypothetical backdoors in processors is that I find them pretty useless in at the end of the day. After all, by talking about those backdoors, and how they might be created, we do not make it any easier to protect against them, as there simply is no possible defense here. Also this doesn't make it any easier for us to build such backdoors (if we wanted to become the bad guys for a change). It might only be of an interest to Intel or AMD, or whatever else processor maker, but I somewhat feel they have already spent much more time thinking about it, and chances are they probably can only laugh at what we are saying here, seeing how unsophisticated our proposed backdoors are. So, my Dear Reader, I think you've been just wasting time reading this post ;) Sorry for tricking you into this and I hope to write something more practical next time :)