Today I got an advertising email from GlobalSign (where I bought a code signing certificate for Vista kernel drivers some years ago) highlighting their new (?) type of certificate for signing Adobe PDF files. It made me curious because, frankly, I have been missing this feature more and more recently. After some quick online research it turned out that this whole Adobe Certified Document Services (CDS) thing seems to be nothing new, as apparently even Adobe Reader 6.0 had support for verifying those CDS certificates. The certificates are also available from other popular certification authorities, e.g. Entrust and Verisign, and a couple of others.

So, I immediately felt stupid that I hadn't been aware of such a great feature, which apparently has been out there for a few years now. Why did I think it was such a great feature? Consider the following scenario…

On our Invisible Things Lab resources page we offer a handful of files to download: slides and some proof-of-concept code. The website is served over plaintext HTTP. This means that if you're downloading anything over public WiFi (hotel, airport lounge, etc.) you never know whether the PDF you actually get has been infected somewhere in the middle, e.g. by a guy in the lobby who is messing with the hotel WiFi.

So, one might argue that I should pay a few hundred bucks for an SSL certificate for my website and start serving it over HTTPS. But here's the problem: I, like zillions of other small businesses and individuals, host my website on some 5-dollar-a-month one-of-the-thousands hosting provider. I have zero knowledge about who works there and whether they can be trusted, and I also know nothing about (and have zero influence on) how secure (or not, for that matter) the server is. (The same applies to my cell phone carrier, ISP, etc., BTW.)

Now, the SSL certificate for the website "knows" nothing about what the files on my website should look like, in particular whether they are compromised or not. All the SSL certificate does is assure the remote client that he or she downloaded the actual files that were on the server at the moment of downloading, whether they were the original ones authored by me, or perhaps maliciously modified by somebody who got access to the server.

So, the solution with an SSL certificate would work only if I trusted my web server, which I could assume only if I ran my own dedicated server. That, however, would be overkill for a small company like ITL, especially since our business is not based on our web presence. In fact, the website is maintained mainly for other researchers and students, who can easily download our papers and code from there, and also for reporters, so they can e.g. download a press release from there.

Surprisingly, as far as I can tell the website has never been compromised, probably because it doesn't present an interesting target for any skilled person (or maybe exceptionally skilled people work at the hosting provider?). But I cannot know for sure, as I don't constantly monitor the hashes of all the files, as this would require… well, a dedicated server running a SHA-1 calculating script in a loop 24/7 :)
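For illustration, a single pass of such a monitoring script could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: the manifest format (a dict mapping relative paths to known-good SHA-1 hex digests) and the function names `sha1_of_file` and `find_modified` are made up for this example and are not something the ITL site actually runs.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_modified(root, manifest):
    """Check files under `root` against a {relative_path: sha1_hex} manifest.

    The manifest layout is an assumption made for this sketch.
    Returns a sorted list of relative paths that are missing or whose
    current digest differs from the recorded one.
    """
    root = Path(root)
    changed = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha1_of_file(target) != expected:
            changed.append(relative_path)
    return sorted(changed)
```

Running `find_modified` in a loop only helps if the machine doing the checking, and the place where the manifest is stored, are themselves trusted, which is exactly the dedicated-server requirement mentioned above.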

Of course, zillions of other websites work this very same way and present the very same problems.

Now, the ability to sign PDFs would be a great solution here, because I could sign all those files with my certificate, and then all the people downloading stuff from ITL could know they are getting the original PDFs that were created on one of the ITL members' desktop computers, no matter how compromised the web server or the network connection is.

For the same reasons, I would welcome it if others started doing the same, as currently I simply must assume every PDF I download from the net (and PDFs account for the majority of my file downloads) to be potentially malicious. So, I always open them in my Red or Yellow VM (depending on the source of the download), and only if a file "looks good" (very fuzzy term, I know) might I decide to move it to my host desktop (it's easier to work with PDFs on your host, and you should actually use your host desktop for something).

(Yes, I know, Kostya Kortchinsky, or Rafal, can sometimes escape from VMware, but I still believe that today the best isolation I can get on a desktop, without sacrificing much convenience, is via a type II hypervisor. It's horribly inelegant, but well, that's life.)

So, I read some more about this Adobe CDS, being all excited about it, and ready to spend a few hundred euros on a certificate, only to realize that it doesn't look as good as I thought.

The first disappointment comes from the fact that you must create the PDF using Adobe Acrobat software (not the Reader, but the commercial one). I've created all my PDFs using either Office (in the past) or iWork (today), and neither of them seems to offer a way to digitally sign the PDF. I would like to have a simple tool, say pdfsign.exe, that I could use to sign any PDF I have, no matter how I generated it. Also, not surprisingly, the Mac native PDF viewer (Preview) doesn't seem to recognize the digital signature, and I bet some Linux PDF viewers do not either.

Worst of all, even Acrobat Reader 9, which I tested under Windows and which correctly displayed all the CDS information, does one unbelievably stupid thing: it parses and renders the whole PDF before displaying the signature info. So, if you downloaded a malicious PDF, Acrobat Reader will happily open and parse it, without asking whether you would like to open it (as it is perhaps unsigned). At least I was unable to find an option that would force it to do that. So, if this PDF contained an exploit for the reader, it would surely get executed. Compare this with the (correct) behavior of Vista UAC, which presents the executable's signature details before executing it.

You can see how your software works with Adobe PDF signatures, e.g. by looking at this sample file signed by GlobalSign.

So, Adobe CDS certificates, in the form they take today, seem to be pretty useless as far as protection from potentially malicious PDFs is concerned (they surely have other positive applications, e.g. certifying the authenticity of a diploma).

But wouldn't it be great to have such a file signing mechanism globally adopted, and not only for PDFs but for any sort of file, including ZIPs, tgz's, heck, even plain text files? And to have our main OSes generically recognize those signatures and display unified prompts asking whether we want to allow an application to open the file or not? Perhaps, in some situations, we could even define policies for specific applications. This seems easy to do from the technical point of view: we just need to "hook" (oh, God, did I say "hook"?) high-level OS APIs like open() or CreateFile().
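To make the idea a bit more concrete, here is a toy, user-space sketch in Python of what such a gate in front of open() might look like. It is not how any real OS does this. The names `guarded_open`, `verify_signature`, `ask_user` and `UnverifiedFileError` are hypothetical, and the actual PKI signature check is deliberately left as a pluggable callback rather than implemented. The only point it shows is the ordering: the signature is checked and the decision is made before the application gets a handle to the file contents.

```python
import builtins


class UnverifiedFileError(Exception):
    """Raised when the policy or the user refuses to open a file."""


def guarded_open(path, mode="rb", *, verify_signature, ask_user):
    """Open `path` only after a signature check and an explicit decision.

    Both callbacks are hypothetical hooks for this sketch:
      verify_signature(path) -> signer name (str), or None if the file is
                                unsigned or the signature is invalid
      ask_user(path, signer) -> True to allow opening, False to refuse
                                (could be a prompt or a per-application policy)
    """
    # Check the signature first; nothing has parsed the file contents yet.
    signer = verify_signature(path)
    # Let the prompt or policy decide, based on who (if anyone) signed it.
    if not ask_user(path, signer):
        raise UnverifiedFileError(f"refused to open {path!r} (signer: {signer!r})")
    # Only now hand a real file object to the caller.
    return builtins.open(path, mode)
```

A policy for a specific application could then be as simple as passing `ask_user=lambda path, signer: signer == "ITL"` (again, just an example value), while an interactive version would show the signer details and wait for a yes or no.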

What about PGP and the possibility of using it to sign any sort of file? Well, we use PGP a lot at ITL, but mainly for securing peer-to-peer communication (e.g. between us and our clients). There really is no good way to publish one's PGP key: the concept of the Web of Trust might be good for some closed groups of people, but not for publishing files "to the world". And, of course, the first thing that an attacker who subverted PDFs on our website would do is to also subvert the PGP key displayed on the website. I also once tried to publish a PGP key to a key server, but got discouraged immediately after I noticed it didn't use SSL for submission. BTW, does anybody know if the key servers use SSL today? If not, how is trust established? Maybe email clients, e.g. Thunderbird, come with built-in PGP keys for select key servers?

So, I guess that was the main point of writing this post — to express how madly I would welcome a generic, OS-based, non-obligatory, signature verification for files, based on PKI :)

Ah, before a dozen people jump to the comment box to tell me that digital signatures do not assure non-maliciousness of anything: please don't, because I actually know that. In fact, it is not possible to assure non-maliciousness of pretty much anything, especially without first strictly defining the ethical system we would like to use. What signatures provide is liability, so that I know whom to sue in case my naked holiday pictures get leaked to the public because of some malicious PDF exploiting my system. In that case I can sue either the actual person who signed the PDF (if this person is identifiable) or the certification authority that issued the certificate to a wrong (unidentifiable) person.