Experimenting with WebP

A few years ago, Google put out the WebP image format. I won’t dive into the merits of WebP; Google does a good job of that.

For now, I wanted to focus on how I could support it for my website. The thinking is that if I’m happy with the results here, I can use it in other, more useful ways. The catch with WebP is that it isn’t supported by all browsers, so a flat “convert all images to WebP” approach wasn’t going to work.

Enter the Accept request header. When a browser makes a request, it includes this header to tell the server what content types it can handle and which it prefers. Chrome’s Accept header currently looks like this:

text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Chrome explicitly indicates that it is willing to process WebP. We can use this to conditionally rewrite what file is returned by the server.

The plan was to process all image uploads and append “.webp” to the file name, so foo.png becomes foo.png.webp. We’ll see why in a bit. The other constraint is that I didn’t want to do this for all images: images that are part of WordPress itself, such as themes, will be left alone for now.

Processing the images was pretty straightforward. I installed the webp package then processed all of the images in my upload directory. For now we’ll focus on just PNG files, but adapting this to JPEGs is easy.

find . -name '*.png' | (while read -r file; do cwebp -lossless "$file" -o "$file.webp"; done)

Note: This is still a bit of a tacky way to do this. Quoting $file and using read -r handles spaces and backslashes in paths, but filenames with embedded newlines would still break the loop; that wasn’t something I had to worry about.
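Adapting it to JPEGs is the same loop with lossy settings; the quality value here is an arbitrary pick, so tune it to taste:

find . \( -name '*.jpg' -o -name '*.jpeg' \) | (while read -r file; do cwebp -q 80 "$file" -o "$file.webp"; done)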

This converts the existing images; with some WordPress magic, I configured cwebp to run whenever new image assets are uploaded.

Now that we have side-by-side WebP images, I configured NGINX to conditionally serve the WebP image if the browser supports it.

map $http_accept $webpext {
    default         "";
    "~*image/webp"  ".webp";
}
This goes in the http section of the NGINX configuration (map is only valid at that level). It defines a new variable called $webpext by examining the $http_accept variable, which NGINX populates from the request’s Accept header. If $http_accept contains “image/webp”, then $webpext is set to .webp; otherwise it is an empty string.

Later in the NGINX configuration, I added this:

location ~* \.(?:png|jpg|jpeg)$ {
    add_header Vary Accept;
    try_files $uri$webpext $uri =404;
    # rest omitted for brevity
}

NGINX’s try_files is clever. For PNG, JPG, and JPEG requests, it first tries the URI plus the $webpext variable, which is empty if the browser doesn’t support WebP and .webp if it does. If that file doesn’t exist, it moves on to the original, and if neither exists it returns a 404. So a WebP-capable browser asking for foo.png is served foo.png.webp whenever it exists. NGINX will automatically handle the Content-Type for you.

If you are using a CDN like CloudFront, you’ll want to configure it to vary its cache on the Accept header; otherwise, once the cache is primed by a WebP-capable browser, WebP images will be served to browsers that don’t support them.
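A quick way to sanity-check the whole setup is with curl; example.com/foo.png here is a placeholder for one of your own images:

curl -sI -H 'Accept: image/webp' https://example.com/foo.png | grep -i '^content-type'
curl -sI https://example.com/foo.png | grep -i '^content-type'

The first request should come back as image/webp, the second as image/png.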

So far, I’m pleased with the WebP results in lossless compression. The images are smaller in a non-trivial way. I ran all the images through pngcrush -brute and cwebp -lossless and compared the results. The average difference between the crushed PNG and the WebP was 15,872.77 bytes, with WebP being smaller. The maximum difference was 164,335 bytes, and the smallest was 1,363 bytes. Even the smallest difference was over a kilobyte. That doesn’t seem like much, but it’s a huge difference if you are trying to maximize the use of every byte of bandwidth. Since none of the values were negative, WebP outperformed pngcrush on all 79 images.

These figures are by no means conclusive; it’s a very small sample of data, but it’s very encouraging.

Site Changes

I’ve made a few changes to my site; eventually I’ll blog about them in detail.

First, I turned on support for HTTP/2. Second, I added support for the CHACHA20_POLY1305 cipher suite. Third, if your browser supports it, images will be served in the WebP format. Currently the only browser that does is Chrome.

My blog tends to be a vetting process for adopting things. If all of these things go well, then I can start recommending them in non-trivial projects.

Exchange Issues on El Capitan with Office 365

I just upgraded to El Capitan, and for the most part I haven’t run into any issues, except that my Exchange account now seemed wrong. It kept asking for my password, and the username had changed from kevin@thedomain.com to just “kevin”.

To fix this, I removed the Exchange account, intending to re-add it. However, it would not accept my username and password.

Unable to verify account name or password

The error was always “Unable to verify account name or password”, and I knew the password and username were correct.

However, completing the dialog with the Internal and External URLs manually worked. For Office 365, I set both URLs to “https://outlook.office365.com/EWS/Exchange.asmx”, used my full email address as the username, and filled in the rest; the account then set up correctly.

FiddlerCertGen, a certificate generator for Fiddler

Eric Lawrence wrote a pretty interesting post on how Fiddler intercepts HTTPS traffic. Essentially, it generates a root x509 certificate and asks the user to trust it, then generates end-entity certificates on-the-fly for each domain visited with the root as the signer. One bit caught my interest especially:

If you’re so inclined, you can even write your own certificate generator (say, by wrapping OpenSSL) and expose it to Fiddler using the ICertificateProvider3 interface.

In my ever-expanding interest in writing extensions for Fiddler, I decided this would be something fun to try, especially because I’ve written such code in the past. It should have been easy to take that code and fit it to the interface.

The result is a GitHub project called FiddlerCertGen. It provides a few advantages over the built-in certificate generators that ship with Fiddler.

For Windows Vista (and Server 2008) and later, it uses Microsoft’s CNG API (Cryptography API: Next Generation) instead of CAPI. CNG offers some benefits, such as support for elliptic curve cryptography. In fact, for Vista and later, the project will use ECDSA 256-bit keys. Key generation for those is faster than for RSA, so it may offer a slight performance bump over Fiddler having to generate RSA keys. For Windows XP and Windows Server 2003, it falls back to RSA 2048.

It’s fairly easy to change this, if you want. The static constructor for the FiddlerCertificate class initializes the configuration:

_algorithm = PlatformSupport.HasCngSupport ? Algorithm.ECDSA256 : Algorithm.RSA;
_keyProviderEngine = PlatformSupport.HasCngSupport ? KeyProviders.CNG : KeyProviders.CAPI;
_signatureAlgorithm = HashAlgorithm.SHA256;

You can use RSA keys with CNG – in fact, if you are on Windows Vista or greater, I’d recommend using CNG no matter what. What you cannot do is use ECDSA with CAPI. The signature algorithm is set to SHA256 by default; however, if for whatever reason you need to test something on Windows XP prior to Service Pack 3 (which is when SHA-2 support arrived), you can set it to SHA1. Likewise, you can set it to SHA384.
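For example, generating RSA keys through CNG and signing with SHA-384 is just a different assignment of the same three fields (a hypothetical configuration, not the project default):

// Hypothetical alternative: RSA keys through CNG, signatures with SHA-384.
_algorithm = Algorithm.RSA;
_keyProviderEngine = KeyProviders.CNG;
_signatureAlgorithm = HashAlgorithm.SHA384;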

The only compatibility issue to keep in mind is that some browsers do not support end-entity certificates with 384-bit ECC keys. Roots and intermediates are fine, but end-entity certificates with keys of that length don’t work. One browser with that behavior is Safari (both desktop and mobile).

The extension is broken down into two projects: a core .NET 2.0 project that does the bulk of the certificate generation, and a .NET 4.0 project that implements the Fiddler interface. If you are using Fiddler 2, you should be able to change the Framework Target of the .NET 4.0 project to 2.0 and it will work fine. Just remove the Fiddler reference and add one to the .NET 2.0 version of Fiddler.

The GitHub repository’s README contains installation instructions.

Examining the details of public key pinning

I’ve been working on adding some features around HTTP Public Key Pinning to my FiddlerCert Fiddler extension.

Specifically, I added indicators next to each certificate showing whether it is pinned by the Public-Key-Pins or Public-Key-Pins-Report-Only header, as well as a display of the SPKI hash.

Fiddler Cert Pinned SPKI

While implementing this, I learned quite a bit about what a pinned key is.

So, what is a pinned key? It’s called a public key pin, but the pin is more than a hash of just the public key. You could, for example, look at the public key for my domain:

04 35 D7 8F 8C 16 18 9D 1E 95 95 67 1C 39 D8 83
B3 32 1C 89 BA A3 56 78 8D C2 43 DB 20 4F 1D FA
80 93 6B 23 AF 1C 5A 59 F9 1B 74 A7 6F 62 38 97
A9 1B 29 2A 0F DA 40 B0 6F F9 6A 98 CE 45 48 48
2C

However, if we were to take these bytes and hash them, it would not produce the same hash that OpenSSL does when following the recommended guidelines for producing a Public-Key-Pins hash.

The hash needs to include a little more information. Hashing just the public key leaves out other components: which algorithm the key is for, such as RSA or ECDSA; the public exponent, for RSA; the identifier of the elliptic curve, for ECDSA; and any other algorithm parameters.

Why do we need these extra values? Adam Langley has the details on his blog:

Also, we’re hashing the SubjectPublicKeyInfo not the public key bit string. The SPKI includes the type of the public key and some parameters along with the public key itself. This is important because just hashing the public key leaves one open to misinterpretation attacks. Consider a Diffie-Hellman public key: if one only hashes the public key, not the full SPKI, then an attacker can use the same public key but make the client interpret it in a different group. Likewise one could force an RSA key to be interpreted as a DSA key etc.

One tricky thing with hashing is that the data format must always be consistent. Since a hash operates on pure data and knows nothing about the structure of the data, the structure must be consistent and platform independent. Should the public key exponent be included before or after the public key? What endianness is the data? How is the data consistently represented?

For all of the grief it gives people, that is exactly what ASN.1 encoding is for. What actually ends up getting hashed is the SubjectPublicKeyInfo (SPKI) portion of the X509 certificate, which includes all of the data we need. In OpenSSL’s output, it’s this part:

Subject Public Key Info:
    Public Key Algorithm: id-ecPublicKey
        Public-Key: (256 bit)
        ASN1 OID: prime256v1
        NIST CURVE: P-256

In raw ASN.1 form, it looks like this:

30 59 30 13 06 07 2a 86 48 ce 3d 02 01 06 08 2a
86 48 ce 3d 03 01 07 03 42 00 04 35 d7 8f 8c 16
18 9d 1e 95 95 67 1c 39 d8 83 b3 32 1c 89 ba a3
56 78 8d c2 43 db 20 4f 1d fa 80 93 6b 23 af 1c
5a 59 f9 1b 74 a7 6f 62 38 97 a9 1b 29 2a 0f da
40 b0 6f f9 6a 98 ce 45 48 48 2c

The run of bytes beginning 04 35 d7 should look familiar: it’s the public key. The rest of the bytes are part of the ASN.1 encoding.
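That DER-encoded SubjectPublicKeyInfo is what actually gets hashed. With OpenSSL, the whole pin computation can be done in one pipeline (cert.pem standing in for your own certificate):

openssl x509 -in cert.pem -pubkey -noout | openssl pkey -pubin -outform der | openssl dgst -sha256 -binary | openssl enc -base64

The first two commands extract the SPKI in DER form; the last two hash it and base64-encode the result.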

ASN.1 Primer

ASN.1 is simple in concept, but difficult to write code for in a secure manner. ASN.1 data consists of tags. Each tag has three parts: the tag identifier (the type of the tag), the length of the tag’s value, and the value itself. Tags can contain other tags. Take this example:

06 07 2a 86 48 ce 3d 02 01

This is an OBJECT_IDENTIFIER. 0x06 is the tag identifier for an OBJECT_IDENTIFIER. The next byte is the length, 7. The rest of the tag is the value, 2a 86 48 ce 3d 02 01, which is exactly 7 bytes long, just as the length byte said. The data in an OBJECT_IDENTIFIER is a variable-length quantity. This particular value decodes to the OID 1.2.840.10045.2.1, which is the OID for an ECC public key, or, as we saw in OpenSSL’s output, id-ecPublicKey.
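To make the variable-length-quantity rule concrete, here is a minimal sketch of decoding a single OBJECT_IDENTIFIER tag (a hypothetical helper, not code from FiddlerCert, and it only handles the single-byte length form):

using System;
using System.Collections.Generic;

static class OidDecoder
{
    public static string Decode(byte[] tag)
    {
        if (tag[0] != 0x06)
            throw new ArgumentException("Not an OBJECT_IDENTIFIER.");
        int length = tag[1]; // assumes the single-byte length form
        var arcs = new List<ulong>();
        ulong value = 0;
        for (int i = 2; i < 2 + length; i++)
        {
            // Each arc is base-128; the high bit means more bytes follow.
            value = (value << 7) | (uint)(tag[i] & 0x7F);
            if ((tag[i] & 0x80) == 0)
            {
                arcs.Add(value);
                value = 0;
            }
        }
        // The first byte packs the first two arcs as 40 * X + Y.
        var oid = new List<string> { (arcs[0] / 40).ToString(), (arcs[0] % 40).ToString() };
        for (int i = 1; i < arcs.Count; i++)
            oid.Add(arcs[i].ToString());
        return string.Join(".", oid);
    }
}

// Decode(new byte[] { 0x06, 0x07, 0x2A, 0x86, 0x48, 0xCE, 0x3D, 0x02, 0x01 })
// returns "1.2.840.10045.2.1".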

The data as a whole starts with 0x30, the tag identifier for a SEQUENCE. A SEQUENCE is a collection of other tags. The length of a SEQUENCE is not the number of items it contains, but rather the total byte length of its contents. The only way to count the items in a SEQUENCE is to parse out each of the tags inside it.

How ASN.1 data is actually stored is defined by the encoding rules. These can be DER (Distinguished Encoding Rules), which is what X509 uses, or BER (Basic Encoding Rules).

Breaking down the ASN.1 data, it looks like this:

SEQUENCE
	SEQUENCE
		OBJECT_IDENTIFIER 1.2.840.10045.2.1
		OBJECT_IDENTIFIER 1.2.840.10045.3.1.7
	BIT STRING 000435D78F8C16189D1E9595671C39D883B3321C89BAA356788DC243DB204F1DFA80936B23AF1C5A59F91B74A76F623897A91B292A0FDA40B06FF96A98CE4548482C

This is for an ECC public key. An RSA public key would not have two OBJECT_IDENTIFIERs; it has a single OBJECT_IDENTIFIER with a NULL parameter, and the key material inside the BIT STRING also includes the public exponent.
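Laid out the same way, an RSA SubjectPublicKeyInfo breaks down roughly like this (reconstructed from the standard definitions, not dumped from a real certificate):

SEQUENCE
	SEQUENCE
		OBJECT_IDENTIFIER 1.2.840.113549.1.1.1 (rsaEncryption)
		NULL
	BIT STRING (wrapping a SEQUENCE of two INTEGERs: the modulus and the public exponent)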

FiddlerCert needed to do this, but backwards: given a public key blob and its parameters, produce the ASN.1 data so it can be hashed. It was tempting to try assembling the data myself, but I quickly realized this was very error-prone. Instead, I settled on the CryptEncodeObject Win32 function with platform invoke. This made things much simpler: all I had to do was construct a CERT_PUBLIC_KEY_INFO structure with all of the values, and it happily produced the ASN.1 data that was needed.

The implementation details of this are on GitHub in the CertificateHashBuilder class.
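As an aside, if you only need the pin value and are on a newer .NET (Core 3.0 or later), you can skip the platform invoke entirely. Here is a minimal sketch, not FiddlerCert’s implementation:

using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

static class PinCalculator
{
    public static string GetSpkiPin(X509Certificate2 cert)
    {
        // The DER-encoded SubjectPublicKeyInfo: the algorithm identifier,
        // its parameters, and the public key together.
        byte[] spki = cert.PublicKey.ExportSubjectPublicKeyInfo();
        using (var sha256 = SHA256.Create())
        {
            // A Public-Key-Pins value is the base64 of SHA-256(SPKI).
            return Convert.ToBase64String(sha256.ComputeHash(spki));
        }
    }
}

// e.g. var pin = PinCalculator.GetSpkiPin(new X509Certificate2("cert.cer"));

Wrap the result in pin-sha256="…" and you have a value in the shape the Public-Key-Pins header expects.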