Compression Tests


Some very influential sources have been promoting the gzip compression format as the be-all and end-all of our HTTP 1.1 compression needs; some tout gzip as the superior compression format ("Gzip is the most [...] effective compression method..." [source: Best Practices for Speeding Up Your Website]). This, however, is not necessarily true. There are two other compression formats commonly available for use on the web.

The Three Compression Formats:

gzip
10+ byte header
Compressed Data
8 byte trailer (Checksum* and InputSize**)

HTTP 1.1 deflate (aka zlib format)
2 byte header
Compressed Data
4 byte trailer (Adler-32*** checksum)

raw deflate (deflate format)
Compressed Data
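
As a rough illustration of the three layouts above, the following Python sketch produces each format from the same input using the standard zlib module, where the wbits argument selects the wrapper. The helper name deflate_as and the sample payload are assumptions made for this example, not part of the original tests.

```python
import zlib


def deflate_as(data: bytes, wbits: int, level: int = 9) -> bytes:
    """Compress data with zlib; wbits selects the container format."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()


# Hypothetical sample payload, only here to give the compressor something to do.
payload = b"Compression Tests " * 200

gzip_bytes = deflate_as(payload, 31)   # 16 + 15: gzip header and trailer
zlib_bytes = deflate_as(payload, 15)   # zlib header and trailer (HTTP 1.1 deflate)
raw_bytes = deflate_as(payload, -15)   # negative value: no header, no trailer

for name, blob in (
    ("gzip", gzip_bytes),
    ("zlib", zlib_bytes),
    ("raw deflate", raw_bytes),
):
    overhead = len(blob) - len(raw_bytes)
    print(f"{name:12} {len(blob):5} bytes ({overhead} bytes of framing)")
```

With a standard zlib build and no optional gzip header fields, the gzip output should be 18 bytes larger than raw deflate and the zlib output 6 bytes larger; the compressed data in the middle is the same in all three.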

You can see that raw deflate, having no header, trailer, or checksum to compute, will always be the fastest format to encode and will always produce the smallest output.

The issue stems from the fact that, according to the spec (HTTP 1.1's RFC 2616), raw deflate should not be supported by browsers. Essentially, a browser should expect the zlib format for data received with a "Content-Encoding: deflate" response header.

Now, the problem is that Microsoft didn't actually implement HTTP 1.1 deflate as RFC 2616 defines it. Instead, they chose to implement raw deflate. A server responding with HTTP 1.1 deflate (zlib) will result in, well, nothing... the browser won't be able to decode the data. On top of that, .NET's System.IO.Compression.DeflateStream implementation of deflate also uses raw deflate, and Apache's mod_deflate confusingly sends gzip; and no, you can't force mod_deflate to send HTTP 1.1 deflate or raw deflate (at least as of September 13, 2010).

What about the other browsers that include "deflate" in their Accept-Encoding request header? Well, most of them can decode both HTTP 1.1 deflate and raw deflate with no problems.
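
One common way for a client to accept both variants is to try the zlib format first and fall back to raw deflate if that fails. The sketch below shows the idea in Python; decode_deflate is a hypothetical helper written for this example, and it is not a claim about how any particular browser does it.

```python
import zlib


def decode_deflate(body: bytes) -> bytes:
    """Decode a "Content-Encoding: deflate" body in either zlib or raw form."""
    try:
        # Default wbits expects the 2 byte zlib header and Adler-32 trailer.
        return zlib.decompress(body)
    except zlib.error:
        # No valid zlib wrapper: assume the sender used raw deflate instead.
        return zlib.decompress(body, wbits=-zlib.MAX_WBITS)
```

The fallback relies on the zlib decoder rejecting a raw stream (bad header or failed checksum), so it is a heuristic rather than a guarantee.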

So, what is the best compression format to send to HTTP 1.1 capable browsers that send an "Accept-Encoding: gzip, deflate" request header?

The Answer: raw deflate!

UPDATE February 10, 2012: zOompf published an in-depth look into HTTP compression that is worth reading. While RAW DEFLATE is faster and lighter, it appears that GZIP is more stable, especially in modern browsers.

View test data here.

* Checksum is a CRC (cyclic redundancy check) - it can be used to check for errors in the payload's data (wiki).
** InputSize is the size of the original (uncompressed) input data modulo 2^32 ( actualInputSize % 2^32 ).
*** Adler-32 is a very efficient checksum algorithm (wiki).
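
To make the footnotes concrete, here is a small Python sketch that reads the trailers back out of a gzip stream and a zlib stream and compares them with values computed directly from the input. The helper name compress_with and the sample input are assumptions for illustration only.

```python
import struct
import zlib


def compress_with(data: bytes, wbits: int) -> bytes:
    """Compress data with zlib; wbits=31 gives gzip, wbits=15 gives zlib."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()


original = b"hello, compression"  # hypothetical input

gzip_stream = compress_with(original, 31)
zlib_stream = compress_with(original, 15)

# gzip trailer: CRC-32 then InputSize, both 4 bytes, little-endian.
crc, input_size = struct.unpack("<II", gzip_stream[-8:])
# zlib trailer: Adler-32, 4 bytes, big-endian.
(adler,) = struct.unpack(">I", zlib_stream[-4:])

print(crc == zlib.crc32(original))
print(input_size == len(original) % 2**32)
print(adler == zlib.adler32(original))
```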