Configuring a Deflater

On the previous page, we saw how to use a Deflater to compress data in Java (and how to read the compressed data back in again). We mentioned that the reason for constructing the Deflater object explicitly, separately from the DeflaterOutputStream, is that the Deflater can then be configured. The two main parameters that we can configure are as follows:

  • the compression level defines how much of a tradeoff we are prepared to make between speed of compression and size of compressed output;
  • the compression strategy allows us to effectively switch out part of the deflater algorithm, which may be appropriate in some circumstances where we are wrapping the deflater around our own compression/transform mechanism.
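As a minimal sketch of how these two parameters are set in practice, the following passes the level to the Deflater constructor, sets the strategy with setStrategy(), and then wraps the configured Deflater in a DeflaterOutputStream. The class name DeflaterConfigSketch and the compress() helper are hypothetical names chosen for illustration, not part of the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper class, for illustration only.
class DeflaterConfigSketch {

    // Compresses data with a Deflater configured with the given level and strategy.
    static byte[] compress(byte[] data, int level, int strategy) {
        Deflater deflater = new Deflater(level); // level is set at construction
        deflater.setStrategy(strategy);          // strategy is set before any data is written
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes, deflater)) {
            out.write(data);
        } catch (IOException e) {
            // Not expected when writing to memory; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        } finally {
            // We created the Deflater ourselves, so we release its native resources.
            deflater.end();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("the quick brown fox jumps over the lazy dog. ");
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(input, Deflater.BEST_COMPRESSION, Deflater.DEFAULT_STRATEGY);
        System.out.println(input.length + " bytes in, " + packed.length + " bytes out");
    }
}
```

Because the stream is handed a Deflater that we constructed, closing the stream finishes the compressed output but does not release the Deflater, which is why the sketch calls end() itself.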

Compression level

In our discussion of how the Deflater works, we saw that a key part is dictionary compression. In this type of compression, the compressor looks for previously-occurring sequences that are the same as the sequence about to be encoded.

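When constructing a Deflater, you can specify a compression level. The useful values run from 1 (fastest) to 9 (best compression); the API also accepts 0, which means no compression at all, and uses a default level if you do not specify one. The compression level, roughly speaking, defines how hard the compressor searches for the longest possible matching string. As a general rule of thumb: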

  • Compressing at the maximum level (9) requires around twice as much processor time as compressing at the minimum level (1);
  • For typical textual input, compressing at the maximum as opposed to the minimum level adds around 5% to the compression ratio.

You have to decide whether, for your application, the extra processor time is worth the extra few per cent of compression.
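One way to make that decision is to measure on your own data. The sketch below compresses the same input at levels 1, 6 and 9 and prints the compressed size alongside a crude timing. The class CompressionLevelComparison, its helper methods and the generated sample text are all assumptions made for illustration; a single System.nanoTime() measurement is only a rough indication, and real figures will depend on your data and JVM.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical helper class, for illustration only.
class CompressionLevelComparison {

    // Returns the number of compressed bytes produced at the given level.
    static int compressedSize(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(data);
            deflater.finish(); // no more input will follow
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer); // output is discarded, only counted
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    // Builds artificial "text" from a small vocabulary; a stand-in for real input.
    static byte[] sampleText(int words) {
        String[] vocab = {"compression", "level", "deflater", "stream", "java",
                          "dictionary", "huffman", "speed", "size", "output", "the", "of"};
        Random random = new Random(42); // fixed seed so runs are repeatable
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[random.nextInt(vocab.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleText(200_000);
        for (int level : new int[] {Deflater.BEST_SPEED, 6, Deflater.BEST_COMPRESSION}) {
            long start = System.nanoTime();
            int size = compressedSize(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("level " + level + ": " + size + " bytes, " + micros + " us");
        }
    }
}
```

For a fairer comparison, run each level several times and discard the first few runs, so that JIT warm-up does not distort the timings.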

Compression strategy

Generally speaking, it will not be necessary to change the compression strategy. However, in certain advanced cases, you can actually improve compression by transforming your data so that Huffman encoding works better on it. In such cases, we'll see that explicitly specifying the FILTERED or HUFFMAN_ONLY strategy can be beneficial.
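For reference, the strategy is selected by calling setStrategy() with one of the constants defined on Deflater. The sketch below compresses the same bytes with each of the three strategies mentioned here and prints the resulting sizes. The class name StrategyComparison and the sample input are hypothetical; the input is deliberately repetitive, so it shows what is lost when string matching is switched off rather than a case where FILTERED or HUFFMAN_ONLY helps.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper class, for illustration only.
class StrategyComparison {

    // Returns the compressed size at the default level using the given strategy.
    static int compressedSize(byte[] data, int strategy) {
        Deflater deflater = new Deflater(); // default compression level
        try {
            deflater.setStrategy(strategy);
            deflater.setInput(data);
            deflater.finish();
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        // Repetitive input: ideal for dictionary matching, poor for Huffman coding alone.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("abc");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.US_ASCII);

        System.out.println("DEFAULT_STRATEGY: " + compressedSize(data, Deflater.DEFAULT_STRATEGY));
        System.out.println("FILTERED:         " + compressedSize(data, Deflater.FILTERED));
        System.out.println("HUFFMAN_ONLY:     " + compressedSize(data, Deflater.HUFFMAN_ONLY));
    }
}
```

On input like this, HUFFMAN_ONLY produces noticeably larger output than the default strategy, because it never replaces repeated sequences with back-references. Whether a non-default strategy pays off on transformed data is something to measure case by case.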

Written by Neil Coffey. Copyright © Javamex UK 2012. All rights reserved.