Game Development Reference
from sequences of varying lengths that are relatively common. Basically, data
compression aims at finding algorithmic transformations of a dataset that will
produce a more compact representation of the original dataset.
Choosing the best compression algorithm depends on a number of factors, such
as expected patterns and regularities in the data, storage and data persistence
requirements, and both CPU and memory limits. This chapter touches briefly on
data compression theory, but it focuses on implementing data compression with
the built-in C# components.
Types of Compression
Data compression comes in two flavors: lossy and lossless.
Lossy compression is a representation of the original dataset that is “close enough”
in comparison. File sizes are significantly reduced by discarding some of the
data during the compression process. Lossy compression can produce far more
compact dataset representations than lossless compression. The main problem with
lossy compression is that valid data is actually lost and unrecoverable. This
limitation is acceptable for images, sound files, and video clips, because humans
can only perceive a subset of the actual data anyway. In the data persistence
world, where lost data means corruption, lossy compression algorithms will not
suffice. Storing a “close enough” representation of a data file would be useless.
Because of the data loss, decompressing lossy data never reproduces the original
dataset exactly; it yields only an approximation of it.
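As a rough illustration of the “close enough” idea, the following sketch halves the storage of 16-bit audio samples by keeping only the high 8 bits of each one. It is a deliberately naive quantizer, not how real audio or image codecs work, and the names LossySketch, Quantize, and Restore are hypothetical ones chosen for this example.

```csharp
using System;

public static class LossySketch
{
    // Keeps only the high 8 bits of each 16-bit sample (half the storage).
    // The low 8 bits are discarded and cannot be recovered later.
    public static byte[] Quantize(short[] samples)
    {
        byte[] packed = new byte[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            packed[i] = unchecked((byte)(samples[i] >> 8));
        }
        return packed;
    }

    // Rebuilds approximate samples; the discarded low bits come back as zero.
    public static short[] Restore(byte[] packed)
    {
        short[] samples = new short[packed.Length];
        for (int i = 0; i < packed.Length; i++)
        {
            sbyte high = unchecked((sbyte)packed[i]);
            samples[i] = (short)(high << 8);
        }
        return samples;
    }

    public static void Main()
    {
        short[] original = new short[] { 1000, -1234, 32767, 0 };
        short[] restored = Restore(Quantize(original));
        for (int i = 0; i < original.Length; i++)
        {
            // Each restored value is near the original but usually not equal.
            Console.WriteLine("{0} -> {1}", original[i], restored[i]);
        }
    }
}
```

Each restored sample differs from its original by less than 256, which may be tolerable for sound but would corrupt a data file.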
Lossless compression is a representation of the original dataset that enables
reproduction of the exact contents of the original dataset by performing a
decompression transformation. No data is ever lost in the compression process,
making it the perfect solution for compressing data that must maintain integrity.
This chapter only covers lossless data compression, because we generally want
tools to maintain 100 percent data integrity unless we are dealing with image
compression.
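To make the lossless round trip concrete, here is a minimal sketch of run-length encoding, one of the simplest lossless schemes. It is not the algorithm used by the .NET components covered in this chapter; it is shown only because decoding its output reproduces the input byte for byte. The names RunLengthSketch, Encode, and Decode are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class RunLengthSketch
{
    // Encodes data as (count, value) byte pairs; each run is capped at 255.
    public static byte[] Encode(byte[] data)
    {
        List<byte> output = new List<byte>();
        int i = 0;
        while (i < data.Length)
        {
            byte value = data[i];
            int run = 1;
            while (i + run < data.Length && data[i + run] == value && run < 255)
            {
                run++;
            }
            output.Add((byte)run);
            output.Add(value);
            i += run;
        }
        return output.ToArray();
    }

    // Expands each (count, value) pair back into the original run of bytes.
    public static byte[] Decode(byte[] encoded)
    {
        List<byte> output = new List<byte>();
        for (int i = 0; i + 1 < encoded.Length; i += 2)
        {
            for (int n = 0; n < encoded[i]; n++)
            {
                output.Add(encoded[i + 1]);
            }
        }
        return output.ToArray();
    }

    public static void Main()
    {
        byte[] original = new byte[303];   // 300 zero bytes followed by 1, 2, 2
        original[300] = 1;
        original[301] = 2;
        original[302] = 2;

        byte[] encoded = Encode(original);
        byte[] decoded = Decode(encoded);

        bool identical = decoded.Length == original.Length;
        for (int i = 0; identical && i < original.Length; i++)
        {
            identical = decoded[i] == original[i];
        }
        Console.WriteLine("original: {0} bytes, encoded: {1} bytes, identical: {2}",
            original.Length, encoded.Length, identical);
    }
}
```

Run-length encoding only pays off when the data contains long runs of repeated bytes; on data without runs it doubles the size, which is one reason the choice of algorithm depends on the expected patterns in the data.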
GZipStream Compression in .NET 2.0
Microsoft .NET 1.1 did not include any data compression components, so developers
had to rely on third-party solutions. .NET 2.0 introduced the System.IO.Compression
namespace, which provides compression and decompression services for streams.
There are currently two supported algorithms: deflate and gzip. This chapter
covers the gzip algorithm exclusively.
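As a first taste of the namespace, the following minimal sketch compresses a byte array in memory with GZipStream and then decompresses it again. GZipStream, CompressionMode, and MemoryStream are part of the framework; the class name GZipSketch and the Compress and Decompress helpers are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipSketch
{
    // Compresses a byte array into gzip format held in memory.
    public static byte[] Compress(byte[] input)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // The GZipStream must be closed before the buffer is read so that
            // the remaining compressed data is flushed to the output stream.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // Decompresses gzip data back into the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            // Read returns 0 once the end of the compressed data is reached.
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        string text = new string('a', 1000) + " level data";
        byte[] original = Encoding.UTF8.GetBytes(text);

        byte[] packed = Compress(original);
        byte[] restored = Decompress(packed);

        Console.WriteLine("original: {0} bytes, compressed: {1} bytes",
            original.Length, packed.Length);
        Console.WriteLine("round trip ok: {0}",
            Encoding.UTF8.GetString(restored) == text);
    }
}
```

The manual read loop is used instead of a stream-copy helper so that the sketch stays within what .NET 2.0 provides. The same pattern applies to file streams by swapping the MemoryStream instances for FileStream instances.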