ZPAQ

Open Standard Programmable Data Compression

ZPAQ is an open format for compressed data (like deflate), supported by an archiver (like zip) and a library API for developers (like zlib). Like deflate/zlib/zip, ZPAQ is free, open source, and not encumbered by patents, but it is designed for high compression ratios rather than high speed.

ZPAQ Format

The ZPAQ format was designed in 2009 based on 9 years of experiments with the PAQ class of context-mixing compression algorithms. In 2004, PAQ began to surpass PPM as the best known algorithm on many benchmarks including the Calgary Compression Challenge. But PAQ was not widely used in practice because of its poor speed and experimental nature. None of the over 160 versions developed between 2000 and 2010 could read archives produced by any other version.

ZPAQ solved the compatibility problem by including a description of the decompression algorithm in the archive header. The header describes a network of modeling (bit prediction) components commonly used in most PAQ versions, and a virtual machine byte code (called ZPAQL) for computing arbitrary contexts and post-processing the output. The format has been stable since it was first published on March 12, 2009.

All compression algorithms involve a 3 way trade-off between compression ratio, speed, and memory usage. The format allows developers to make these compromises and to customize the algorithm to the data. For example, it would allow an application to search a large space of relatively fast algorithms and select the best one.

The ZPAQ format is described by a specification and reference decoder. Developers can test their implementation by decompressing calgarytest2.zpaq to the Calgary corpus. This test case is designed to test all of the features of the format.

libzpaq API

libzpaq is a public domain application programming interface (API) that provides byte stream compression and decompression services to C++ programs. The developer supplies classes to read or write bytes to arbitrary sources, such as strings or files. The developer can supply custom algorithms, expressed as ZPAQL byte code, or choose one of three built-in compression models. libzpaq includes just-in-time (JIT) compilation of ZPAQL byte code on x86-32 and x86-64 hardware in Windows and Linux, making it about twice as fast as the reference decoder.

libzpaq Documentation (HTML).

libzpaq501.zip contains the following files:

unzp.cpp illustrates how to use libzpaq to decompress a single file.

zpaq Archiver and Development Tool

zpaq is an open source, command line program for Windows and Linux to compress and decompress archives in the ZPAQ format. It has 4 built-in compression levels (faster vs. smaller) that are adequate for most users. To help developers interested in designing new compression algorithms, it also accepts configuration files containing algorithms written in ZPAQL. It has tools to test and debug ZPAQL code and display model statistics. It is licensed under GPL v3.

Multi-threaded. The ZPAQ format specifies that archives can be divided into independent blocks that can be compressed or decompressed in parallel. A block can contain a file, a part of a file, or multiple files. Larger blocks compress better but allow fewer processors to work in parallel. By default, the faster compression levels (-m1 and -m2) use 16 MB blocks. The higher levels (-m3, -m4, and configuration files) use one block per file. You can change the block size or make solid archives that put all of the files in one block. You can also limit the number of threads that can run at one time. The default is to automatically detect the number of processors.

JIT. Starting with libzpaq v4.00, ZPAQL byte code is automatically translated into x86-32 or x86-64 code and directly executed. This makes zpaq twice as fast as the reference decoder on these architectures. Earlier versions used source-level JIT that required an external C++ compiler and made installation complex. Now you can just download the program and run it.

Incremental update for large backups. Starting with zpaq v4.01, files are first compared with the archive contents and then compressed or decompressed only if they have changed. Comparison is fast because it only requires computing the SHA-1 checksum of the file and comparing it to the checksum stored in the archive. zpaq is the only archiver that updates incrementally based on actual file content rather than metadata such as timestamps or archive flags. Solid archives cannot be updated.

Robust. Supports the following features:

Benchmarks. The following table compares the compressed size, compression and decompression times (in real seconds), memory used (in MB), and number of threads for zpaq and other popular open source compressors on the 14 file Calgary corpus and on the 1 GB text file enwik9 from the Large Text Benchmark.

  Compressor         calgary      enwik9       CTime DTime Memory Threads Algorithm
  ----------        ---------  -------------   ----  ----  ------ ------- ---------
  uncompressed      3,141,622  1,000,000,000
  compress          1,272,772    448,136,005     35    12      5    1     LZW
  zip -9            1,020,495    322,592,132     60    12     10    1     deflate (LZ77+Huffman)
  gzip -9           1,017,624    322,591,995     55    12     12    1     deflate (LZ77+Huffman)
  bzip2/pbzip2        828,347    254,067,824     59    26    174    4     RLE+BWT+MTF+RLE+Huffman
  7zip/p7zip          824,279    227,905,645    704    20    224    1.5   LZMA (LZ77+CM1)
  zpaq -m1            843,460    218,015,074    120   150    320    4     BWT+RLE+CM1 (16 MB blocks)
  zpaq -m2            784,261    203,217,493    209   245    320    4     BWT+CM2 (16 MB blocks)
  zpaq -m2 -b125                 184,091,822    231   291   2686    4     BWT+CM2 (8 blocks)
  zpaq -m3 -b250                 180,995,477    674   700    444    4     CM8 (4 blocks)
  zpaq -m3            722,085    180,279,221   1372  1417    111    1     CM8 (1 block)
  zpaq -m4 -b250                 167,035,989   1781  1823    984    4     CM22 (4 blocks)
  zpaq -m4            666,512    165,887,505   3445  3538    246    1     CM22 (1 block)
  zpaq -m4 -bs        644,433                                             CM22 (solid archive)
  zpaq -mmax_enwik9              149,376,058   6327  6528   2002    1     CM (custom)

Benchmark details. Times, memory, and number of threads are shown for enwik9 on a 2.66 GHz Core i7 M620 (2 cores, 2 hyperthreads per core) with 4 GB memory under 64 bit Ubuntu Linux. Memory and threads are shown for compression. zpaq uses the same memory and number of threads to decompress; some programs may use less. For the Calgary corpus, files were compressed in Windows. compress, gzip, and bzip2 compress each file separately. The reported size is the total. zip, 7zip, and zpaq are archivers. The reported size is for a single archive. Option -9 selects best compression for zip and gzip, which makes compression slower but does not affect decompression speed. pbzip2 and p7zip are Linux multithreaded versions of the Windows programs bzip2 and 7zip, respectively. Memory usage and number of threads are reported for the Linux versions. Tested versions are the most recent as of 2011. For zpaq, options -m1 through -m4 select compression level from fastest to best. -b selects block size in MB. Each thread can process one block at a time. -b0 selects one block per file. -bs selects one block for the entire archive. The default block size is -b16 for -m1 and -m2 and -b0 otherwise. Thus, block size has no effect on the Calgary corpus because all files are smaller than 16 MB. -bs has no effect on enwik9 because there is only one file in the archive. -mmax_enwik9 requires max_enwik9.cfg.

Download

zpaq.exe v4.04 command line archiver for 32 bit Windows.

zpaq v4.04 statically linked executable for 64-bit Linux.

zpaq user's guide (HTML).

zpaq404.zip contains:

Sample configuration files. zpaq is also a development tool that allows you to create, test, debug, and optimize new compression algorithms that are compatible with all existing ZPAQ decompressors such as zpipe, zpsfx, and the reference decoder. An algorithm is described by a configuration (.cfg) file and sometimes an external preprocessor. Neither file is needed to decompress. The zpaq user's guide describes how to write them. For example:

fast.cfg, mid.cfg, and max.cfg describe the built in models for levels 1, 2, and 3 in libzpaq and zpipe. mid and max are equivalent to compression levels -m3 and -m4 respectively in zpaq. They are context mixing models without preprocessors using 2, 8, and 22 components respectively. With zpaq, you can pass an argument to increase memory usage to improve compression of large files.

bwt.1.zip contains 4 BWT based configurations with preprocessors (C++ source and Win32 executables). Two of them (bwtrle1, bwt2) are equivalent to zpaq -m1 and -m2 respectively. Two other modes provide intermediate level compression and speed. All are based on the Burrows-Wheeler transform, which sorts the input by its following context; the sorted output is then compressed with a low order model. The preprocessor uses Yuta Mori's libdivsufsort (open source MIT license).

bwt_slowmode1.zip is a low memory BWT compression algorithm by Jan Ondrus. It is based on BBB slow mode with block sizes up to 1 GB using 1443 MB memory, instead of the usual 5 times block size. It compresses enwik9 to 163,565,006 bytes in a single block (compare with zpaq -m2).

bmp_j4a.zip by Jan Ondrus uses a color transform preprocessor and a context model based on paq8px_v64 for compressing .bmp images. It achieves the best known result for the rafale.bmp benchmark, 522,029 bytes.

exe_j1.zip by Jan Ondrus contains an E8E9 preprocessor that improves compression of .exe and .dll files by translating x86 JMP and CALL addresses from relative to absolute. The context model is the same as max.cfg, but it typically compresses these file types 7-10% smaller at the same speed.

jpg_test2.zip by Jan Ondrus compresses JPEG images (which are already compressed) by an additional 15%. It uses a preprocessor that expands Huffman codes to whole bytes, followed by context modeling.

pi.cfg compresses pi.txt from the Canterbury miscellaneous corpus from 1 MB to 114 bytes. No other compressor is known to compress it to less than 415,241 bytes. It uses a postprocessor that ignores the decoded data and computes pi to 1 million digits. Warning: it may take several hours or days to run. It works only on this file. Discussion.

lz1.zip is a fast LZ77 based model. It compresses enwik9 to 271,702,398 bytes in 391 seconds and decompresses in 227 sec. on a 2.0 GHz T3200 (Win32) in a single thread using 66 MB memory to compress and 18 MB to decompress. It includes lz1.cfg and a preprocessor, lzpre.cpp and lzpre.exe. The preprocessor can also compress by itself with less compression but better speed (327,501,489 bytes, 215 sec, 26 sec). The preprocessed format is a byte-aligned LZ77 using 2, 3, or 4 bytes to encode matches of up to 64 bytes with offsets up to 16 MB as explained in the source comments.

Miscellaneous Applications

zpipe (user's guide) is a simple application that compresses or decompresses from standard input to standard output. It illustrates the use of libzpaq. Contains GPL source and a Windows executable (updated to libzpaq v5.01 on Feb. 2, 2012).

zpsfx v1.01 (Apr. 4, 2012) is a self extracting archive stub for Windows. To use: compile, then append an archive to the executable. Be sure that the archive begins with a header locator tag (included by default). To extract, run the program. For example:

  g++ -O2 -s zpsfx101.cpp libzpaq.cpp -o zpsfx.exe
  upx zpsfx.exe
  zpaq c archive files...
  copy/b zpsfx.exe+archive.zpaq archive.exe
To extract:
  archive.exe

For either program, you need libzpaq to compile.

tiny_unzpaq.cpp v1.0, Mar. 21, 2012, is a small ZPAQ level 2 extractor derived from unzpaq200.cpp, with error checking, extra code, comments, and white space removed to make the source code compress as small as possible. It compiles by itself. It is written by Matt Mahoney and released to the public domain. To run: tiny_unzpaq archive.zpaq. It will create files as named in the archive.

Technology

ZPAQ is based on the PAQ context mixing model, which was developed over the period 2000-2009. A set of context models independently predict the next bit of input. Then the predictions (probabilities) are combined and used to arithmetic code the bit. Using lots of predictors improves compression but takes more time. ZPAQ allows up to 255 components. The most important types are:

A configuration file describes how the components are connected. For example, mid.cfg describes an order 0 indirect context model whose output is adjusted by a chain of indirect secondary symbol estimators with increasing context order, a high order match model, and a final mixer. The context hashes are computed by a ZPAQL program that is called once for each uncompressed byte.
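
The mixing step described above, in which several bit predictions are combined into one probability, can be sketched as follows. This is a floating point illustration of PAQ-style logistic mixing, not the code used by ZPAQ; the specification defines the exact arithmetic. The function names and the learning rate parameter are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// stretch(p) = ln(p / (1 - p)); squash is its inverse (the logistic function).
double stretch(double p) { return std::log(p / (1.0 - p)); }
double squash(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Combine the components' predictions P(next bit = 1) into one probability
// by taking a weighted sum in the stretched (logistic) domain.
double mix(const std::vector<double>& preds, const std::vector<double>& weights) {
    assert(preds.size() == weights.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < preds.size(); ++i)
        sum += weights[i] * stretch(preds[i]);
    return squash(sum);
}

// Once the actual bit is known, move each weight in the direction that
// would have reduced the prediction error. `rate` is a hypothetical
// learning rate chosen by the caller.
void updateWeights(const std::vector<double>& preds, std::vector<double>& weights,
                   double mixed, int bit, double rate) {
    assert(preds.size() == weights.size());
    const double err = bit - mixed;
    for (std::size_t i = 0; i < preds.size(); ++i)
        weights[i] += rate * err * stretch(preds[i]);
}
```

A caller would invoke mix() before coding each bit and updateWeights() afterward, so that components that predict well gradually receive larger weights.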

Preprocessing and postprocessing can be used to implement other algorithms such as LZ77, LZP, BWT, and various specialized transforms like E8E9 for x86, Huffman code expansion for JPEG, and color transforms for BMP. Postprocessing after decompression is written in ZPAQL. Preprocessing before compression is built in to zpaq (BWT for -m1 and -m2) or done by an external program. zpaq will verify that postprocessing restores an externally preprocessed file prior to compression. zpaq also has a debugger that allows you to trace ZPAQL code or run it as a stand-alone program.
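
As an illustration of a reversible transform of the kind mentioned above, here is a simplified E8E9-style filter in C++. It is a sketch only and is not the transform implemented in exe_j1 or in any ZPAQ configuration; the function name and the choice to add the offset of the next instruction are assumptions. It treats every 0xE8 or 0xE9 byte as an opcode, which can be wrong for data bytes but remains reversible.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified E8E9-style filter. When encode is true, the 32-bit operand after
// each 0xE8 (CALL) or 0xE9 (JMP) byte is changed from a relative displacement
// to an absolute target; when encode is false, the change is reversed.
void e8e9Filter(std::vector<std::uint8_t>& buf, bool encode) {
    std::size_t i = 0;
    while (i + 5 <= buf.size()) {
        if ((buf[i] & 0xFE) != 0xE8) {  // not an E8 or E9 byte
            ++i;
            continue;
        }
        // Read the little-endian 32-bit operand that follows the opcode.
        std::uint32_t v = static_cast<std::uint32_t>(buf[i + 1])
                        | static_cast<std::uint32_t>(buf[i + 2]) << 8
                        | static_cast<std::uint32_t>(buf[i + 3]) << 16
                        | static_cast<std::uint32_t>(buf[i + 4]) << 24;
        // Offset of the next instruction within the buffer.
        const std::uint32_t next = static_cast<std::uint32_t>(i + 5);
        v = encode ? v + next : v - next;  // wraps modulo 2^32
        for (int k = 0; k < 4; ++k)
            buf[i + 1 + k] = static_cast<std::uint8_t>(v >> (8 * k));
        i += 5;  // skip the operand so both directions visit the same bytes
    }
}
```

Because the operand bytes are skipped after each match, the forward and inverse passes examine exactly the same opcode positions, so applying the filter with encode set to true and then false restores the original buffer.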

A ZPAQ archive is designed to be read quickly in a single pass like a tar file. It consists of a sequence of blocks that can be decompressed independently in parallel. Each block consists of a sequence of segments that must be decompressed in order. Each segment has an optional filename, an optional comment, compressed data, and an optional SHA-1 checksum. A segment without a filename denotes a continuation of the same file from the previous segment. The end of the compressed data is marked by a sequence of 4 zero bytes, allowing it to be found quickly without having to decompress the data. The arithmetic coder is designed so that it never encodes such a sequence. A block header can be marked with a 13 byte tag that allows it to be found when embedded in other data such as a self extracting archive.
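
The end-of-data marker makes it possible to skip over compressed data with a plain byte scan. The following sketch shows only that scan, assuming the archive bytes are already in memory and the starting offset is inside the compressed data of a segment; the function name is hypothetical, and a real reader must also parse the block and segment headers as defined in the specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first byte of the first run of four zero bytes
// at or after `from`, or data.size() if there is no such run.
std::size_t findFourZeros(const std::vector<std::uint8_t>& data, std::size_t from) {
    std::size_t run = 0;  // length of the current run of zero bytes
    for (std::size_t i = from; i < data.size(); ++i) {
        run = (data[i] == 0) ? run + 1 : 0;
        if (run == 4) return i - 3;
    }
    return data.size();
}
```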

Frequently Asked Questions

What is zpaq for?
zpaq is an archiver. It collects one or more files together and compresses them so they take up less space. You can use it to make backups, or to create a package of files to send to someone as an email attachment or to download from a website. The recipient will need zpaq to extract the files.

I clicked on the zpaq icon and nothing happened.
zpaq is a command-line program. You have to run it from a command window in Windows or a Linux shell.

Why isn't there a graphical user interface (GUI)?
Because people who know how to use a command line interface usually prefer to use it that way instead of doing the same thing with a cumbersome combination of typing and mouse clicks. Besides, I'm not very good at writing GUI code. I'm hoping that other archivers that already have a GUI will add support for ZPAQ. I wrote libzpaq to make that easier.

What is the difference between the 32 and 64 bit versions?
The 64 bit versions will not run on a 32 bit operating system. The 32 bit versions can only use 2 GB of memory, no matter how much you actually have. None of the built-in compression modes need that much memory, but some configuration files like max_enwik9.cfg might.

Why isn't there a 64 bit Windows version or a 32 bit Linux version?
Because my computers only have 32 bit Windows and 64 bit Linux installed. If you want a 64 bit Windows version or a 32 bit Linux version, you can try compiling it yourself. I can't guarantee that it will work because I haven't tested it.

Will it work on a Macintosh?
I don't know. You can try the Linux version or compile the source code.

How do I compile zpaq?
Get zpaq, libzpaq, and libdivsufsort above. See readme.txt in the zpaq distribution. You can try:

  g++ zpaq.cpp libzpaq.cpp divsufsort.c -O3 -s -DNDEBUG -o zpaq
In Linux you also need the options -Dunix -fopenmp

zpaq crashes on my Windows 95 PC
zpaq (actually libzpaq) uses SSE2 instructions, which are not available on Intel processors made before 2001 or AMD processors before 2003. If zpaq works with -m1 and -m2 but not -m3 or -m4, then that is the problem. You can try recompiling with -DNOJIT to disable the just-in-time (JIT) optimization feature, which uses SSE2. It will run about twice as slow.

How do I create an archive?
For example:

  zpaq c archive file1 file2
will create a new file named archive.zpaq containing compressed copies of file1 and file2. Then the command:
  zpaq x archive
would create file1 and file2. If you want to see what files the archive contains, use the command:
  zpaq l archive

How do I back up my files?
Go to the parent directory of the files you want to back up and specify an archive on an external drive. Then use the a command to specify the files you want to add or update using relative paths. For example:

  cd \users\joe
  zpaq a e:backup pictures\* documents\*
will compare all of the files in these two directories and update e:backup.zpaq only with those files that are new or have changed since the last backup. This is much faster than having to compress all of the files. You can also use:
  zpaq u e:backup pictures\*
to update all of the files in the archive, and then add files from pictures. This means that if any files in documents have changed or been deleted, then the copy in the archive will also be updated or deleted. But if any new files have been created in documents, then they will not be added to the archive.
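
The comparison that decides which files need to be compressed again can be sketched as below. This illustrates the idea only and is not zpaq's code: it assumes the checksums have already been computed and are available as strings, and the type and function names are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Maps a file name (exactly as given on the command line) to its checksum.
using DigestMap = std::map<std::string, std::string>;

// Returns the names of files on disk that are missing from the archive
// index or whose checksum differs from the stored one.
std::vector<std::string> filesNeedingUpdate(const DigestMap& archive,
                                            const DigestMap& disk) {
    std::vector<std::string> out;
    for (const auto& entry : disk) {
        const auto it = archive.find(entry.first);
        if (it == archive.end() || it->second != entry.second)
            out.push_back(entry.first);
    }
    return out;
}
```

Files whose name and checksum both match are skipped, which is why an incremental update is much faster than compressing everything again.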

I updated the archive and now it is empty.
Be sure to back up from the same directory every time. For example:

  cd \users
  zpaq u e:backup joe\documents\*
would delete pictures\me.jpg from the archive because there is no file \users\pictures\me.jpg. Names are saved as specified on the command line and have to match exactly.

Alternatively, you could create an archive with absolute paths and not worry about what directory you are in, but you would lose the flexibility to extract the directory tree to another location. For example:

  zpaq u e:backup \users\joe\documents\* \users\joe\pictures\*

My archive has duplicate files.
File names in zpaq are case sensitive, even if Windows isn't. So,

  zpaq a e:backup pictures\*
  zpaq a e:backup Pictures\*
would result in 2 copies of every picture.

Why don't you fix that?
Because there are still too many different ways to name the same file, using different paths, symbolic and hard links (junctions), mounts, and so on. There are also too many operating system dependencies. Linux is case sensitive, and Windows accepts either forward slashes or backslashes.

How do I back up a directory tree?
Use -r

How do I back up the whole disk?
Make a .tar.zpaq file. zpaq backs up files, not file systems. It does not preserve metadata like time stamps, ownership, and access rights. It does not back up hidden or system files unless you specify them.

How do I restore a file that I deleted?
Go to the directory where you made the backup and use x.

  cd \users\joe
  zpaq x e:backup
This will compare every internal file in the archive and extract only those that don't exist externally. It will report whether an external file is identical or not, but it will not overwrite it in either case. You can also extract one file like this:
  zpaq x e:backup output filename
This looks for a file named exactly filename in the archive and extracts it to a file named output.

How can I tell zpaq to overwrite existing files during extraction?
Use -f

Which options give me the best compression?

  zpaq -m4 -bs -n c archive files...
-m4 selects best (and slowest) compression. -bs selects a solid archive. -n says don't save file sizes, checksums, or block header error recovery tags, each of which takes a few bytes. All of the files are packed into a single block and compressed using only one thread. Solid archives cannot be incrementally updated. The c command deletes any previous contents of the archive, which would otherwise not be updated if they had not changed.

Which options give me the fastest compression?

  zpaq -m1 -bN a archive files...
where N is the total size of the files divided by the number of processors. For example, to compress a 100 MB file on a machine with 4 cores, use -b25 or smaller. This will divide the input into 4 blocks and compress each one on a different core. It will decompress the same way. (Note that -m1 -b16 is the default, so it would already be as fast as possible). The a command will skip compressing any files that are already in the archive and unchanged.
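
The arithmetic for choosing N can be written as a small helper. This is a sketch under the assumption that sizes are given in whole megabytes and that -b takes a whole number; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Suggests a value for -b: the total input size in MB divided by the number
// of processors, rounded down, but never less than 1.
unsigned blockSizeMB(unsigned totalMB, unsigned cores) {
    assert(cores > 0);
    return std::max(1u, totalMB / cores);
}
```

For the example above, blockSizeMB(100, 4) gives 25, matching -b25.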

-bs doesn't work.
-bs is ignored at the default compression level (-m1) and at level -m2.

-bs doesn't improve compression.
It only helps if you compress 2 or more files at once and those files have some shared information such as common strings.

Why aren't all of the processors being used?
zpaq will use all available cores if it can. However, only one core can be used on each block at a time. By default, files larger than 16 MB are split into blocks that can be compressed or decompressed in parallel. Smaller files are one block each. You can select smaller blocks with -b but compression will be worse.

Out of memory?
Compression levels -m1 (default) and -m2 are block sorting algorithms that use 5 times the block size per thread, after rounding up the block size to a power of 2. The default block size is -b16, selecting a 16 MB block. This uses 80 MB per thread. The same memory is required to decompress. Options -b0 (no blocks) or -bs (solid archive) are not possible, so instead these select the maximum block size of -b256, which requires 1.25 GB per thread (or less for files smaller than 128 MB). Furthermore, 32 bit Windows applications cannot use more than 2 GB of memory, no matter how much you have, or whether you are running a 64 bit version of Windows. You can use the -t option (for example, -t1) to reduce memory usage by reducing the number of threads.

Compression levels -m3 and -m4 use 111 MB and 246 MB memory per thread respectively, independent of block size.
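
The memory rule for -m1 and -m2 can be turned into a rough estimate. The sketch below applies only the rule stated above (5 times the block size, rounded up to a power of 2, per thread); it ignores any other overhead, and the function names are hypothetical.

```cpp
#include <cassert>

// Smallest power of 2 that is greater than or equal to n (for n >= 1).
unsigned nextPow2(unsigned n) {
    unsigned p = 1;
    while (p < n) p *= 2;
    return p;
}

// Estimated memory in MB for the block sorting levels (-m1, -m2).
unsigned bwtMemoryMB(unsigned blockMB, unsigned threads) {
    assert(blockMB > 0 && threads > 0);
    return 5 * nextPow2(blockMB) * threads;
}
```

For example, bwtMemoryMB(16, 1) gives 80 and bwtMemoryMB(256, 1) gives 1280 (1.25 GB), in line with the figures quoted above.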

What is libzpaq?
It is an application programming interface (API) in C++ that allows developers to easily compress and decompress data in the ZPAQ format, either in memory or on disk.

Why is libzpaq separate from zpaq?
Because compression should be transparent to the user. An application could compress documents whenever users save their work. Other applications could then read those files because they are compressed in a standard format, even if a newer version of the first application starts compressing with a better algorithm. libzpaq is designed to make this easy. zpaq is a tool to allow developers to test their algorithms.

Will libzpaq work for phone or tablet apps?
Maybe. I have not tested it on the ARM processor. You will at least need to compile with -DNOJIT to disable the x86-32/64 just-in-time optimization feature.

How is zpaq licensed?
GPL v3. It is free to use. You can also modify it and distribute modified versions, but any distribution must be free and under the same license and include source code.

How is libzpaq licensed?
It is effectively public domain. It is actually the MIT license modified to remove the restriction that the copyright notice must be included. This was the only restriction. You can do anything you want with the code.

Is ZPAQ covered by patents?
As far as I know, no. I have not filed for any patents on any of the methods used (including PAQ, from which it is derived) and will not do so. I could not, even if I wanted to, because the methods have already been disclosed for over a year.

Why haven't you patented it?
Because patents would kill any chance of ZPAQ ever becoming a standard. That is what happened to CTW, the sort transform, and the arithmetic coding modes of BZIP and JPEG. deflate succeeded precisely because it was not patented.

Why do you keep all of the old versions of the software?
To establish proof of disclosure of prior art so that nobody can claim any patents on the technology.


History

ZPAQ development began on Feb. 15, 2009 with a series of mutually incompatible experimental programs (v0.01 through v0.09, now obsolete). Archives created with these versions could only be read by the same version. The level 1 standard became fixed on Mar. 12, 2009 with version 1.00. All versions from 1.00 onward are level 1 compliant and can extract each other's archives.

The SHA-1 code used in these older versions (prior to libzpaq 1.00) is derived from RFC 3174, which is copyright (C) 2001, The Internet Society. Please see this document for the full license.

zpaq v0.01 Open source (C++) and Win32 executables, Feb. 15, 2009.
zpaq v0.02 adds E8E9 transform. Fully supports post-processing. Not compatible with v0.01. Feb. 19, 2009.
zpaq v0.03 modifies MIX, MIX2, IMIX to fix poor compression on large files. Not compatible with v0.02. Feb. 19, 2009.
zpaq v0.04 modifies train() and squash() for improved compression. Not compatible with v0.03. Feb. 21, 2009.
zpaq v0.05 modifies probability representation and mixer weights to prevent mixer overflow and to improve compression for highly redundant data. Not compatible with v0.04. Feb. 26, 2009.
zpaq v0.06 adds SHA1 checksums, replaces IMIX2 with ISSE. Not compatible with v0.05. Feb. 27, 2009.
zpaq v0.07 improves ISSE and bit-history state table. Not compatible with v0.06. Feb. 28, 2009.
zpaq v0.08 adds LZP transform and minor improvements. Not compatible with v0.07. Mar. 8, 2009.
zpaq v0.09 removes counters from ISSE and ICM to improve speed. Not compatible with v0.08. Mar. 9, 2009.
zpaq v1.00 (first level 1 compliant version) includes unzpaq1 candidate reference decoder. Simplified bit history tables. Not compatible with earlier versions. Mar. 12, 2009.
fast.cfg written Apr. 26, 2010. Now part of libzpaq distribution.
unzpaq 1.01 updates reference decoder comments and help message and fixes some VS2005 compiler issues. Compatible with 1.00. Apr. 27, 2009.
unzpaq 1.02 and zpaq 1.02 closes extracted files immediately after decompression instead of when program exits. Fixes g++ 4.4 warnings. Compatible with 1.00 and 1.01. June 14, 2009.
unzpaq 1.03 and zpaq 1.03 has a default compression mode (mid.cfg), supports compressing files in segments to separate blocks and extracting them as suggested in part 7 of the spec. Does not store paths by default. Does not extract to absolute paths by default. Some minor improvements. Sept. 7, 2009 (added zpaq.exe Sept. 8, 2009).
zpaq103b adds zpaqsfx 1.03, a stub for creating self extracting archives. No changes to zpaq or unzpaq. Sept. 14, 2009.
zpaq104 can list and extract from self extracting archives without running them. Added progress meter. zpaqsfx.exe stub is slightly smaller. unzpaq unchanged. Sept. 18, 2009.
zpaq105 removes built in x and p preprocessors and makes them separate programs called from config files with compile time postprocessor testing. Adds if-else-endif and do-while to ZPAQL. Many small changes. Sept. 28, 2009.
zpaq106 adds "ta" to append locator tags to allow ZPAQ streams to be detected when embedded in arbitrary data. zpaq1.pdf revision 1 adds this recommendation. unzpaq106.cpp implements it. Sept. 29, 2009.
zpaqsfx 1.06 self extracting archive stub is now separate from the ZPAQ distribution. Updated Sept. 29, 2009, posted Oct. 26, 2009. (Replaced by zpsfx in libzpaq 2.01)
zpipe v1.00, a simple streaming compressor, Sept. 30, 2009. Linux patch added Jan. 18, 2010.
zpaq107 adds config file parameters and fixes some bugs. From now on the specification and reference decoder are not included unless they change. Oct. 2, 2009.
bwt_j2 is a config file (by Jan Ondrus) and preprocessor for BWT compression. Posted Oct. 7, 2009.
bwt_j3 is a bug fix for bwt_j2 to accept multiple files. Jan Ondrus, Oct. 7, 2009.
exe_j1 is a config file and preprocessor for .exe and .dll files. It extends the E8E9 transform in exe.cfg to conditional jumps. Jan Ondrus, Oct. 7, 2009.
unzpaq108.cpp removes undefined behavior of ZPAQL shifts larger than 31 bits on non x86 hardware. Oct. 14, 2009.
zpaq108 generates optimized code that runs about twice as fast on systems with a C++ compiler installed. Oct. 14, 2009.
bmp_j4, configuration for .bmp files by Jan Ondrus, Oct. 14, 2009.
bwt_slowmode1 BWT compression based on BBB slow mode. Jan Ondrus, Oct. 15, 2009.
jpg_test2 JPEG config by Jan Ondrus, Oct. 20, 2009, posted Oct. 26, 2009.
zpaq109 Linux port and some cosmetic bug fixes. Oct. 21, 2009.
zpaq110 bug fix for Linux/g++ 4.4.1, Dec. 28, 2009.
zp v1.00 simple ZPAQ compatible archiver with 3 optimized compression levels, Apr. 26, 2010.
Added a license file to zpaq 1.10, zpipe 1.00, zpaqsfx 1.06, and zp 1.00 distributions on May 23, 2010. No software changes.
libzpaq 0.01, Sept. 27, 2010.
libzpaq 0.02, Sept. 28, 2010.
zpipe 2.00 updated to use libzpaq 0.02, Sept. 28, 2010.
libzpaq 1.00, Sept. 29, 2010. Package includes libzpaq, ZPAQ specification, reference decoder, zp, zpipe, and fast, mid, max config files.
libzpaq 1.01, Oct. 14, 2010. Updates libzpaq interface to use inheritance instead of templates, requiring changes to zp and zpipe. Now compiles faster.
libzpaq 1.02, Oct. 20, 2010. Adds zpsfx self extracting archive stub. Separates optimized models from libzpaq.cpp to libzpaqo.cpp.
libzpaq 2.00, Oct. 30, 2010. Ports zpaq to libzpaq, replacing zp.
libzpaq 2.01, Nov. 5, 2010. Added optimized self extracting archives. Simplified installation.
libzpaq 2.02, Nov. 13, 2010. zpaq shows compression component statistics. Libzpaq support added.
zpaq 2.03, Dec. 23, 2010, adds Linux support. The remaining code is split into libzpaq 2.02, zpipe 2.01, zpsfx 1.00, and configuration files min, fast, mid, and max.
zpaq 2.04, Dec. 29, 2010, adds support for Visual C++, Borland, and Mars compilers in addition to g++. A Windows install script is added.
zpaq 2.05, Jan. 5, 2011. Fixed a bug in which zpaq crashed when decompressing an unnamed file (as created with zpipe or zpaq nc) without renaming. Separated zpaq.1.pod. (Updated corrupted install.sh on Jan. 13, 2011).
libzpaq 2.02a, Jan. 6, 2011. Updates the documentation.
pzpaq 0.01 parallel file compressor, Jan. 21, 2011.
pzpaq 0.02, Jan. 26, 2011, adds large file support (over 2 GB) to Windows.
pzpaq 0.03, Feb. 2, 2011, optimizes decompression for nonstandard compression levels by recompiling itself with g++ (like "zpaq ox").
pzpaq 0.04, Feb. 4, 2011, Windows version uses native threads and no longer requires pthreads-win32.
pzpaq 0.05, Feb. 10, 2011 removes -s option, puts temporary files in $TMPDIR or %TEMP%.
bwt v1, Mar. 16, 2011. 4 BWT based configurations.
unzp 1.00, May 10, 2011, a block level parallel decompresser optimized for fast, mid, max, bwtrle1, bwt2 models with source level JIT for other models.
zp 1.01, May 12, 2011, a block level parallel compressor with 4 levels (bwtrle1, bwt2, mid, max). With unzp replaces pzpaq.
zp 1.02, May 16, 2011. Fixed -t option.
May 18, 2011. Updated zp.102.zip and unzp.100.zip with static x86-64 Linux binaries.
zp 1.03, May 26, 2011. Merges the compressor and decompresser unzp into one program.
wbpe 1.00, June 12, 2011. Dictionary preprocessor for text files.
wbpe 1.10, June 21, 2011.
zpaq 3.00, July 16, 2011. Combines features of zpaq v2.05 and zp v1.03. zp support is discontinued. Windows only.
zpaq 3.01, July 21, 2011. Adds 64 bit Linux support. Includes libzpaq 3.00.
bmp_j4a, July 21, 2011. Updated bmp_j4 .bmp configuration for zpaq v3.01.
libzpaq 3.00, July 28, 2011, from zpaq v3.01 but as a separate download.
libzpaq 4.00, Nov. 13, 2011. libzpaq.cpp, libzpaq.h, libzpaq.3.pod. Replaces source-level JIT with internal JIT for x86-32 and x86-64.
zpaq 4.00, Nov. 13, 2011. zpaq.cpp, zpaq.1.pod for use with libzpaq 4.00. Removes source generation, b and e commands and -j option.
calgarytest.zpaq, Nov. 13, 2011. Test case for ZPAQ compliance.
zpipe v2.01, Nov. 13, 2011. zpipe.exe linked to libzpaq v4.00. Source unchanged.
zpaq v4.01, Nov. 26, 2011. Source code adds incremental update and extraction.
zpaq v4.02, Nov. 28, 2011. Source code adds commands c, x output/, list hcomp/pcomp. Updated pi.cfg for this version.
libzpaq v4.01, Dec. 20, 2011. Fix for Mac OS (MAP_ANONYMOUS -> MAP_ANON).
zpaq v4.03, Dec. 21, 2011. Adds -n, -r, and -f options. Fixed bug in u (did not save filenames with no args).
lz1.zip, Dec. 29, 2011. LZ77 model.
ZPAQ level 2 standard, unzpaq200.cpp reference decoder, libzpaq 5.00 support, and calgarytest2.zpaq test case, Feb. 1, 2012. Level 2 allows the COMP section to be empty to store uncompressed (but possibly preprocessed) data to support faster compression models.
libzpaq 5.01, Feb. 2, 2012. Removed debugging code from libzpaq.cpp.
tiny_unzpaq.cpp v1.0, Mar. 21, 2012.
zpaq v4.04, Mar. 26, 2012. Fixed bug in r command that truncated output file.
zpsfx v1.01, Apr. 4, 2012. Self extractor modified by Klaus Post to create directories as needed.

ZPAQ is intended to replace PAQ and its variants (PAQ8, PAQ9A, LPAQ, LPQ1, etc) with similar or better compression in a portable, standard format. Current versions of PAQ break archive compatibility with each compression improvement. ZPAQ is intended to fix that. I no longer maintain the older PAQ code.

ZPAQ was written by Matt Mahoney, mattmahoneyfl (at) gmail (dot) com