Calgary Compression Challenge

This is a continuation (without prize money) of the Calgary Compression Challenge, a contest run by Leonid A. Broukhis from May 21, 1996 through May 21, 2016. The goal of the contest is to produce the smallest possible archive containing a program that, when run with input taken only from the other files in the archive (if any), outputs the 14-file Calgary corpus.

Leaderboard

Size    Date          Author
759881  Sep 1997      Malcolm Taylor
692154  Aug 2001      Maxim Smirnov
680558  Sep 2001      Maxim Smirnov
653720  Nov 2002      Serge Voskoboynikov
645667  Jan 10, 2004  Matt Mahoney
637116  Apr 2, 2004   Alexander Rhatushnyak
608980  Dec 31, 2004  Alexander Rhatushnyak
603416  Apr 4, 2005   Przemysław Skibiński
596314  Oct 2005      Alexander Rhatushnyak
593620  Dec 3, 2005   Alexander Rhatushnyak
589863  May 2006      Alexander Rhatushnyak
580170  Jul 2, 2010   Alexander Rhatushnyak

Rules

Submissions must improve on the previous best result by at least 1000 bytes.
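As a small illustration of this rule, the check below compares a candidate size against the most recent leaderboard entry (580170 bytes, Jul 2, 2010). The function and constant names are hypothetical and chosen only for this sketch. It reads "at least 1000 bytes" as meaning that an improvement of exactly 1000 bytes is sufficient.

```python
# Sketch of the minimum-improvement rule. Names are illustrative only.

CURRENT_BEST = 580170    # last leaderboard row (Jul 2, 2010)
MIN_IMPROVEMENT = 1000   # bytes, as stated in the rule


def qualifies(new_size: int, best: int = CURRENT_BEST) -> bool:
    """Return True if new_size beats best by at least MIN_IMPROVEMENT bytes."""
    return best - new_size >= MIN_IMPROVEMENT


def largest_qualifying_size(best: int = CURRENT_BEST) -> int:
    """Largest archive size that would still count as an improvement."""
    return best - MIN_IMPROVEMENT


if __name__ == "__main__":
    print(largest_qualifying_size())  # 579170
    print(qualifies(579500))          # False: only 670 bytes smaller
```

Under this reading, a new submission against the current leaderboard would need to be 579170 bytes or smaller.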

An archive is a file or set of files that may be processed by any of the following: unzip, bunzip2, unrar, or PPMd var. I. If a submission consists of more than one file, the size of the archive is calculated as the sum of the file sizes, plus the lengths of the file names, plus 4 bytes per file.
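The size calculation for a multi-file submission can be sketched as follows. This is only an illustration under stated assumptions: the function name and the example file names are hypothetical, file name length is counted in UTF-8 bytes, and a single-file submission is taken to be charged its file size alone, since the rule mentions the overhead only for more than one file.

```python
# Sketch of the archive size calculation. Names and inputs are illustrative.

PER_FILE_OVERHEAD = 4  # the "plus 4 per file" from the rule


def archive_size(files: dict[str, int]) -> int:
    """Compute the contest size from a mapping of file name to size in bytes."""
    if not files:
        raise ValueError("a submission needs at least one file")
    if len(files) == 1:
        # Assumption: a single file is charged its own size only.
        return next(iter(files.values()))
    # Assumption: name length is measured in UTF-8 encoded bytes.
    return sum(
        size + len(name.encode("utf-8")) + PER_FILE_OVERHEAD
        for name, size in files.items()
    )


if __name__ == "__main__":
    # Hypothetical two-file submission: a decompressor plus one data file.
    print(archive_size({"a.exe": 1000, "data.bin": 500000}))  # 501021
```

In the two-file example, the total is (1000 + 5 + 4) + (500000 + 8 + 4) = 501021, so short file names save a few bytes.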

A program is a 32- or 64-bit Linux or Windows executable, or a source program written in C, C++, or Perl. It must run to completion in 6 hours or less on a Core i7 M620 with 4 GB of memory. If the archive contains one or more other files, the program will be run once for each file, with the file name passed as a command line argument. Otherwise it will be run with no arguments. The program must not take any input other than from the file whose name is passed to it.
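The invocation rule above can be sketched as a small test harness. This is not the harness used to judge submissions; the function names are hypothetical, and the sketch assumes the 6-hour limit covers all runs together, which the rule does not spell out. It also cannot reproduce the reference hardware, so timings on another machine are only indicative.

```python
# Sketch of how a submission might be invoked. Names are illustrative only.
import subprocess
import time

TIME_LIMIT_SECONDS = 6 * 60 * 60  # "6 hours or less"


def build_invocations(program: str, data_files: list[str]) -> list[list[str]]:
    """One run per data file with its name as the argument, else one bare run."""
    if not data_files:
        return [[program]]
    return [[program, name] for name in data_files]


def run_all(program: str, data_files: list[str]) -> None:
    """Run every invocation, sharing one overall time budget (an assumption)."""
    deadline = time.monotonic() + TIME_LIMIT_SECONDS
    for argv in build_invocations(program, data_files):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("time limit exceeded before all runs finished")
        # check=True raises if the program exits with a non-zero status.
        subprocess.run(argv, check=True, timeout=remaining)


if __name__ == "__main__":
    # Hypothetical names: a decompressor and two data files.
    for argv in build_invocations("./decomp", ["part1.dat", "part2.dat"]):
        print(argv)
```

After such a run, the output files would still need to be compared byte for byte against the 14 files of the Calgary corpus.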

I reserve the right to change these rules or to reject submissions not in keeping with the spirit of the contest.

Send your submission to Matt Mahoney at mattmahoneyfl at gmail.com. If accepted, I will post it and add your name to the leaderboard.

This page last updated on May 19, 2016.