bzip2 Vs pbzip2 File Compression Software: Efficiency Test

By Partho, Gaea News Network
Monday, June 7, 2010

bzip2 is an open source data compression algorithm and program. It is preferred because it compresses most files more effectively than the older LZW (.Z) and Deflate (.zip and .gz) compression algorithms. However, it is considerably slower. bzip2 compresses data in blocks of between 100 and 900 kB. It uses the Burrows-Wheeler transform to convert frequently recurring character sequences into strings of identical letters. The major shortcoming of bzip2 is the large amount of CPU time that compression requires. This can be attributed to the fact that bzip2 cannot use multiple processor cores.
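
To make the Burrows-Wheeler step more concrete, here is a minimal Python sketch of the transform and its inverse. It is only an illustration: the function names and the "$" end marker are assumptions made for this example, and the naive sort of all rotations is far slower than what a real compressor such as bzip2 does on raw bytes.

```python
def bwt(text: str, end_marker: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if end_marker in text:
        raise ValueError("end marker must not occur in the input")
    s = text + end_marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, end_marker: str = "$") -> str:
    """Rebuild the original text by repeatedly prepending and re-sorting columns."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    original = next(row for row in table if row.endswith(end_marker))
    return original[:-1]


print(bwt("banana"))               # annb$aa
print(inverse_bwt(bwt("banana")))  # banana
```

In the output "annb$aa" the repeated letters sit next to each other, which is what makes the later compression stages more effective.
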
bzip2 expects a list of file names to accompany the command-line flags. Each file is replaced by a compressed version of itself, with the .bz2 extension appended to its name. Each compressed file has the same modification date, permissions and, when possible, ownership as the corresponding original. This ensures that these properties can be restored at decompression time. The file name handling is naive in the sense that there is no mechanism for preserving original file names, permissions, ownerships or dates on filesystems that lack these concepts or have serious file name length restrictions.
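
The replace-in-place behaviour can be imitated with Python's standard bz2 module. The sketch below is not how bzip2 itself is implemented; compress_in_place is a hypothetical helper written for this example. It writes a .bz2 file next to the original, copies permissions and timestamps, and then removes the original. Ownership is not copied here, since changing it usually needs extra privileges.

```python
import bz2
import os
import shutil


def compress_in_place(path: str) -> str:
    """Compress `path` to `path + '.bz2'`, keep its metadata, delete the original."""
    target = path + ".bz2"
    with open(path, "rb") as src, bz2.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    shutil.copystat(path, target)  # copies permission bits and timestamps
    os.remove(path)
    return target
```
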

This limitation inspired pbzip2, a modified version created in 2003 that supports multi-threading. pbzip2 is a parallel version of bzip2 for shared-memory machines. It provides near-linear speedup when used on true multi-processor machines and a 5-10% speedup on Hyperthreaded machines. Files compressed with pbzip2 are fully compatible with regular bzip2 data, so any file created with pbzip2 can be uncompressed by bzip2 and vice versa.
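
The general idea behind block-parallel compression can be sketched in a few lines of Python. This is an illustration of the approach, not pbzip2's actual code: parallel_compress, BLOCK_SIZE and the worker count are assumptions made for this example. The input is cut into chunks, each chunk is compressed independently into its own bzip2 stream, and the streams are concatenated. Python's bz2.decompress (version 3.3 and later) reads such concatenated streams back as one piece of data.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 900_000  # assumed chunk size, chosen to match bzip2's largest block


def parallel_compress(data: bytes, workers: int = 2) -> bytes:
    """Compress fixed-size chunks independently and join the resulting streams."""
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [data]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        streams = list(pool.map(bz2.compress, chunks))  # map keeps chunk order
    return b"".join(streams)


sample = b"the quick brown fox jumps over the lazy dog\n" * 100_000
packed = parallel_compress(sample)
print(bz2.decompress(packed) == sample)  # True
```

Because every chunk is a complete bzip2 stream, an ordinary single-threaded decompressor can still read the result, which mirrors the compatibility described above.
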

We conducted a test to compare the timings of the two file compressors, bzip2 and pbzip2.

System configuration: Intel(R) Core(TM)2 Duo CPU E7500  @ 2.93GHz,  4GB RAM

Timings for compressing a 1GB .txt file using bzip2

time bzip2 1040955758.txt

real    2m19.142s
user    2m10.659s
sys     0m2.195s

Timings for extracting the .txt file using bzip2

real    0m55.514s
user    0m52.729s
sys     0m2.461s

Timings for compressing a 1GB .txt file using pbzip2

time pbzip2 1040955758.txt

real    1m18.978s
user    2m29.859s
sys     0m5.853s

Timings for extracting the .txt file using pbzip2

real    0m37.159s
user    1m9.508s
sys     0m2.817s
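
To put the numbers side by side, the short Python snippet below converts the "real" (wall-clock) values reported above into seconds and computes the speedup of pbzip2 over bzip2. The helper name to_seconds is an assumption made for this example; the timings are copied from the results above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '2m19.142s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" timings from the test above: (bzip2, pbzip2)
compress_speedup = to_seconds("2m19.142s") / to_seconds("1m18.978s")
extract_speedup = to_seconds("0m55.514s") / to_seconds("0m37.159s")

print(f"compression speedup: {compress_speedup:.2f}x")  # 1.76x
print(f"extraction speedup:  {extract_speedup:.2f}x")   # 1.49x
```

On this dual-core machine, pbzip2 finished compression about 1.76 times faster and extraction about 1.49 times faster than bzip2 in wall-clock terms, while its total user CPU time was somewhat higher.
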
