This software has probably been around for quite a long time, but I didn't know it existed until now: pbzip2.
It's basically an implementation of the bzip2 algorithm with pthreads support. This means that, in an increasingly SMP world, you can greatly improve your bzipping performance (in the best case, divide the zipping time by the number of cores you have, et voilà!).
Compression syntax is totally compatible:
$ pbzip2 big.file
while to decompress you run
$ pbzip2 -d big.file.bz2
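If you want to convince yourself that the output really is ordinary bzip2 data, a quick round trip is enough. The sketch below assumes both pbzip2 and bzip2 are installed; `roundtrip_check` is a name I made up for illustration, not something that ships with either tool.

```shell
# Hypothetical helper: compress with pbzip2, decompress with plain bzip2,
# and compare the result byte for byte against the original file.
roundtrip_check() {
    src=$1
    tmp=$(mktemp -d) || return 1
    # -c writes the compressed stream to stdout, so the source file stays in place
    pbzip2 -c "$src" > "$tmp/out.bz2" &&
        bzip2 -dc "$tmp/out.bz2" | cmp -s - "$src"
    status=$?
    rm -rf "$tmp"
    return "$status"
}
```

Running `roundtrip_check big.file && echo identical` should print `identical` if the two tools agree.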
Use with caution (or with the -l and -p switches, which limit how many processors it uses), because you can easily saturate your 4 x six-core monster.
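As a rough sketch of what "with caution" could look like, the wrapper below hands pbzip2 only half of the online cores through -p. The function names `pick_threads` and `polite_pbzip2` and the "half the cores" policy are my own assumptions for illustration; pick whatever limit suits your box.

```shell
# Hypothetical helper: choose a thread count that leaves some cores free.
pick_threads() {
    total=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    n=$(( total / 2 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Hypothetical wrapper: run pbzip2 with the reduced thread count.
polite_pbzip2() {
    # -p<N> sets the number of processors (no space after -p); -k keeps the input file
    pbzip2 -k -p"$(pick_threads)" "$@"
}
```

Usage would then be `polite_pbzip2 big.file`. The -l switch is an alternative that lets the current load average decide how many processors get used.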
4 thoughts on “pbzip2: parallel bzipping”
I love pbzip2, been using it for a couple of years now, after stumbling across it randomly. Even on dual-core it can nicely shift the bottleneck away from the CPU and plonk it firmly in the direction of disk IO. It's in the Ubuntu repositories; I haven't checked whether it's in any other distro's repo.
Utilizing /dev/shm to remove the disk bottleneck and using a 559 MB mysqldump output, on a Core 2 Duo box:
It’s in Debian repos as well and on the author’s page there are precompiled packages for almost any distro out there. I did some benchmarking as well on my dual core desktop machine and the results are similar to yours.
Also check out lbzip2: http://lacos.hu/
Yes, I found pbzip2 first, then came across lbzip2. Of the two, I prefer lbzip2. I did some benchmarks for bzip2 vs pbzip2 vs lbzip2 at http://vbtechsupport.com/1614/