Thread: GZIP of pre-zipped output

GZIP of pre-zipped output

From: Dave Crooke

If you are really so desperate to save a couple of GB that you are resorting to -Z9 then I'd suggest using bzip2 instead.

bzip2 is designed for things like installer images where there will be massive amounts of downloads, so it uses a ton of CPU during compression, but usually less than -Z9 and makes a better result.

Cheers
Dave

On Mar 21, 2010 10:50 AM, "David Newall" <postgresql@davidnewall.com> wrote:

Tom Lane wrote:
>
> I would bet that the reason for the slow throughput is that gzip
> is fruitlessl...
Indeed, I didn't expect much reduction in size, but I also didn't expect a twenty-fold increase in run-time (i.e. output at 10MB/second going down to 500KB/second), particularly as my estimate was based on gzipping a previously gzipped file. I think it's probably pathological data, as it were. Might even be of interest to gzip's maintainers.
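
The size half of this is easy to reproduce. Here is a minimal Python sketch, assuming zlib's deflate at level 9 is a fair stand-in for gzip -9; the sample text, word list and helper name are made up for illustration and are not from the thread. A second pass over already-deflated bytes has almost nothing left to remove.

```python
import random
import zlib


def deflate_twice(data: bytes, level: int = 9) -> tuple:
    """Return (raw size, size after one deflate pass, size after two passes)."""
    once = zlib.compress(data, level)
    twice = zlib.compress(once, level)  # input here is already high-entropy
    return len(data), len(once), len(twice)


# Hypothetical sample: pseudo-random SQL-ish words, seeded for repeatability.
rng = random.Random(0)
words = [b"select", b"insert", b"update", b"delete",
         b"from", b"where", b"values", b"null"]
sample = b" ".join(rng.choice(words) for _ in range(200_000))

raw_size, once_size, twice_size = deflate_twice(sample)
print(f"raw={raw_size} once={once_size} twice={twice_size}")
```

This only looks at sizes. It says nothing about the throughput drop, which would have to be timed against the real dump.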

Re: GZIP of pre-zipped output

From: Craig Ringer
On 22/03/2010 1:04 AM, Dave Crooke wrote:
> If you are really so desperate to save a couple of GB that you are
> resorting to -Z9 then I'd suggest using bzip2 instead.
>
> bzip2 is designed for things like installer images where there will be
> massive amounts of downloads, so it uses a ton of CPU during
> compression, but usually less than -Z9 and makes a better result.

bzip2 doesn't work very well on gzip'd (deflated) data, though. For good
results, you'd want to feed it uncompressed data, which is a bit of a
pain when the compression is part of the PDF document structure and when
you otherwise want the PDFs to remain compressed.
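
A rough way to see this, sketched in Python with the standard bz2 and zlib modules. The sample text and the helper name are hypothetical, and zlib's deflate is assumed to be close enough to what sits inside a PDF stream. The sketch compares bzip2 applied to raw bytes with bzip2 applied to the same bytes after deflate.

```python
import bz2
import random
import zlib


def bzip2_sizes(data: bytes) -> dict:
    """Sizes of data raw, deflated, bzip2'd, and bzip2'd after deflate."""
    deflated = zlib.compress(data, 9)
    return {
        "raw": len(data),
        "deflated": len(deflated),
        "bzip2_of_raw": len(bz2.compress(data, 9)),
        "bzip2_of_deflated": len(bz2.compress(deflated, 9)),
    }


# Hypothetical sample text, seeded so repeated runs give the same bytes.
rng = random.Random(1)
words = [b"page", b"font", b"stream", b"endstream",
         b"obj", b"endobj", b"xref", b"trailer"]
sample = b" ".join(rng.choice(words) for _ in range(200_000))

sizes = bzip2_sizes(sample)
for name, size in sizes.items():
    print(f"{name:>18}: {size}")
```

If bzip2_of_deflated comes out about the same as deflated, bzip2 found nothing to work with in the deflated bytes, which is why the data would need to be inflated first.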

Anyway, if you're going for extreme compression, these days 7zip is
often a better option than bzip2.

--
Craig Ringer

Re: GZIP of pre-zipped output

From: Scott Marlowe
On Sun, Mar 21, 2010 at 8:46 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> On 22/03/2010 1:04 AM, Dave Crooke wrote:
>>
>> If you are really so desperate to save a couple of GB that you are
>> resorting to -Z9 then I'd suggest using bzip2 instead.
>>
>> bzip2 is designed for things like installer images where there will be
>> massive amounts of downloads, so it uses a ton of CPU during
>> compression, but usually less than -Z9 and makes a better result.
>
> bzip2 doesn't work very well on gzip'd (deflated) data, though. For good
> results, you'd want to feed it uncompressed data, which is a bit of a pain
> when the compression is part of the PDF document structure and when you
> otherwise want the PDFs to remain compressed.
>
> Anyway, if you're going for extreme compression, these days 7zip is often a
> better option than bzip2.

There's often a choice of two packages, 7z and 7za. Get 7za; it's the
later-model version.