Re: pg_dump far too slow - Mailing list pgsql-performance

From Craig Ringer
Subject Re: pg_dump far too slow
Msg-id 4BA62596.8050604@postnewspapers.com.au
In response to Re: pg_dump far too slow  (David Newall <postgresql@davidnewall.com>)
Responses Re: pg_dump far too slow  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On 21/03/2010 9:17 PM, David Newall wrote:
> Thanks for all of the suggestions, guys, which gave me some pointers on
> new directions to look, and I learned some interesting things.
>

> Unfortunately one of these processes dropped eventually, and, according
> to top, the only non-idle process running was gzip (100%). Obviously
> there were postgres and pg_dump processes, too, but they were throttled
> by gzip's rate of output and effectively idle (less than 1% CPU). That
> is also interesting. The final output from gzip was being produced at
> the rate of about 0.5MB/second, which seems almost unbelievably slow.

CPU isn't the only measure of interest here.

If pg_dump and the postgres backend it's using are doing simple work
such as reading linear data from disk, they won't show much CPU activity
even though they might be running full-tilt. They'll be limited by disk
I/O or other non-CPU resources.
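
A rough way to see this distinction is to compare CPU time with
wall-clock time for a piece of work. The sketch below is illustrative
only: cpu_utilisation is a hypothetical helper, and time.sleep merely
stands in for a process that is blocked waiting on disk or a pipe. It
says nothing about how pg_dump itself is implemented.

```python
import time

def cpu_utilisation(work) -> float:
    """Run work() and return CPU seconds used / wall-clock seconds elapsed."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used

def blocked_on_io():
    # Assumption: sleeping stands in for waiting on a slow disk or pipe.
    time.sleep(0.2)

def cpu_bound():
    # Pure computation with no waiting.
    sum(i * i for i in range(2_000_000))

if __name__ == "__main__":
    print(f"blocked: {cpu_utilisation(blocked_on_io):.0%} CPU")
    print(f"compute: {cpu_utilisation(cpu_bound):.0%} CPU")
```

The blocked case reports close to 0% CPU even though it spends its whole
run waiting, which is why a low figure in top does not by itself mean a
process has nothing to do.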

> and wonder if I should read up on gzip to find why it would work so
> slowly on a pure text stream, albeit a representation of PDF which
> intrinsically is fairly compressed.

In fact, PDF uses deflate compression, the same algorithm used by gzip.
Gzip-compressing PDF is almost completely pointless - all you're doing
is compressing some of the document structure, not the actual content
streams. With PDF 1.5 and above using object and xref streams, you might
not even be doing that, instead only compressing the header and trailer
dictionary, which are probably in the order of a few hundred bytes.

Compressing PDF documents is generally a waste of time.
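
As a small illustration of why, the sketch below gzips some plain text
and then gzips data that has already been through deflate. The sample
text, vocabulary and helper names are made up for the example, and the
zlib-compressed buffer is only a stand-in for a deflate-encoded PDF
content stream, not a real PDF.

```python
import gzip
import random
import zlib

def gzip_ratio(data: bytes) -> float:
    """Return gzip-compressed size / original size (lower means better compression)."""
    return len(gzip.compress(data)) / len(data)

def sample_text(n_words: int = 50_000, seed: int = 0) -> bytes:
    """Build pseudo-random plain text from a small vocabulary (illustrative data only)."""
    words = [b"select", b"insert", b"update", b"delete", b"table", b"index",
             b"vacuum", b"analyze", b"commit", b"rollback", b"cursor", b"schema",
             b"trigger", b"sequence", b"backup", b"restore"]
    rng = random.Random(seed)
    return b" ".join(rng.choice(words) for _ in range(n_words))

plain = sample_text()
# Assumption: deflated bytes stand in for an already-compressed PDF stream.
deflated = zlib.compress(plain)

if __name__ == "__main__":
    print(f"plain text:       ratio {gzip_ratio(plain):.2f}")
    print(f"already deflated: ratio {gzip_ratio(deflated):.2f}")
```

The first ratio should come out well below 1, while the second should
sit at roughly 1: the second pass burns CPU without saving any
meaningful space.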

--
Craig Ringer
