Re: Is there anything special about pg_dump's compression? - Mailing list pgsql-sql

From Jean-David Beyer
Subject Re: Is there anything special about pg_dump's compression?
Date
Msg-id 473D92F3.3020707@verizon.net
In response to Re: Is there anything special about pg_dump's compression?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Is there anything special about pg_dump's compression?  (Shane Ambler <pgsql@Sheeky.Biz>)
List pgsql-sql
Tom Lane wrote:
> Jean-David Beyer <jeandavid8@verizon.net> writes:
>> I turned the software compression off. It took:
>> 524487428 bytes (524 MB) copied, 125.394 seconds, 4.2 MB/s
> 
>> When I let the software compression run, it uses only 30 MBytes. So whatever
>> compression it uses is very good on this kind of data.
>> 29810260 bytes (30 MB) copied, 123.145 seconds, 242 kB/s
> 
> Seems to me the conclusion is obvious: you are writing about the same
> number of bits to physical tape either way. 

I guess so. I _am_ impressed by how much compression is achieved.

> The physical tape speed is
> surely the real bottleneck here, and the fact that the total elapsed
> time is about the same both ways proves that about the same number of
> bits went onto tape both ways.

I do not get that. If the physical tape speed is the bottleneck, why is the
rate only about 242 kB/s in the software-compressed case, but 4.2 MB/s with
software compression turned off? The tape drive usually gives over 6 MB/s
when running BRU (similar to find | cpio) to back up the rest of my system
(where not all the files compress very much). Also, when doing a BRU backup,
the CPU usage is well under 100%. If I am right, the postgres server is using
100% of a CPU, while the client (pg_dump), which is the one that actually
compresses (if compression is enabled in software), uses either 40% or 12%.
> 
> The quoted MB and MB/s numbers are not too comparable because they are
> before and after compression respectively.
> 
> The software compression seems to be a percent or two better than the
> hardware's compression, but that's not enough to worry about really.

Agreed. The times for backup (and restore) are acceptable. Being new to
postgres, I am just interested in how it works from a user's point-of-view.
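
For reference, the dd figures quoted above can be put on a common footing with a little arithmetic. This is only a sketch: the byte counts and times are the ones in the dd output, the variable names are made up for illustration, and the last calculation assumes that both runs dumped the same database, so that the uncompressed size is the same in both cases.

```python
# Figures copied from the dd output quoted in this message.
BYTES_SW_OFF = 524_487_428    # bytes handed to the tape driver, software compression off
SECONDS_SW_OFF = 125.394
BYTES_SW_ON = 29_810_260      # bytes handed to the tape driver, software compression on
SECONDS_SW_ON = 123.145


def rate(nbytes, seconds):
    """Average throughput in bytes per second."""
    return nbytes / seconds


# How much smaller the software-compressed stream is.
ratio = BYTES_SW_OFF / BYTES_SW_ON

# Rates as dd sees them: one is measured before compression, one after.
rate_sw_off = rate(BYTES_SW_OFF, SECONDS_SW_OFF)
rate_sw_on = rate(BYTES_SW_ON, SECONDS_SW_ON)

# ASSUMPTION: the same dump was taken both times, so the uncompressed
# size also applies to the software-compressed run.
source_rate_sw_on = rate(BYTES_SW_OFF, SECONDS_SW_ON)

print(f"compression ratio:                {ratio:.1f} : 1")
print(f"dd rate, software compression off: {rate_sw_off / 1e6:.2f} MB/s")
print(f"dd rate, software compression on:  {rate_sw_on / 1e3:.0f} kB/s")
print(f"source-data rate, compression on:  {source_rate_sw_on / 1e6:.2f} MB/s")
```

Under that assumption the uncompressed data is consumed at roughly the same rate in both runs (a little over 4 MB/s), and the software-compressed stream is about 17.6 times smaller. The 242 kB/s and 4.2 MB/s figures differ mainly because one is measured after compression and the other before it.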

> What you should ask yourself is whether you have other uses for the main
> CPU's cycles during the time you're taking backups.  If so, offload the
> compression cycles onto the tape hardware.  If not, you might as well
> gain the one or two percent win.

Sure, I always have something to do with the excess cycles, though it is not
an obsession of mine.

But out of intellectual curiosity, why is the postgres _server_ taking 100%
of a CPU when doing a backup, when it is the postgres _client_ that is
actually running the tape drive -- especially if it is tape I/O limited?

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /()\  Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 07:40:01 up 24 days, 58 min, 0 users, load average: 4.30, 4.29, 4.21
 

