Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening? - Mailing list pgsql-novice

From Kevin Grittner
Subject Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?
Date
Msg-id 20121126131545.127720@gmx.com
Whole thread Raw
Responses Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?  (Gavan Schneider <pg-gts@snkmail.com>)
List pgsql-novice
Gavan Schneider wrote:
> On Sunday, November 25, 2012 at 04:27, Kevin Grittner wrote:
>
>> You might want to consider grabbing the source for
>> pg_clearxlogtail from pgfoundry and piping through that followed
>> by gzip in your archive script. An "empty" WAL file tends to be
>> about 16KB that way, rather than (for us) ranging between 4MB
>> and 8MB. That way you can keep the archive_timeout fairly short.

>   240 -rw-------+ 1 postgres _postgres  120852 Nov 26 14:56 0000000100000001000001C0.bz2
>  3096 -rw-------+ 1 postgres _postgres 1583557 Nov 26 15:06 0000000100000001000001C1.bz2
>     8 -rw-------+ 1 postgres _postgres     161 Nov 26 15:26 0000000100000001000001C2.bz2
>     8 -rw-------+ 1 postgres _postgres     158 Nov 26 15:26 0000000100000001000001C3.bz2

> PostgreSQL server idle throughout, and spitting out a WAL file
> for archive every 10 minutes

> Impressive what happens when pg_clearxlogtail gets into the mix
> halfway through! :)

I think I can spot where you changed the script.  :-)

Impressive that bzip2 does two orders of magnitude better than gzip
with this. In my benchmarks I found that piping WAL files through
pg_clearxlogtail followed by gzip was consistently faster than just
gzip -- even when the file was full. I put it down to a difference
in how buffering was handled. I'll have to check the performance of
bzip2 compared to gzip.

> Would the universe as we know it be upset if there was a
> postgresql.conf option such as:
>
>  archive_zero_fill_on_segment_switch = on|off # default off
>
> This would achieve the idle compression result much more
> elegantly (I know, it's still a hack) for those who have the
> need, without, as far as I can tell, breaking anything else.

The problem is that this would involve 16MB of writes to the OS for
every WAL-file switch, which might even result in that much actual
disk activity in many environments. The filter I wrote doesn't add
any disk I/O and (as I mentioned above), actually decreased CPU
time, at least in my benchmarks on SLES 10. I guess in environments
where the hit wouldn't be noticed and the WAL files were being
compressed it might be convenient, but the difference between that
and piping through the extra filter in the archive script is
minimal enough to make it questionable whether it is worth it.

-Kevin


pgsql-novice by date:

Previous
From: Gavan Schneider
Date:
Subject: Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?
Next
From: Ennio-Sr
Date:
Subject: Re: Postgresql-8.4: File System Level Backup (& recovery failure)