Re: Quick-and-dirty compression for WAL backup blocks - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Quick-and-dirty compression for WAL backup blocks
Msg-id 26740.1117899967@sss.pgh.pa.us
In response to Quick-and-dirty compression for WAL backup blocks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
>> A run-length compressor would be reasonably quick but I think that the
>> omit-the-middle-hole approach gets most of the possible win with even
>> less work.

> I can't think that a RLE scheme would be much more expensive than a 'count
> the hole' approach with more benefit, so I wouldn't like to discount this
> straight away...

RLE would require scanning the whole page with no certainty of win,
whereas count-the-hole is a certain win since you only examine bytes
that are potentially removable from the later CRC calculation.
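To make the contrast concrete, here is a minimal sketch of the omit-the-hole idea. It is not the actual PostgreSQL code: the function names, the 8192-byte page size, and the `lower`/`upper` bounds of the candidate free-space region are all assumptions made for illustration. The point it shows is that only bytes inside the candidate region are examined, and every zero byte found there is dropped from what gets logged (and from what would later be checksummed).

```python
PAGE_SIZE = 8192  # assumed page size, for illustration only


def find_hole(page, lower, upper):
    """Return (offset, length) of the run of zero bytes starting at `lower`.

    Only bytes in page[lower:upper] are examined; `lower` and `upper` are
    hypothetical bounds of the region that might be unused space.
    """
    end = lower
    while end < upper and page[end] == 0:
        end += 1
    return lower, end - lower


def strip_hole(page, lower, upper):
    """Return (offset, length, payload) with the zero run removed."""
    offset, length = find_hole(page, lower, upper)
    payload = page[:offset] + page[offset + length:]
    return offset, length, payload


def restore_page(offset, length, payload):
    """Rebuild the original page by re-inserting `length` zero bytes."""
    return payload[:offset] + bytes(length) + payload[offset:]
```

A run-length compressor, by contrast, would have to walk all `PAGE_SIZE` bytes before knowing whether it saved anything. In this sketch the scan stops at the first nonzero byte, so the work done is bounded by the number of bytes actually removed plus one.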

> If you do manage to go ahead with the code, I'd be very interested to see
> some comparisons in bytes written to XLog for old and new approaches for
> some inserts/updates. Perhaps we could ask Mark to run another TPC benchmark
> at OSDL when this and the CRC changes have been completed.

I've completed a test run for this (it's essentially MySQL's sql-bench
done immediately after initdb).  What I get is:

CVS tip of 6/1: ending WAL offset = 0/A364A780 = 2741282688 bytes written

CVS tip of 6/2: ending WAL offset = 0/8BB091DC = 2343604700 bytes written

or about a 14.5% savings.  This is with a checkpoint_segments setting of 30.
One can presume that the savings would be larger at smaller checkpoint
intervals and smaller at larger intervals, but I didn't try more than
one set of test conditions.
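The byte counts and the savings figure can be rechecked from the two WAL offsets quoted above. The helper below is a hypothetical convenience, not a PostgreSQL function, and it assumes the part before the slash is zero (true for both offsets here), so the hexadecimal part after the slash is the byte count directly.

```python
def wal_bytes(offset):
    """Convert an 'X/Y' WAL offset string to a byte count.

    Assumption: the high part X is zero, so Y (hex) is the number of
    bytes written. Other values are rejected rather than guessed at.
    """
    high, low = offset.split("/")
    if int(high, 16) != 0:
        raise ValueError("this sketch only handles offsets with a zero high part")
    return int(low, 16)


before = wal_bytes("0/A364A780")  # CVS tip of 6/1
after = wal_bytes("0/8BB091DC")   # CVS tip of 6/2
savings = (before - after) / before

print(before, after, f"{savings:.1%}")
```

This prints `2741282688 2343604700 14.5%`, which agrees with the figures reported for the two runs.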

I'd say that's an improvement worth having, especially considering that
it requires no net expenditure of CPU time.  But the door is certainly
still open to discussing more complicated approaches.
        regards, tom lane

