Re: Quick-and-dirty compression for WAL backup blocks - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Quick-and-dirty compression for WAL backup blocks
Date
Msg-id 1117582367.3844.805.camel@localhost.localdomain
In response to Quick-and-dirty compression for WAL backup blocks  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Quick-and-dirty compression for WAL backup blocks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2005-05-31 at 16:26 -0400, Tom Lane wrote:
> The TODO item that comes to mind immediately is "Compress WAL entries".
> A more concrete version of this is: examine the page to see if the
> pd_lower field is between SizeOfPageHeaderData and BLCKSZ, and if so
> whether there is a run of consecutive zero bytes beginning at the
> pd_lower position.  Omit any such bytes from what is written to WAL.
> (This definition ensures that nothing goes wrong if the page does not
> follow the normal page layout conventions: the transformation is
> lossless no matter what, since we can always reconstruct the exact page
> contents.)  The overhead needed is only 2 bytes to show the number of
> bytes removed.
> 
> The other alternatives that were suggested included running the page
> contents through the same compressor used for TOAST, and implementing
> a general-purpose run-length compressor that could get rid of runs of
> zeroes anywhere on the page.  However, considering that the compression
> work has to be done while holding WALInsertLock, it seems to me there
> is a strong premium on speed.  I think that lets out the TOAST
> compressor, which isn't amazingly speedy.  (Another objection to the
> TOAST compressor is that it certainly won't win on already-compressed
> toasted data.)  A run-length compressor would be reasonably quick but
> I think that the omit-the-middle-hole approach gets most of the possible
> win with even less work.  In particular, I think it can be proven that
> omit-the-hole will actually require less CPU than now, since counting
> zero bytes should be strictly faster than CRC'ing bytes, and we'll be
> able to save the CRC work on whatever bytes we omit.
> 
> Any objections?

None: completely agree with your analysis. Sounds great.
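
For illustration only, the omit-the-hole idea quoted above might be sketched in C roughly as follows. This is not the actual patch. The names find_hole, pack_page, unpack_page and PAGE_HEADER_SIZE are hypothetical, the header size of 24 bytes is an assumed stand-in for SizeOfPageHeaderData, and pd_lower is passed in as a plain argument so that no page layout needs to be assumed. The sketch also passes the hole offset explicitly, whereas the proposal stores only the 2-byte count of removed bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192          /* assumed block size */
#define PAGE_HEADER_SIZE 24  /* hypothetical stand-in for SizeOfPageHeaderData */

/*
 * Locate the run of zero bytes that starts at pd_lower.  If pd_lower is
 * not a plausible value, report an empty hole so the page is kept whole.
 */
static void
find_hole(const uint8_t *page, uint16_t pd_lower,
          uint16_t *hole_offset, uint16_t *hole_length)
{
    size_t end;

    *hole_offset = 0;
    *hole_length = 0;
    if (pd_lower < PAGE_HEADER_SIZE || pd_lower >= BLCKSZ)
        return;

    end = pd_lower;
    while (end < BLCKSZ && page[end] == 0)
        end++;

    *hole_offset = pd_lower;
    *hole_length = (uint16_t) (end - pd_lower);
}

/* Copy the page into out, skipping the hole; returns the bytes written. */
static size_t
pack_page(const uint8_t *page, uint16_t pd_lower, uint8_t *out,
          uint16_t *hole_offset, uint16_t *hole_length)
{
    size_t tail;

    find_hole(page, pd_lower, hole_offset, hole_length);
    tail = BLCKSZ - ((size_t) *hole_offset + *hole_length);

    memcpy(out, page, *hole_offset);
    memcpy(out + *hole_offset,
           page + *hole_offset + *hole_length, tail);
    return *hole_offset + tail;
}

/* Rebuild the full page by re-inserting hole_length zero bytes. */
static void
unpack_page(const uint8_t *in, size_t in_len,
            uint16_t hole_offset, uint16_t hole_length, uint8_t *page)
{
    assert(in_len + hole_length == BLCKSZ);
    assert(hole_offset <= in_len);

    memcpy(page, in, hole_offset);
    memset(page + hole_offset, 0, hole_length);
    memcpy(page + hole_offset + hole_length,
           in + hole_offset, in_len - hole_offset);
}
```

Because only bytes that are verified to be zero are dropped, unpack_page reproduces the original page exactly, whatever the page contains. The omitted bytes would also not need to be fed to the CRC.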

> It seems we are more or less agreed that 32-bit CRC ought to be enough
> for WAL; and we also need to make a change to ensure that backup blocks
> are positively linked to their parent WAL record, as I noted earlier
> today.  So as long as we have to mess with the WAL record format, I was
> wondering what else we could get done in the same change.

Is this a change that would be backpatched as you suggested previously?
It seems a rather large patch to change three things at once. Can the
backpatch wait until 8.1 has gone through beta to allow the changes to
be proven?

Best Regards, Simon Riggs



