Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From ktm@rice.edu
Subject Re: [REVIEW] Re: Compression of full-page-writes
Msg-id 20140914172332.GA4429@aart.rice.edu
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sun, Sep 14, 2014 at 05:21:10PM +0200, Andres Freund wrote:
> On 2014-09-13 20:27:51 -0500, ktm@rice.edu wrote:
> 
> > Also, while I understand that CRC has a very venerable history and
> > is well studied for transmission type errors, I have been unable to find
> > any research on its applicability to validating file/block writes to a
> > disk drive.
> 
> Which incidentally doesn't really match what the CRC is used for
> here. It's used for individual WAL records. Usually these are pretty
> small, far smaller than disk/postgres' blocks on average. There's a
> couple scenarios where they can get large, true, but most of them are
> small.
> The primary reason they're important is to correctly detect the end of
> the WAL: to ensure we're not interpreting half-written records, or records
> from before the WAL file was overwritten.
> 
> 
> > While it is, to quote you, "unbeaten collision wise", xxhash,
> > in both its 32-bit and 64-bit versions, is its equal.
> 
> Aha? You take that from the smhasher results?

Yes.

> 
> > Since there seems to be a lack of research on disk based error
> > detection versus CRC polynomials, it seems likely that any of the
> > proposed hash functions are on an equal footing in this regard. As
> > Andres commented up-thread, xxhash comes along for "free" with lz4.
> 
> This is pure handwaving.

Yes. But without research to support the use of CRC32 in this same
environment, it is handwaving in the other direction. :)

Regards,
Ken
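
P.S. To make the end-of-WAL point quoted above concrete, here is a minimal
sketch of how a per-record checksum lets a reader stop at a half-written
record. The record layout (4-byte length, 4-byte CRC-32, payload) and the
names encode_record and read_valid_records are assumptions made up for this
illustration; this is not PostgreSQL's actual WAL format, and zlib's CRC-32
merely stands in for whatever checksum is chosen.

```python
import struct
import zlib

# Hypothetical toy layout, NOT PostgreSQL's record header:
# little-endian payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def read_valid_records(log: bytes) -> list:
    """Return payloads up to the first record that fails validation."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        end = start + length
        # A zero length or a record running past the buffer is treated
        # as the end of the usable log in this sketch.
        if length == 0 or end > len(log):
            break
        payload = log[start:end]
        # Checksum mismatch: torn write or leftover bytes, so stop here.
        if zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = end
    return records


# Two complete records, then a third whose header and first 10 payload
# bytes reached "disk" before a simulated crash.
good = encode_record(b"insert row 1") + encode_record(b"insert row 2")
torn = encode_record(b"insert row 3, only partly flushed")[: HEADER.size + 10]

# The rest of the 128-byte segment still holds stale bytes from earlier use.
log = good + torn
log = log + b"\xaa" * (128 - len(log))

recovered = read_valid_records(log)
```

With this input the reader should hand back only the first two payloads: the
third record's stored checksum was computed over the full payload, while the
bytes actually on disk are a mix of new and stale data. Any checksum with
good error detection on short inputs would serve the same role in this
sketch.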


