Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)
Msg-id 541815B0.2050006@vmware.com
In response to Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)  (Andres Freund <andres@2ndquadrant.com>)
Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 09/16/2014 01:28 PM, Andres Freund wrote:
> On 2014-09-16 15:43:06 +0530, Amit Kapila wrote:
>> On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas <hlinnakangas@vmware.com>
>> wrote:
>>> On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:
>>>> At 2014-09-12 22:38:01 +0300, hlinnakangas@vmware.com wrote:
>>>>> We probably should consider switching to a faster CRC algorithm again,
>>>>> regardless of what we do with compression.
>>>>
>>>> As it happens, I'm already working on resurrecting a patch that Andres
>>>> posted in 2010 to switch to zlib's faster CRC implementation.
>>>
>>> As it happens, I also wrote an implementation of Slice-by-4 the other day
>>> :-). Haven't gotten around to post it, but here it is.
>>
>> If we use this implementation for everything that uses the
>> COMP_CRC32() macro, won't it cause problems for databases created by
>> older versions?  I created a database with HEAD code and then
>> tried to start the server after applying this patch; it gives the
>> error below:
>> FATAL:  incorrect checksum in control file
>
> That's indicative of a bug. This really shouldn't cause such problems -
> at least my version was compatible with the current definition, and IIRC
> Heikki's should be the same in theory. If I read it right.
>
>> In general, the idea sounds quite promising.  To see how it performs
>> on small to medium size data, I have used the attached test, which was
>> written by you (with some additional tests) during performance testing
>> of the WAL reduction patch in 9.4.
>
> Yes, we should really do this.
>
>> The patched version gives better results in all cases
>> (in the range of 10~15%).  This is not a perfect test, but
>> it gives a fair idea that the patch is quite promising.  I think to test
>> the benefit to CRC calculation for full pages, we can add a
>> checkpoint during each test (maybe after the insert).  Let me know
>> what other kinds of tests you think are required to see the
>> gain/loss from this patch.
>
> I actually think we don't really need this. It's pretty evident that
> slice-by-4 is a clear improvement.
>
>> I think the main difference between this patch and what Andres
>> developed some time back is the code for a manually unrolled loop
>> doing 32 bytes at once, so once Andres or Abhijit posts an
>> updated version, we can do some performance tests to see
>> if there is any additional gain.
>
> If Heikki's version works I see little need to use my/Abhijit's
> patch. That version has part of it under the zlib license. If Heikki's
> version is a 'clean room', then I'd say we go with it. It looks really
> quite similar though... We can make minor changes like additional
> unrolling without problems later on.

I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as a
reference - you can probably see the similarity. Any implementation is
going to look more or less the same, though; there aren't that many ways 
to write the implementation.

- Heikki
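
For readers who have not seen the technique, here is a minimal sketch of
slicing-by-4 next to the plain byte-at-a-time loop. It is not the patch
discussed in this thread and not the COMP_CRC32() macro: it assumes the
common reflected CRC-32 polynomial 0xEDB88320, and every name in it
(crc_table, crc32_init_tables, crc32_bytewise, crc32_slice4) is
hypothetical. The point it illustrates is the compatibility expectation
raised above: both loops should return the same checksum for the same
input, so a mismatch would indicate a bug rather than a format change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: standard reflected CRC-32 polynomial, not PostgreSQL's macro. */
#define CRC32_POLY 0xEDB88320u

/* crc_table[0] is the classic byte-wise table; [1..3] are the extra slices. */
static uint32_t crc_table[4][256];

static void crc32_init_tables(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 8; bit++)
            c = (c >> 1) ^ ((c & 1) ? CRC32_POLY : 0);
        crc_table[0][i] = c;
    }
    /* Each further slice advances the previous slice's entry by one zero byte. */
    for (uint32_t i = 0; i < 256; i++) {
        for (int t = 1; t < 4; t++) {
            uint32_t prev = crc_table[t - 1][i];
            crc_table[t][i] = (prev >> 8) ^ crc_table[0][prev & 0xFF];
        }
    }
}

/* Reference loop: one table lookup per input byte. */
static uint32_t crc32_bytewise(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}

/* Slicing-by-4: consume four bytes per iteration with four lookups. */
static uint32_t crc32_slice4(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len >= 4) {
        /* Build the word byte by byte so alignment and endianness don't matter. */
        crc ^= (uint32_t) p[0]
             | ((uint32_t) p[1] << 8)
             | ((uint32_t) p[2] << 16)
             | ((uint32_t) p[3] << 24);
        crc = crc_table[3][crc & 0xFF]
            ^ crc_table[2][(crc >> 8) & 0xFF]
            ^ crc_table[1][(crc >> 16) & 0xFF]
            ^ crc_table[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    /* Remaining 0-3 bytes go through the ordinary byte-wise step. */
    while (len--)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}
```

To use it, call crc32_init_tables() once and then compare
crc32_bytewise(buf, len) with crc32_slice4(buf, len) over a range of
lengths, including ones that are not a multiple of four. The sketch
trades speed for portability by assembling each word from bytes; a
tuned version would typically read aligned words directly.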