Re: Cost of XLogInsert CRC calculations - Mailing list pgsql-hackers

From Mark Cave-Ayland
Subject Re: Cost of XLogInsert CRC calculations
Date
Msg-id 9EB50F1A91413F4FA63019487FCD251D113371@WEBBASEDDC.webbasedltd.local
In response to Re: Cost of XLogInsert CRC calculations  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Cost of XLogInsert CRC calculations  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Cost of XLogInsert CRC calculations  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 27 May 2005 17:49
> To: Mark Cave-Ayland (External)
> Cc: 'Manfred Koizar'; 'Greg Stark'; 'Bruce Momjian';
> pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Cost of XLogInsert CRC calculations

(cut)

> I went back and looked at the code, and see that I was misled by
> terminology: what we've been calling "2x32" in this thread is
> not two independent CRC32 calculations, it is use of 32-bit
> arithmetic to execute one CRC64 calculation.

Yeah, I did find the terminology a little confusing until I looked at the
source itself. It doesn't make much sense publishing numbers if you don't
know their meaning ;)
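
To make the terminology concrete, here is a minimal sketch of one CRC64 computed twice: once with a native 64-bit register and once with the register held as two 32-bit halves, which is what "2x32" means in this thread. It is bitwise rather than table-driven for brevity, and the polynomial (ECMA-182) and the function names are assumptions chosen for illustration, not taken from the PostgreSQL source.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed polynomial (ECMA-182), split into high and low 32-bit words. */
#define POLY_HI 0x42F0E1EBu
#define POLY_LO 0xA9EA3693u

/* One CRC64 computed with native 64-bit arithmetic (MSB first, init 0). */
uint64_t crc64_native(const unsigned char *p, size_t n)
{
    const uint64_t poly = ((uint64_t) POLY_HI << 32) | POLY_LO;
    uint64_t crc = 0;

    while (n--) {
        crc ^= (uint64_t) *p++ << 56;          /* feed byte into the top */
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000000000000000ULL) ? (crc << 1) ^ poly
                                                : crc << 1;
    }
    return crc;
}

/* The same CRC64, with the register kept as two 32-bit halves ("2x32"). */
uint64_t crc64_2x32(const unsigned char *p, size_t n)
{
    uint32_t hi = 0, lo = 0;

    while (n--) {
        hi ^= (uint32_t) *p++ << 24;           /* top byte lives in hi */
        for (int i = 0; i < 8; i++) {
            uint32_t top = hi & 0x80000000u;   /* bit shifted out */
            hi = (hi << 1) | (lo >> 31);       /* carry lo's MSB into hi */
            lo <<= 1;
            if (top) {
                hi ^= POLY_HI;
                lo ^= POLY_LO;
            }
        }
    }
    return ((uint64_t) hi << 32) | lo;
}
```

Both functions return the same value for any input, so the error-detection properties are identical; only the arithmetic used to get there differs.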

> Based on the numbers we've seen so far, one could argue for
> staying with the 64-bit CRC, but changing the rule we use for
> selecting which implementation code to use: use the true
> 64-bit code only when sizeof(unsigned long) == 64, and
> otherwise use the 2x32 code, even if there is a 64-bit
> unsigned long long type available.  This essentially assumes
> that the unsigned long long type isn't very efficient, which
> isn't too unreasonable.  This would buy most of the speedup
> without giving up anything at all in the error-detection department.

All our servers are x86 based Linux with gcc, so if a factor of 2 speedup
for CPU calculations is the minimum improvement that we get as a result of
this thread then I would be very happy.
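
The selection rule quoted above could be sketched at compile time roughly as follows. The macro `CRC64_USE_NATIVE` and the function `crc64_impl_name` are hypothetical names for illustration, not PostgreSQL's actual configure symbols, and the test is on the width of unsigned long in bits (sizeof itself cannot be used in a preprocessor condition).

```c
#include <limits.h>

/*
 * Hypothetical selection macro: use the true 64-bit code only when
 * unsigned long itself is 64 bits wide; otherwise use the 2x32 code,
 * even if a 64-bit unsigned long long exists.
 */
#if ULONG_MAX == 0xFFFFFFFFFFFFFFFFULL
#define CRC64_USE_NATIVE 1
#else
#define CRC64_USE_NATIVE 0
#endif

/* Reports which implementation the rule above would pick on this build. */
const char *crc64_impl_name(void)
{
    return CRC64_USE_NATIVE ? "native 64-bit" : "2x32";
}
```

On a typical 32-bit x86 Linux build with gcc, unsigned long is 32 bits, so this rule would pick the 2x32 code there.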

> Alternatively, we might say that 64-bit CRC was overkill from
> day one, and we'd rather get the additional 10% or 20% or so
> speedup.  I'm kinda leaning in that direction, but only weakly.

What would you need to persuade you either way? I believe that disk drives
use CRCs internally to verify that the data has been read correctly from
each sector. If most errors come from disk failures, then a corrupt sector
would have to pass the drive CRC *and* the PostgreSQL CRC in order for an
XLog entry to be considered valid. I would have thought the chances of that
happening are reasonably small, so even CRC32 should detect this kind of
corruption fairly reliably.
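
As a rough worked estimate of this argument (an idealisation, assuming a corrupted sector looks like random data to each check and that the two checks are independent; $w$ is a hypothetical symbol for the width in bits of the drive's own check, which is not given in this thread):

```latex
P_{\text{undetected}} \approx 2^{-w} \cdot 2^{-32} \quad \text{(drive check and CRC32)}
\qquad
P_{\text{undetected}} \approx 2^{-w} \cdot 2^{-64} \quad \text{(drive check and CRC64)}
```

Under those assumptions the drive's check has to be fooled first in both cases, and the CRC32 term alone already contributes a factor of about 1 in 4.3 billion.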

In the case of an OS crash, we could argue that a sector may have been only
partially written to disk. In that case, again, one or both of the drive CRC
and the PostgreSQL CRC would be incorrect, so this condition could also be
reasonably detected using CRC32.

As far as I can tell, the main impact of moving to CRC32 would be a reduced
ability to detect multiple random bit errors, which is more the type of
error I would expect to occur in RAM (alpha particles etc.). How
often would this be likely to occur? I believe that different generator
polynomials have different characteristics that can make them more sensitive
to a particular type of error. Perhaps Manfred can tell us the generator
polynomial that was used to create the lookup tables?
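
For reference, this is a minimal sketch of how a byte-wise lookup table is derived from a generator polynomial, using CRC-32 as the example. The polynomial passed in the usage note below (0xEDB88320, the bit-reflected form of the common IEEE 802.3 polynomial) is an assumption for illustration; it is not necessarily the one behind the PostgreSQL tables, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];

/* Fill the 256-entry table from a bit-reflected generator polynomial. */
void crc32_build_table(uint32_t poly)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t r = i;

        /* Divide the byte value by the polynomial, one bit at a time. */
        for (int bit = 0; bit < 8; bit++)
            r = (r & 1u) ? (r >> 1) ^ poly : r >> 1;
        crc32_table[i] = r;
    }
}

/* Table-driven CRC-32 (reflected, init and final XOR of 0xFFFFFFFF). */
uint32_t crc32_compute(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (n--)
        crc = crc32_table[(crc ^ *p++) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

After calling `crc32_build_table(0xEDB88320u)`, `crc32_compute` over the nine ASCII bytes "123456789" gives 0xCBF43926, the standard check value for this polynomial. A different generator polynomial yields a different table from the same loop, which is why knowing the polynomial matters when comparing error-detection characteristics.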


Kind regards,

Mark.

------------------------
WebBased Ltd
South West Technology Centre
Tamar Science Park
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023
W: http://www.webbased.co.uk



