Re: Enabling Checksums - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Enabling Checksums
Date
Msg-id 513809EA.2010103@2ndQuadrant.com
In response to Re: Enabling Checksums  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 3/6/13 1:24 PM, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2013-03-06 11:21:21 -0500, Garick Hamlin wrote:
>>> If picking a CRC why not a short optimal one rather than truncate CRC32C?
>
>> CRC32C is available in hardware since SSE4.2.
>
> I think that should be at most a fourth-order consideration, since we
> are not interested solely in Intel hardware, nor do we have any portable
> way of getting at such a feature even if the hardware has it.

True, but that situation might actually improve.

The Castagnoli CRC-32C that's accelerated on the better Intel CPUs is 
also used to protect iSCSI and SCTP (a streaming protocol).  And there 
is an active project to use a CRC32C to checksum ext4 metadata blocks on 
Linux:  https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums 
https://groups.google.com/forum/?fromgroups=#!topic/linux.kernel/APKfoMzjgdY

Now, that project doesn't make the Postgres feature obsolete, because 
there's nowhere to put checksum data for every block on ext4 without 
whacking block alignment.  The filesystem can't make an extra 32 bits 
appear on every block any more than we can.  It's using a similar trick 
to the PG checksum feature, grabbing some empty space just for the 
metadata then shoving the CRC32C into there.  But the fact that this is 
going on means that there are already Linux kernel modules built with 
both software/hardware accelerated versions of the CRC32C function.  And 
the iSCSI/SCTP use cases mean it's not out of the question this will 
show up in other useful forms one day.  Maybe two years from now, there 
will be a common Linux library that autoconf can find to compute the CRC 
for us--with hardware acceleration when available, in software if not.

The first of those ext4 links above even discusses the exact sort of 
issue we're facing.  The author wonders if the easiest way to proceed 
for 16 bit checksums is to compute the CRC32C, then truncate it, simply 
because CRC32C creation is so likely to get hardware help one day.  I 
think that logic doesn't apply as strongly to the PostgreSQL case 
though, as the time before we can expect a hardware accelerated 
version to be available to us is much further off than it is for a 
Linux kernel developer.
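
To make the "compute the CRC32C, then truncate it" idea concrete, here is a 
rough software-only sketch in Python.  The function names are made up for 
illustration, and keeping the low 16 bits is just my assumption; neither 
the ext4 page nor this thread settles which half to keep.  It is not what 
any PostgreSQL patch does, only the shape of the approach.

```python
# Software CRC-32C (Castagnoli), table-driven, then truncated to 16 bits.
# Names below are hypothetical; nothing here is taken from a patch.

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes) -> int:
    """Return the 32-bit CRC-32C of data (same variant as iSCSI/SCTP)."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def checksum16(data: bytes) -> int:
    """Truncate the CRC-32C to 16 bits (assumption: keep the low half)."""
    return crc32c(data) & 0xFFFF


if __name__ == "__main__":
    page = bytes(8192)  # an all-zero 8K block, standing in for a page
    print(hex(crc32c(b"123456789")))  # standard check value: 0xe3069283
    print(hex(checksum16(page)))
```

A hardware or library version would only need to replace crc32c(); the 
truncation step stays the same either way.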

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


