Home > mailing lists

Re: Cost of XLogInsert CRC calculations - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Cost of XLogInsert CRC calculations
Date	May 10, 2005 18:49:59
Msg-id	1115761565.3830.196.camel@localhost.localdomain Whole thread Raw
In response to	Re: Cost of XLogInsert CRC calculations (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses	Re: Cost of XLogInsert CRC calculations
List	pgsql-hackers

Tree view

On Tue, 2005-05-10 at 10:34 -0400, Bruce Momjian wrote:
> Tom Lane wrote:
> > "Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> > > I was just researching some articles on compression (zlib) and I saw mention
> > > of the Adler-32 algorithm which is supposed to be slightly less accurate
> > > than an equivalent CRC calculation but significantly faster to compute. I
> > > haven't located a good paper comparing the error rates of the two different
> > > checksums,
> > 
> > ... probably because there isn't one.  With all due respect to the Zip
> > guys, I doubt anyone has done anywhere near the analysis on Adler-32
> > that has been done on CRCs.  I'd much prefer to stick with true CRC
> > and drop it to 32 bits than go with a less-tested algorithm.  Throwing
> > more bits at the problem doesn't necessarily create a safer checksum.
> 
> Agreed.  64-bit was overkill when we added it, and it is now shown to be
> a performance problem.

Hold on... Tom has shown that there is a performance problem with the
existing CRC calculation. Thanks to Mark for staying on top of that with
some good ideas. 

The cause of the performance problem has been attributed to it being a
64-bit rather than 32-bit calculation. That is certainly part of it, but
I have seen evidence that there is an Intel processor stall associated
with the use of a single byte constant somewhere in the algorithm. So
I'm unclear as to what extent the poor performance is attributable to
either issue.

That's where my experience stops so I have highlighted that for somebody
with more hardware specific assembler experience to have a look at the
algorithm. Fixing that, if possible, could greatly improve the
performance whether or not we drop from 64 to 32 bits. My hope for
outside assistance on that looks like it is not now forthcoming.

My guess would be that the algorithm's use of the byte-by-byte
calculation together with a bitmask of &FF is responsible. Perhaps
varying the length of the bitmask to either &000000FF or longer may
help? (see src/include/xlog_internal.h)

Best Regards, Simon Riggs

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 10 May 2005, 18:18:56
Subject: Re: [PATCHES] Cleaning up unreferenced table files

From: Simon Riggs
Date: 10 May 2005, 19:13:18
Subject: Re: Table Partitioning, Part 1

Re: Cost of XLogInsert CRC calculations - Mailing list pgsql-hackers

Previous

Next