Re: What exactly is our CRC algorithm? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: What exactly is our CRC algorithm?
Date
Msg-id 546F1B9E.2070908@vmware.com
In response to Re: What exactly is our CRC algorithm?  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Responses Re: What exactly is our CRC algorithm?  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 11/21/2014 12:11 PM, Abhijit Menon-Sen wrote:
> At 2014-11-20 13:47:00 +0530, ams@2ndQuadrant.com wrote:
>>
>>> Suggestions for how to address (b) are welcome.
>
> With help from Andres, I set up a workload where XLogInsert* was at the
> top of my profiles: server with fsync and synchronous_commit off, and
> pgbench running a multiple-row insert into a single-text-column table
> with -M prepared -c 25 -t 250000 -f script.

How wide is the row, in bytes? You should see a bigger improvement with
wider rows, because they give you longer contiguous chunks of data to
CRC. With very narrow rows you might not see much difference, because
the chunks are so small.

Did you run these tests with a fresh checkout, including the WAL format 
patch I committed yesterday? That would make some difference to how many 
XLogRecData chunks there are for each insertion record.

If small chunks are the problem, it might be beneficial to memcpy() all
the data to a temporary buffer and calculate the CRC over the whole
thing, instead of CRC'ing each XLogRecData chunk separately.
XLogRecordAssemble() already uses a scratch area, hdr_scratch, for
building all the headers. You could check how much rmgr-specific data
there is and, if there isn't much, just append the data to that scratch
area too.
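
To make the suggestion concrete, here is a minimal standalone sketch, not
PostgreSQL code. The Chunk struct, crc32_update(), crc_per_chunk() and
crc_via_scratch() are hypothetical names for illustration, and the plain
bitwise CRC-32 (reflected polynomial 0xEDB88320) is only a stand-in for
whatever CRC implementation is actually being benchmarked. It assumes the
caller supplies the scratch buffer, and it falls back to per-chunk CRC
when the data does not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chain of data chunks; only loosely modeled on XLogRecData. */
typedef struct Chunk
{
    const void   *data;
    size_t        len;
    struct Chunk *next;
} Chunk;

/* Bitwise CRC-32 update (illustrative stand-in; slow but simple). */
uint32_t
crc32_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0xEDB88320u & ((uint32_t) 0 - (crc & 1u)));
    }
    return crc;
}

/* Baseline: one CRC call per chunk, however small the chunk is. */
uint32_t
crc_per_chunk(const Chunk *head)
{
    uint32_t    crc = 0xFFFFFFFFu;

    for (const Chunk *c = head; c != NULL; c = c->next)
        crc = crc32_update(crc, c->data, c->len);
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Alternative: if all the data fits in the scratch buffer, copy it there
 * and run the CRC once over the contiguous copy. Otherwise fall back to
 * the per-chunk loop.
 */
uint32_t
crc_via_scratch(const Chunk *head, unsigned char *scratch, size_t scratch_size)
{
    size_t      total = 0;
    size_t      off = 0;

    for (const Chunk *c = head; c != NULL; c = c->next)
        total += c->len;

    if (total > scratch_size)
        return crc_per_chunk(head);

    for (const Chunk *c = head; c != NULL; c = c->next)
    {
        memcpy(scratch + off, c->data, c->len);
        off += c->len;
    }
    assert(off == total);

    return crc32_update(0xFFFFFFFFu, scratch, total) ^ 0xFFFFFFFFu;
}
```

Both functions return the same value for the same chain. For the three
chunks "123", "456" and "789" that is 0xCBF43926, the standard CRC-32
check value. Whether the extra memcpy() pays for itself depends on the
chunk sizes and on the CRC implementation, so it would need measuring.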

> Unfortunately I can't see much difference despite running things with
> slightly different parameters a few dozen times. For example, here are
> real/user/sys times from three runs each with HEAD and slice-by-8 on an
> otherwise-idle i7-3770 server with a couple of undistinguished Toshiba
> 7200rpm SATA disks in RAID-1:
>
> HEAD:
>      2m24.822s/0m18.776s/0m23.156s
>      3m34.586s/0m18.784s/0m24.324s
>      3m41.557s/0m18.640s/0m23.920s
>
> Slice-by-8:
>      2m26.977s/0m18.420s/0m22.884s
>      3m36.664s/0m18.376s/0m24.232s
>      3m43.930s/0m18.580s/0m24.560s
>
> I don't know how to interpret these results (especially the tendency for
> the tests to slow down as time passes, with every version).

The user/sys times are very stable, but the real time is much higher and
increases from run to run. Does that mean it's blocked on I/O?

- Heikki



