Re: Avoiding repeated snapshot computation - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Avoiding repeated snapshot computation
Date
Msg-id CA+TgmoaKu-h=1oHFqwUw+9Qg4zkyvS8bWcW6Kiw2qxFp3JvC1w@mail.gmail.com
Whole thread Raw
In response to Re: Avoiding repeated snapshot computation  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Nov 26, 2011 at 5:49 PM, Andres Freund <andres@anarazel.de> wrote:
> On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
>> On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
>> >> I'd just as soon keep the fields in a logical order.
>> >
>> > Btw, I don't think the new order is necessarily worse than the old one.
>>
>> You forget to attach the benchmark results.
>>
>> My impression is that cache lines on modern hardware are 64 or 128
>> *bytes*, in which case this wouldn't matter a bit.
> All current x86 cpus use 64bytes. The 2nd 128bit reference was a typo. Sorry
> for that.
> And why is 72=>56 *bytes* (I even got that one right) not relevant for 64byte
> cachelines?

OK, so I got around to looking at this again today; sorry for the
delay.  I agree that 72 -> 56 bytes is very relevant for 64-byte cache
lines.  I had not realized the structure was as big as that.

> And yea. I didn't add benchmark results. I don't think I *have* to do that
> when making suggestions to somebody trying to improve something specific. I
> also currently don't have hardware where I can sensibly run at a high enough
> concurrency to see that GetSnapshotData takes ~40% of runtime.
> Additional cacheline references around synchronized access can hurt to my
> knowledge...

I tested this on Nate Boley's 32-core box today with the 32 clients
doing a select-only pgbench test.  Results of individual 5 minute
runs:

results.master.32:tps = 171701.947919 (including connections establishing)
results.master.32:tps = 227777.430112 (including connections establishing)
results.master.32:tps = 218257.478461 (including connections establishing)
results.master.32:tps = 226425.964855 (including connections establishing)
results.master.32:tps = 218687.662373 (including connections establishing)
results.master.32:tps = 219819.451605 (including connections establishing)
results.master.32:tps = 216800.131634 (including connections establishing)
results.snapshotdata-one-cacheline.32:tps = 181997.531357 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 171505.145440 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 226970.348605 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 169725.115763 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 219548.690522 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 175440.705722 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 217154.173823 (including
connections establishing)

There's no statistically significant difference between those two data
sets; if anything, the results with the patch look like they might be
worse, although I believe that's an artifact - some runs seem to
mysteriously come out in the 170-180 rangeinstead of the 215-225
range, with nothing in between.  But even if you only average the good
runs out of each set there's no material difference.

Having said that, I am coming around to the view that we should apply
this patch anyway, just because it reduces memory consumption.  Since
the size change crosses a power-of-two boundary, I believe it will
actually cut down the size of a palloc'd chunk for a SnapshotData
object from 128 bytes to 64 bytes.  I doubt we can measure the benefit
of that even on a microbenchmark unless someone has a better idea for
making PostgreSQL take ridiculous numbers of snapshots than I do.  It
still seems like a good idea, though: a penny saved is a penny earned.

With response to Tom's objections downthread, I don't think that the
new ordering is significantly more confusing than the old one.
xcnt/subxcnt/xip/subxip doesn't seem measurably less clear than
xcnt/xip/subxcnt/subxip, and we're not normally in the habit of
letting such concerns get in the way of squeezing out alignment
padding.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: xlog location arithmetic
Next
From: Greg Smith
Date:
Subject: Re: Command Triggers