Re: Should we cacheline align PGXACT? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Should we cacheline align PGXACT?
Date
Msg-id 20160822021747.u5bqx2xwwjzac5u5@alap3.anarazel.de
In response to Re: Should we cacheline align PGXACT?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2016-08-20 14:33:13 -0400, Robert Haas wrote:
> On Aug 19, 2016, at 2:12 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
> > Hackers,
> > 
> > originally this idea was proposed by Andres Freund while experimenting with lockfree Pin/UnpinBuffer [1].
> > The patch is attached as well as results of pgbench -S on 72-cores machine.  As before it shows huge benefit in
> > this case.
> > 
> > For sure, we should validate that it doesn't cause performance regression in other cases.  At least we should test
> > read-write and smaller machines.
> > 
> > Any other ideas?
> 
> Wow, nice results.  My intuition on why PGXACT helped in the first
> place was that it minimized the number of cache lines that had to be
> touched to take a snapshot. Padding obviously would somewhat increase
> that again, so I can't quite understand why it seems to be
> helping... any idea?

I don't think it's that surprising: PGXACT->xid is written to in every
transaction, and ->xmin is often written to multiple times per
transaction. That means that if a PGXACT's cacheline is shared between
backends, one write will often first require another CPU to flush its
store buffer / L1 / L2 cache. If there are several hops between two
cores, that can mean quite a bit of added latency.  I previously played
around with *removing* the optimization of resetting ->xmin when not
required anymore - and on a bigger machine it noticeably increased
throughput at higher client counts.

To me it's pretty clear that rounding up PGXACT's size to 16 bytes
(instead of the current 12, with 4 byte alignment) is going to be a win;
the current approach just leads to pointless sharing.  Besides, storing
the database oid in there will allow GetOldestXmin() to only use PGXACT,
and could, with a bit more work, allow GetSnapshotData() to ignore other
databases.
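
To illustrate the 12 vs. 16 byte point, here is a rough sketch. The
struct names and fields are hypothetical stand-ins, not the real PGXACT
definition, and it assumes 64 byte cachelines and an array that starts
on a cacheline boundary. It counts how many entries span two
cachelines, which is one way to read "pointless sharing":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64		/* assumption: typical x86-64 line size */

/* Hypothetical stand-in for a 12 byte entry with 4 byte alignment. */
typedef struct MockXact12
{
	uint32_t	xid;
	uint32_t	xmin;
	uint8_t		flags[4];
} MockXact12;

/* Same entry plus a database oid, which rounds the size up to 16 bytes. */
typedef struct MockXact16
{
	uint32_t	xid;
	uint32_t	xmin;
	uint8_t		flags[4];
	uint32_t	database_id;
} MockXact16;

_Static_assert(sizeof(MockXact12) == 12, "expected a 12 byte entry");
_Static_assert(sizeof(MockXact16) == 16, "expected a 16 byte entry");

/*
 * Count the entries that span two cachelines in an array of nelems
 * entries, assuming the array itself starts on a cacheline boundary.
 */
static size_t
count_straddling(size_t nelems, size_t elem_size)
{
	size_t		count = 0;

	assert(elem_size > 0 && elem_size <= CACHE_LINE_SIZE);

	for (size_t i = 0; i < nelems; i++)
	{
		/* Offset of entry i within its first cacheline. */
		size_t		start = (i * elem_size) % CACHE_LINE_SIZE;

		if (start + elem_size > CACHE_LINE_SIZE)
			count++;
	}
	return count;
}
```

Under those assumptions, count_straddling(16, sizeof(MockXact12))
reports that 2 of every 16 twelve byte entries cross a cacheline
boundary, while no 16 byte entry does, since 16 divides 64 evenly.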

I'm less sure that going up to a full cacheline is always a win.
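
The tradeoff can be put in rough numbers. This is only a
back-of-the-envelope sketch: lines_touched() is a hypothetical helper,
it assumes 64 byte cachelines and a cacheline-aligned array, and it
only counts the lines a full scan over all entries (as when taking a
snapshot) would have to read. It says nothing about the cost of the
writes discussed above:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64		/* assumption: typical x86-64 line size */

/*
 * Number of cachelines covered by a contiguous array of nelems entries
 * of elem_size bytes each, assuming the array starts on a cacheline
 * boundary.
 */
static size_t
lines_touched(size_t nelems, size_t elem_size)
{
	size_t		bytes = nelems * elem_size;

	assert(elem_size > 0);

	/* Round up to whole cachelines. */
	return (bytes + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;
}
```

For 100 entries this gives 19 lines at 12 bytes, 25 at 16 bytes and 100
at a full cacheline per entry, so full padding trades away write
sharing for a several times larger scan.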

Andres


