On 2015-06-10 13:52:14 -0400, Robert Haas wrote:
> On Wed, Jun 10, 2015 at 1:39 PM, Andres Freund <andres@anarazel.de> wrote:
> > Well, not necessarily. If you can write your algorithm in a way that
> > xadd etc are used, instead of a lock cmpxchg, you're actually never
> > spinning on x86 as it's guaranteed to succeed. I think that's possible
> > for most of the places that currently lock buffer headers.
>
> Well, it will succeed by looping inside the instruction, I suppose. But OK.

On x86, atomic ops hold the bus lock for a short while - that's why
they're that expensive - and while holding it you can either directly do
useful work (xadd) or just return after reading the current value if
there was a concurrent change (cmpxchg). Afaik that's a more fundamental
difference than one variant just doing the retry in microcode. It's hard
to get definitive answers on that.

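To make the distinction concrete, here is a minimal sketch using plain
C11 atomics rather than our own atomics API. The struct and function
names (demo_hdr, demo_pin_xadd, demo_pin_cas) are made up for
illustration and are not the real buffer header layout. On x86 the
fetch-add is typically compiled to a lock xadd, which always succeeds,
while the compare-exchange version needs a retry loop in software:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a header word holding a pin count. */
typedef struct
{
    atomic_uint_fast32_t state;
} demo_hdr;

/*
 * Unconditional increment. Usually becomes "lock xadd" on x86: the
 * operation cannot fail, so there is no loop in the C code.
 * Returns the new count.
 */
static inline uint32_t
demo_pin_xadd(demo_hdr *h)
{
    return (uint32_t) atomic_fetch_add(&h->state, 1) + 1;
}

/*
 * The same increment written with compare-exchange. If another backend
 * changed the word in between, the call fails, stores the current value
 * in "old", and we have to retry ourselves.
 * Returns the new count.
 */
static inline uint32_t
demo_pin_cas(demo_hdr *h)
{
    uint_fast32_t old = atomic_load(&h->state);

    while (!atomic_compare_exchange_weak(&h->state, &old, old + 1))
    {
        /* "old" was refreshed by the failed exchange; just try again */
    }
    return (uint32_t) old + 1;
}
```

The cmpxchg form is only needed when the new value depends on a
condition checked against the old one (e.g. "only if not locked"); a
plain counter bump doesn't need it.
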
> > (I had a version of the lwlock changes that used xadd for shared lock
> > acquisition - but the code needed to back out in error cases made things
> > more complicated, and the benefit on a four socket machine wasn't that
> > large)
>
> Now that we (EnterpriseDB) have this 8-socket machine, maybe we could
> try your patch there, bound to varying numbers of sockets.

It'd be a significant amount of work to rebase it on top of current
HEAD. I guess the easiest thing would be to try an older version of the
patch with the xadd in place, and use a tree from back then.

Greetings,
Andres Freund