Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA
Date:
Msg-id: 20171019164636.zfjm242caowascem@alap3.anarazel.de
In response to: Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA (Sokolov Yura <funny.falcon@postgrespro.ru>)
Responses: Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA
List: pgsql-hackers
On 2017-10-19 14:36:56 +0300, Sokolov Yura wrote:
> > > +    init_local_spin_delay(&delayStatus);
> > 
> > The way you moved this around has the disadvantage that we now do this -
> > a number of writes - even in the very common case where the lwlock can
> > be acquired directly.
> 
> Excuse me, I don't quite understand.
> Are you objecting to init_local_spin_delay being placed here?

Yes.


> Placing it elsewhere would complicate the code.


> Or are you objecting to setting `mask` and `add`?

That seems right.


> In both cases, I think the simpler version should be accepted first. It acts
> as a definition of the algorithm, and it already gives a measurable improvement.

Well, in scalability it does. I'm less sure about uncontended performance.
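
For concreteness, a minimal sketch of keeping those writes off the uncontended
path: the SpinDelayStatus is only touched once the first atomic attempt has
failed. The names (lock->state, LW_FLAG_LOCKED, the spin-delay helpers) follow
lwlock.c and s_lock.h, but this is an illustrative sketch under those
assumptions, not the patch or the tree's code.

#include "postgres.h"
#include "port/atomics.h"
#include "storage/lwlock.h"
#include "storage/s_lock.h"

/* Illustrative only; assumes lwlock.c's private LW_FLAG_LOCKED is visible. */
static void
LWLockWaitListLockSketch(LWLock *lock)
{
    uint32      old_state;

    while (true)
    {
        /* Fast path: one atomic attempt, no SpinDelayStatus writes at all. */
        old_state = pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_LOCKED);
        if (!(old_state & LW_FLAG_LOCKED))
            break;              /* acquired the flag without contention */

        /* Contended path: only now pay for the spin-delay bookkeeping. */
        {
            SpinDelayStatus delayStatus;

            init_local_spin_delay(&delayStatus);
            while (old_state & LW_FLAG_LOCKED)
            {
                perform_spin_delay(&delayStatus);
                old_state = pg_atomic_read_u32(&lock->state);
            }
            finish_spin_delay(&delayStatus);
        }
        /* The flag looked free; loop back and retry the atomic acquisition. */
    }
}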



> > > +     * We intentionally do not call finish_spin_delay here, because the
> > > +     * loop above usually finishes by queuing onto the wait list on
> > > +     * contention, so it doesn't reach spins_per_delay and thereby doesn't
> > > +     * sleep inside perform_spin_delay. Also, different LWLocks have very
> > > +     * different contention patterns, and it is wrong to update spin-lock
> > > +     * statistics based on LWLock contention.
> > > +     */
> > 
> > Huh? This seems entirely unconvincing. Without adjusting this here we'll
> > just spin the same way on every iteration. Especially in the case where
> > somebody else holds LW_FLAG_LOCKED, that's quite bad.
> 
> LWLocks are very different. Some of them are always held only briefly
> (BufferLock), others are locked for a long time.

That seems not particularly relevant. The same is true for
spinlocks. The relevant question isn't how long the LWLock is held, it's
how long LW_FLAG_LOCKED is held - and that should only depend on
contention (i.e. bus speed, the number of times the holder is put to sleep
while holding it, etc.), not on how long the lock is held.
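
To make "spin the same way every iteration" concrete: finish_spin_delay() is
what feeds the observed behaviour back into spins_per_delay. The following is
a rough paraphrase, simplified and from memory, of that feedback (the real
logic and constants live in s_lock.c); skipping the call means the spin budget
never moves.

/* Rough paraphrase of s_lock.c's finish_spin_delay(); constants illustrative. */
static int  spins_per_delay = 100;  /* per-backend spin budget before sleeping */

static void
finish_spin_delay_sketch(SpinDelayStatus *status)
{
    if (status->cur_delay == 0)
    {
        /* never had to sleep: allow more spinning before sleeping next time */
        spins_per_delay = Min(spins_per_delay + 100, 1000);
    }
    else
    {
        /* had to sleep at least once: shrink the spin budget */
        spins_per_delay = Max(spins_per_delay - 1, 10);
    }
}

Without that call, every pass through the wait loop spins with whatever budget
the backend started with, regardless of how the LW_FLAG_LOCKED waits actually
behave.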

> I've tried to place this delay into the lock itself (it has 2 free bytes),
> but that attempt performed worse.

That seems unsurprising - there are a *lot* of locks, and we'd have to
tune all of them. Additionally, there are a bunch of platforms where we do
*not* have free bytes (consider the embedding in BufferDesc).


> Now I understand that delays should be stored in an array indexed by
> tranche. But I have no time to test this idea, and I doubt it will give
> radically better results (i.e. > 5%), so I think it is better to accept
> the patch in this form and then experiment with per-tranche delays.

I don't think tranches have any decent predictive power here.
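
Purely to make the per-tranche idea above concrete (not an endorsement, per
the above), a hypothetical sketch; the array, its size, and the feedback hook
are all invented for illustration and are not part of the patch:

/* Hypothetical per-tranche spin budgets; everything below is invented. */
#define SPIN_TRANCHE_SLOTS 128          /* assumed bound on tranche ids */

static int  tranche_spins_per_delay[SPIN_TRANCHE_SLOTS];

static int
tranche_spin_budget(int tranche_id)
{
    int     budget = tranche_spins_per_delay[tranche_id % SPIN_TRANCHE_SLOTS];

    return (budget > 0) ? budget : 100; /* fall back to a default budget */
}

static void
tranche_spin_feedback(int tranche_id, const SpinDelayStatus *status)
{
    int     slot = tranche_id % SPIN_TRANCHE_SLOTS;

    /* same shape as the global feedback: spin more if we never slept */
    if (status->cur_delay == 0)
        tranche_spins_per_delay[slot] = Min(tranche_spins_per_delay[slot] + 100, 1000);
    else
        tranche_spins_per_delay[slot] = Max(tranche_spins_per_delay[slot] - 1, 10);
}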


Greetings,

Andres Freund

