Hi,
On 2024-04-10 12:28:10 -0400, Tom Lane wrote:
> Actually ... Parag mentioned that this was specifically about
> lwlock.c's usage of spinlocks. It doesn't really use a spinlock,
> but it does use s_lock.c's delay logic, and I think it's got the
> usage pattern wrong:
>
> while (true)
> {
> /* always try once to acquire lock directly */
> old_state = pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_LOCKED);
> if (!(old_state & LW_FLAG_LOCKED))
> break; /* got lock */
>
> /* and then spin without atomic operations until lock is released */
> {
> SpinDelayStatus delayStatus;
>
> init_local_spin_delay(&delayStatus);
>
> while (old_state & LW_FLAG_LOCKED)
> {
> perform_spin_delay(&delayStatus);
> old_state = pg_atomic_read_u32(&lock->state);
> }
> #ifdef LWLOCK_STATS
> delays += delayStatus.delays;
> #endif
> finish_spin_delay(&delayStatus);
> }
>
> /*
> * Retry. The lock might obviously already be re-acquired by the time
> * we're attempting to get it again.
> */
> }
>
> I don't think it's correct to re-initialize the SpinDelayStatus each
> time around the outer loop. That state should persist through the
> entire acquire operation, as it does in a regular spinlock acquire.
> As this stands, it resets the delay to minimum each time around the
> outer loop, and I bet it is that behavior not the RNG that's to blame
> for what he's seeing.
Hm, yea, that's not right. Normally this shouldn't be heavily contended enough
to matter. I don't think just pulling out the spin delay would be the right
thing though, because that'd mean we'd initialize it even in the much much
more common case of there not being any contention. I think we'll have to add
a separate fetch_or before the outer loop.
> (One should still wonder what is the LWLock usage pattern that is
> causing this spot to become so heavily contended.)
My suspicion is that it's a4adc31f690 which we only recently backpatched to
< 16.
Greetings,
Andres Freund