Re: Issue with the PRNG used by Postgres - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: Issue with the PRNG used by Postgres
Date:
Msg-id: 4084246.1712768585@sss.pgh.pa.us
In response to: Re: Issue with the PRNG used by Postgres (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Issue with the PRNG used by Postgres
List: pgsql-hackers
I wrote:
> I don't think it's correct to re-initialize the SpinDelayStatus each
> time around the outer loop.  That state should persist through the
> entire acquire operation, as it does in a regular spinlock acquire.
> As this stands, it resets the delay to minimum each time around the
> outer loop, and I bet it is that behavior not the RNG that's to blame
> for what he's seeing.

After thinking about this some more, it is fairly clear that that *is*
a mistake that can cause a thundering-herd problem.  Assume we have two
or more backends waiting in perform_spin_delay, and for whatever reason
the scheduler wakes them up simultaneously.  They see the
LW_FLAG_LOCKED bit clear (because whoever had the lock when they went
to sleep is long gone) and iterate the outer loop.  One gets the lock;
the rest go back to sleep.  And they will all sleep exactly
MIN_DELAY_USEC, because they all reset their SpinDelayStatus.  Lather,
rinse, repeat.  If the s_lock code were being used as intended, they
would soon acquire different cur_delay settings; but that never
happens.  That is not the RNG's fault.

So I think we need something like the attached.

			regards, tom lane

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index b1e388dc7c..54c84b9d67 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -857,46 +857,49 @@ static void
 LWLockWaitListLock(LWLock *lock)
 {
 	uint32		old_state;
-#ifdef LWLOCK_STATS
-	lwlock_stats *lwstats;
-	uint32		delays = 0;
-	lwstats = get_lwlock_stats_entry(lock);
-#endif
+	/*
+	 * Try once to acquire the lock, before we go to the trouble of setting up
+	 * state for the spin loop.
+	 */
+	old_state = pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_LOCKED);
+	if (!(old_state & LW_FLAG_LOCKED))
+		return;				/* got lock */

-	while (true)
 	{
-		/* always try once to acquire lock directly */
-		old_state = pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_LOCKED);
-		if (!(old_state & LW_FLAG_LOCKED))
-			break;			/* got lock */
+		SpinDelayStatus delayStatus;
+#ifdef LWLOCK_STATS
+		lwlock_stats *lwstats = get_lwlock_stats_entry(lock);
+#endif

-		/* and then spin without atomic operations until lock is released */
-		{
-			SpinDelayStatus delayStatus;
+		init_local_spin_delay(&delayStatus);

-			init_local_spin_delay(&delayStatus);
+		while (true)
+		{
+			/* try again to acquire lock directly */
+			old_state = pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_LOCKED);
+			if (!(old_state & LW_FLAG_LOCKED))
+				break;		/* got lock */

+			/* spin without atomic operations until lock is released */
 			while (old_state & LW_FLAG_LOCKED)
 			{
 				perform_spin_delay(&delayStatus);
 				old_state = pg_atomic_read_u32(&lock->state);
 			}
-#ifdef LWLOCK_STATS
-			delays += delayStatus.delays;
-#endif
-			finish_spin_delay(&delayStatus);
+
+			/*
+			 * Retry. The lock might obviously already be re-acquired by the
+			 * time we're attempting to get it again.
+			 */
 		}

-		/*
-		 * Retry. The lock might obviously already be re-acquired by the time
-		 * we're attempting to get it again.
-		 */
-	}
+		finish_spin_delay(&delayStatus);

 #ifdef LWLOCK_STATS
-	lwstats->spin_delay_count += delays;
+	lwstats->spin_delay_count += delayStatus.delays;
 #endif
+	}
 }

 /*
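For illustration, here is a minimal, self-contained C sketch of the backoff
behavior at issue.  It is not PostgreSQL source: the constants and the 1x-2x
growth factor are assumptions loosely modeled on s_lock.c, and the Toy*
names are hypothetical.  It only shows why a SpinDelayStatus that is
re-initialized on every outer-loop iteration keeps each waiter sleeping the
minimum delay, whereas a persistent one escalates and soon gives the waiters
different cur_delay values.

#include <stdio.h>
#include <stdlib.h>

#define TOY_MIN_DELAY_USEC	1000		/* assumed minimum sleep */
#define TOY_MAX_DELAY_USEC	1000000		/* assumed cap */

typedef struct
{
	int			cur_delay;
} ToySpinDelayStatus;

static void
toy_init(ToySpinDelayStatus *s)
{
	s->cur_delay = TOY_MIN_DELAY_USEC;
}

/* one "sleep" step: return the delay that would be used, then back off */
static int
toy_delay(ToySpinDelayStatus *s)
{
	int			used = s->cur_delay;

	/* grow by a random factor between 1x and 2x, roughly as s_lock.c does */
	s->cur_delay += (int) (s->cur_delay * ((double) rand() / RAND_MAX) + 0.5);
	if (s->cur_delay > TOY_MAX_DELAY_USEC)
		s->cur_delay = TOY_MAX_DELAY_USEC;
	return used;
}

int
main(void)
{
	ToySpinDelayStatus persistent;
	int			i;

	srand(42);

	/* buggy pattern: delay state re-initialized each outer-loop iteration */
	printf("reset each iteration:");
	for (i = 0; i < 5; i++)
	{
		ToySpinDelayStatus fresh;

		toy_init(&fresh);
		printf(" %d", toy_delay(&fresh));	/* always the minimum */
	}
	printf("\n");

	/* intended pattern: one delay state persists across the whole acquire */
	printf("persistent state:    ");
	toy_init(&persistent);
	for (i = 0; i < 5; i++)
		printf(" %d", toy_delay(&persistent));	/* escalates */
	printf("\n");

	return 0;
}

Running the sketch prints the same minimum delay five times for the
reset-each-iteration case and an escalating sequence for the persistent
case; that escalation is what breaks up the herd once the delay state
survives the whole acquire operation, as in the patch above.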