Thread: can while loop in ClockSweepTick function be kind of infinite loop in some cases?

This question is about ClockSweepTick function and the code is below.

 The value of expected, NBuffers, wrapped variable is fixed in the while loop, so that when the value of expected variable is not equal to StrategyControl->nextVictimBuffer, CAS operation fails and the while loop will be run kind-of infinitely. 
It is possible for this problem to occur when ClockSweepTick function is concurrently called and nextVictimBuffer is incremented by other process before CAS operation in the loop (ex: in this case, the value of expected variable is NBuffers+1 while the value of nextVictimBuffer variable is NBuffers+2. so CAS operation fails) 
I think. `expected = originalVictim + 1;` line should be in while loop (before acquiring spin lock) so that, even in the case above, expected variable is incremented for each loop and CAS operation will be successful at some point. 
Is my understanding correct? If so, I will send PR for fixing this issue.

Thank you in advance
Hayato Shiba



Hi,

On 2023-01-11 01:25:06 +0900, 斯波隼斗 wrote:
> This question is about ClockSweepTick function and the code is below.
>
https://github.com/postgres/postgres/blob/24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031/src/backend/storage/buffer/freelist.c#L146-L165
> 
>  The value of expected, NBuffers, wrapped variable is fixed in the while
> loop, so that when the value of expected variable is not equal to
> StrategyControl->nextVictimBuffer, CAS operation fails and the while loop
> will be run kind-of infinitely.
> It is possible for this problem to occur when ClockSweepTick function is
> concurrently called and nextVictimBuffer is incremented by other process
> before CAS operation in the loop (ex: in this case, the value of expected
> variable is NBuffers+1 while the value of nextVictimBuffer variable is
> NBuffers+2. so CAS operation fails)
> I think. `expected = originalVictim + 1;` line should be in while loop
> (before acquiring spin lock) so that, even in the case above, expected
> variable is incremented for each loop and CAS operation will be successful
> at some point.
> Is my understanding correct? If so, I will send PR for fixing this issue.

Yes, I think your understanding might be correct. Interesting that this
apparently has never occurred.

Yes, please send a patch.

Thanks,

Andres



On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:
> > I think. `expected = originalVictim + 1;` line should be in while loop
> > (before acquiring spin lock) so that, even in the case above, expected
> > variable is incremented for each loop and CAS operation will be successful
> > at some point.
> > Is my understanding correct? If so, I will send PR for fixing this issue.
>
> Yes, I think your understanding might be correct. Interesting that this
> apparently has never occurred.

Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Hi,

On 2023-01-10 13:11:35 -0500, Robert Haas wrote:
> On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:
> > > I think. `expected = originalVictim + 1;` line should be in while loop
> > > (before acquiring spin lock) so that, even in the case above, expected
> > > variable is incremented for each loop and CAS operation will be successful
> > > at some point.
> > > Is my understanding correct? If so, I will send PR for fixing this issue.
> >
> > Yes, I think your understanding might be correct. Interesting that this
> > apparently has never occurred.
>
> Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

Indeed, so there's no problem.

I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the
changes went in we didn't (or rather, couldn't) rely on it, but these days we
could.  I think with a 64bit number we could get rid of ->completePasses and
just infer it from ->nextVictimBuffer/NBuffers, removing th need for the
spinlock.  It's very unlikely that 64bit would ever wrap, and even if, it'd
just be a small inaccuracy in the assumed number of passes. OTOH, it's
doubtful the overflow handling / the spinlock matters performance wise.

Greetings,

Andres Freund




2023年1月11日(水) 3:59 Andres Freund <andres@anarazel.de>:
Hi,

On 2023-01-10 13:11:35 -0500, Robert Haas wrote:
> On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:
> > > I think. `expected = originalVictim + 1;` line should be in while loop
> > > (before acquiring spin lock) so that, even in the case above, expected
> > > variable is incremented for each loop and CAS operation will be successful
> > > at some point.
> > > Is my understanding correct? If so, I will send PR for fixing this issue.
> >
> > Yes, I think your understanding might be correct. Interesting that this
> > apparently has never occurred.
>
> Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

Indeed, so there's no problem.

I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the
changes went in we didn't (or rather, couldn't) rely on it, but these days we
could.  I think with a 64bit number we could get rid of ->completePasses and
just infer it from ->nextVictimBuffer/NBuffers, removing th need for the
spinlock.  It's very unlikely that 64bit would ever wrap, and even if, it'd
just be a small inaccuracy in the assumed number of passes. OTOH, it's
doubtful the overflow handling / the spinlock matters performance wise.

Greetings,

Andres Freund