Thread: can while loop in ClockSweepTick function be kind of infinite loop in some cases?
This question is about ClockSweepTick function and the code is below.
The value of expected, NBuffers, wrapped variable is fixed in the while loop, so that when the value of expected variable is not equal to StrategyControl->nextVictimBuffer, CAS operation fails and the while loop will be run kind-of infinitely.
It is possible for this problem to occur when ClockSweepTick function is concurrently called and nextVictimBuffer is incremented by other process before CAS operation in the loop (ex: in this case, the value of expected variable is NBuffers+1 while the value of nextVictimBuffer variable is NBuffers+2. so CAS operation fails)
I think. `expected = originalVictim + 1;` line should be in while loop (before acquiring spin lock) so that, even in the case above, expected variable is incremented for each loop and CAS operation will be successful at some point.
Is my understanding correct? If so, I will send PR for fixing this issue.
Thank you in advance
Hayato Shiba
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?
From
Andres Freund
Date:
Hi, On 2023-01-11 01:25:06 +0900, 斯波隼斗 wrote: > This question is about ClockSweepTick function and the code is below. > https://github.com/postgres/postgres/blob/24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031/src/backend/storage/buffer/freelist.c#L146-L165 > > The value of expected, NBuffers, wrapped variable is fixed in the while > loop, so that when the value of expected variable is not equal to > StrategyControl->nextVictimBuffer, CAS operation fails and the while loop > will be run kind-of infinitely. > It is possible for this problem to occur when ClockSweepTick function is > concurrently called and nextVictimBuffer is incremented by other process > before CAS operation in the loop (ex: in this case, the value of expected > variable is NBuffers+1 while the value of nextVictimBuffer variable is > NBuffers+2. so CAS operation fails) > I think. `expected = originalVictim + 1;` line should be in while loop > (before acquiring spin lock) so that, even in the case above, expected > variable is incremented for each loop and CAS operation will be successful > at some point. > Is my understanding correct? If so, I will send PR for fixing this issue. Yes, I think your understanding might be correct. Interesting that this apparently has never occurred. Yes, please send a patch. Thanks, Andres
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?
From
Robert Haas
Date:
On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote: > > I think. `expected = originalVictim + 1;` line should be in while loop > > (before acquiring spin lock) so that, even in the case above, expected > > variable is incremented for each loop and CAS operation will be successful > > at some point. > > Is my understanding correct? If so, I will send PR for fixing this issue. > > Yes, I think your understanding might be correct. Interesting that this > apparently has never occurred. Doesn't pg_atomic_compare_exchange_u32 set expected if it fails? -- Robert Haas EDB: http://www.enterprisedb.com
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?
From
Andres Freund
Date:
Hi, On 2023-01-10 13:11:35 -0500, Robert Haas wrote: > On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote: > > > I think. `expected = originalVictim + 1;` line should be in while loop > > > (before acquiring spin lock) so that, even in the case above, expected > > > variable is incremented for each loop and CAS operation will be successful > > > at some point. > > > Is my understanding correct? If so, I will send PR for fixing this issue. > > > > Yes, I think your understanding might be correct. Interesting that this > > apparently has never occurred. > > Doesn't pg_atomic_compare_exchange_u32 set expected if it fails? Indeed, so there's no problem. I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the changes went in we didn't (or rather, couldn't) rely on it, but these days we could. I think with a 64bit number we could get rid of ->completePasses and just infer it from ->nextVictimBuffer/NBuffers, removing th need for the spinlock. It's very unlikely that 64bit would ever wrap, and even if, it'd just be a small inaccuracy in the assumed number of passes. OTOH, it's doubtful the overflow handling / the spinlock matters performance wise. Greetings, Andres Freund
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?
From
斯波隼斗
Date:
Hi, Thank you for your quick reply
I misunderstood the logic of pg_atomic_compare_exchange_u32, so the loop cannot be infinite.
> I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the changes went in we didn't (or rather, couldn't) rely on it, but these days we could. I think with a 64bit number we could get rid of ->completePasses and just infer it from ->nextVictimBuffer/NBuffers, removing th need for the spinlock. It's very unlikely that 64bit would ever wrap, and even if, it'd just be a small inaccuracy in the assumed number of passes. OTOH, it's doubtful the overflow handling / the spinlock matters performance wise.
I'm not sure why 64 bit was not used at the time, so I'm concerned about it.
but, except for it, you have a point and I completely agree with you. as you have said, we should use 64 bit whose higher-order 32 bit indicates completePasses, and should remove spinlock.
maybe we don't have to exceptionally worry about the overflow here mainly because, even now, the completePasses can overflow and the possibility of overflow may not be so different so that the 64 bit atomic operation is better.
if overflow would happen, passes_delta variable in the function called by bgwriter would be negative high value and it would lead to the failure of assert. (the code is below
Do you send patch for the replacement with 64 bit? If you don't mind, I would like to send patch. ( or is there some procedure before sending patch?
Thanks
hayato
2023年1月11日(水) 3:59 Andres Freund <andres@anarazel.de>:
Hi,
On 2023-01-10 13:11:35 -0500, Robert Haas wrote:
> On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:
> > > I think. `expected = originalVictim + 1;` line should be in while loop
> > > (before acquiring spin lock) so that, even in the case above, expected
> > > variable is incremented for each loop and CAS operation will be successful
> > > at some point.
> > > Is my understanding correct? If so, I will send PR for fixing this issue.
> >
> > Yes, I think your understanding might be correct. Interesting that this
> > apparently has never occurred.
>
> Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?
Indeed, so there's no problem.
I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the
changes went in we didn't (or rather, couldn't) rely on it, but these days we
could. I think with a 64bit number we could get rid of ->completePasses and
just infer it from ->nextVictimBuffer/NBuffers, removing th need for the
spinlock. It's very unlikely that 64bit would ever wrap, and even if, it'd
just be a small inaccuracy in the assumed number of passes. OTOH, it's
doubtful the overflow handling / the spinlock matters performance wise.
Greetings,
Andres Freund