Re: Add missing ConditionVariableCancelSleep() in slot.c - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Add missing ConditionVariableCancelSleep() in slot.c
Date
Msg-id CALj2ACXNXa8oXicTmJOg=MEVDEM7FyWCpVfUgyEDMVn6_yQLYw@mail.gmail.com
Whole thread Raw
In response to Re: Add missing ConditionVariableCancelSleep() in slot.c  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Sat, Apr 13, 2024 at 4:37 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Hmm, but shouldn't we cancel the sleep after we have completed sleeping
> altogether, that is, until we've determined that we're no longer to
> sleep waiting for this slot?  That would suggest to put the call to
> cancel sleep after the for(;;) loop is complete, rather than immediately
> after sleeping.  No?
>
> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index cebf44bb0f..59e9bef642 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -1756,6 +1756,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
>                 }
>         }
>
> +       ConditionVariableCancelSleep();
> +
>         Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
>
>         return released_lock;

We can do that and +1 for it since the prepare to sleep cancel the
previous one anyway. I mentioned that approach in the original email:

>> Alternatively, we can
>> just add ConditionVariableCancelSleep() outside of the for loop to
>> cancel the sleep of the last cycle if any. This also looks correct
>> because every prepare to sleep does ensure the previous sleep is
>> canceled, and if no sleep at all, the cacnel sleep call exits.

> However, I noticed that ConditionVariablePrepareToSleep() does a
> CancelSleep upon being called ... so what I suggest would not have any
> effect whatsoever, because the sleep would be cancelled next time
> through the loop anyway.

But what if we break from the loop and never come back? We have to
wait until the sigsetjmp/exit path of the backend to hit and cancel
the sleep.

> But shouldn't we also modify PrepareToSleep to
> exit without doing anything if our current sleep CV is the same one
> being newly installed?
>
> diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
> index 112a518bae..65811ff989 100644
> --- a/src/backend/storage/lmgr/condition_variable.c
> +++ b/src/backend/storage/lmgr/condition_variable.c
> @@ -57,6 +57,14 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
>  {
>         int                     pgprocno = MyProcNumber;
>
> +       /*
> +        * If we're preparing to sleep on the same CV we were already going to
> +        * sleep on, we don't actually need to do anything.  This may seem like
> +        * supporting sloppy coding, which is what it actually does, so ¯\_(ツ)_/¯
> +        */
> +       if (cv_sleep_target == cv)
> +               return;
> +
>         /*
>          * If some other sleep is already prepared, cancel it; this is necessary
>          * because we have just one static variable tracking the prepared sleep,

That seems to work as a quick exit path avoiding spin lock acquisition
and release if the CV is already prepared to sleep. Specifically in
the InvalidatePossiblyObsoleteSlot for loop, it can avoid a bunch of
spin lock acquisitions and releases if we ever sleep on the same
slot's CV. However, I'm not sure if it will have any other issues.

BTW, I like the emoji "¯\_(ツ)_/¯" in the code comments :).

> Alternatively, maybe we need to not prepare-to-sleep in
> InvalidatePossiblyObsoleteSlot() if we have already prepared to sleep in
> a previous iteration through the loop (and of course, don't cancel the
> sleep until we're out of the loop).

I think this looks complicated. To keep things simple, I prefer to add
the ConditionVariableCancelSleep() out of the for loop in
InvalidatePossiblyObsoleteSlot().

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Frédéric Yhuel
Date:
Subject: Re: New GUC autovacuum_max_threshold ?
Next
From: Pavel Stehule
Date:
Subject: Re: broken reading on standby (PostgreSQL 16.2)