Thread: Spinlock can be released twice in procsignal.c

Spinlock can be released twice in procsignal.c

From
"Maksim.Melnikov"
Date:

Hi, it seems we can release spinlock twice in /src/backend/storage/ipc/procsignal.c file, method ProcSignalInit.

void
ProcSignalInit(bool cancel_key_valid, int32 cancel_key)
{
    ProcSignalSlot *slot;
    uint64        barrier_generation;

..............................................................................

    slot = &ProcSignal->psh_slot[MyProcNumber];

    /* sanity check */
    SpinLockAcquire(&slot->pss_mutex);
    if (pg_atomic_read_u32(&slot->pss_pid) != 0)
    {
        SpinLockRelease(&slot->pss_mutex);
        elog(LOG, "process %d taking over ProcSignal slot %d, but it's not empty",
             MyProcPid, MyProcNumber);
    }

    /* Clear out any leftover signal reasons */
    MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t));

......................

    slot->pss_cancel_key_valid = cancel_key_valid;
    slot->pss_cancel_key = cancel_key;
    pg_atomic_write_u32(&slot->pss_pid, MyProcPid);

    SpinLockRelease(&slot->pss_mutex);


First in the if clause, second near the end of function. Such behavior can lead to unpredictable concurrent issues.

In applied patch I removed spinlock release in if clause.

Attachment

Re: Spinlock can be released twice in procsignal.c

From
Andrey Borodin
Date:

> On 26 Feb 2025, at 00:34, Maksim.Melnikov <m.melnikov@postgrespro.ru> wrote:
>
> In applied patch I removed spinlock release in if clause.

Looks like the oversight in 9d9b9d4. IMO the fix is correct.


Best regards, Andrey Borodin.


Re: Spinlock can be released twice in procsignal.c

From
Michael Paquier
Date:
On Wed, Feb 26, 2025 at 09:08:53AM +0500, Andrey Borodin wrote:
> Looks like the oversight in 9d9b9d4. IMO the fix is correct.

     if (pg_atomic_read_u32(&slot->pss_pid) != 0)
     {
-        SpinLockRelease(&slot->pss_mutex);
         elog(LOG, "process %d taking over ProcSignal slot %d, but it's not empty",
              MyProcPid, MyProcNumber);
     }

This fix is not correct.  No system function calls (well basically
most of them) or even more no PostgreSQL-specific calls should happen
while holding a spinlock.  elog() is a good example of what not to do.
One example: imagine a palloc failure while holding this spinlock in
this elog().

The code should be restructured so as we read pss_pid while holding
the spinlock, release the spinlock, and then LOG.  Based on the
structure of this routine, you could just assign a boolean to decide
if something should be logged and delay the LOG until the spinlock is
released, because we don't intend an early exit like in
CleanupProcSignalState().  In ~16, ProcSignalInit() is able to LOG
things early, but the ProcSignal is forced even if pss_pid is set, so
delaying the LOG to be generated after updating ProcSignal does not
matter.
--
Michael

Attachment

Re: Spinlock can be released twice in procsignal.c

From
Maxim Orlov
Date:

On Wed, 26 Feb 2025 at 07:56, Michael Paquier <michael@paquier.xyz> wrote:

This fix is not correct.  No system function calls (well basically
most of them) or even more no PostgreSQL-specific calls should happen
while holding a spinlock.  elog() is a good example of what not to do.
One example: imagine a palloc failure while holding this spinlock in
this elog().

Indeed. PFA the correct one.


--
Best regards,
Maxim Orlov.
Attachment

Re: Spinlock can be released twice in procsignal.c

From
Andrey Borodin
Date:
Michael, thanks a lot for the detailed explanation.

> On 26 Feb 2025, at 11:57, Maxim Orlov <orlovmg@gmail.com> wrote:
>
> Indeed. PFA the correct one.
>
> <v2-0001-Avoid-double-spinlock-release.patch>

I’d suggest to give some more descriptive name to “logit” and expand comment “/* sanity check */“. This comment was
easierto understand when elog() was near, but now IMO we can have few words about what is going on. 


Best regards, Andrey Borodin.


Re: Spinlock can be released twice in procsignal.c

From
Maxim Orlov
Date:

On Wed, 26 Feb 2025 at 11:26, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I’d suggest to give some more descriptive name to “logit” and expand comment “/* sanity check */“. This comment was easier to understand when elog() was near, but now IMO we can have few words about what is going on.

Done. Except for a new name for "logit" variable. Unfortunately, I can't think of anything sane. As an example I looked at sequence.c. The same name is used there. I will gladly change this name to whatever you want if it still look misleading or incomplete for you in some way or another. Just write the name that you think is correct.
 
--
Best regards,
Maxim Orlov.
Attachment