Re: pgsql: Fix double-release of spinlock - Mailing list pgsql-committers

From Andres Freund
Subject Re: pgsql: Fix double-release of spinlock
Date
Msg-id 20240729174609.jbfei3sokdobeeeo@awork3.anarazel.de
Whole thread Raw
In response to Re: pgsql: Fix double-release of spinlock  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: Fix double-release of spinlock
List pgsql-committers
On 2024-07-29 12:45:19 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2024-07-29 12:33:13 -0400, Tom Lane wrote:
> >> I dunno, is that the only extra check that the --disable-spinlocks
> >> implementation is providing?
>
> > I think it also provides the (valuable!) check that spinlocks were actually
> > initialized. But that again seems like something we'd be better off adding
> > more general infrastructure for - nobody runs --disable-spinlocks locally, we
> > shouldn't need to run this on the buildfarm to find problems like this.
>
> Hmm, but how?

I think there's a few ways:

> One of the things we gave up by nuking HPPA support
> was that that platform's representation of an initialized, free
> spinlock was not all-zeroes, so that it'd catch this type of problem.
> I think all the remaining platforms do use zeroes, so it's hard to
> see how anything short of valgrind would be likely to catch it.

1) There's nothing forcing us to use 0/1 for most of the spinlock
implementations. E.g. for x86-64 we could use 0 for uninitialized, 1 for free
and 2 for locked.

2) We could also change the layout of slock_t in assert enabled builds, adding
a dedicated 'initialized' field when assertions are enabled. But that might be
annoying from an ABI POV?


1) seems preferrable, so I gave it a quick try. Seems to work. There's a
*slight* difference in the instruction sequence:

old:
    41f6:    f0 86 10                 lock xchg %dl,(%rax)
    41f9:    84 d2                    test   %dl,%dl
    41fb:    75 1b                    jne    4218 <GetRecoveryState+0x38>

new:
    4216:    f0 86 10                 lock xchg %dl,(%rax)
    4219:    80 fa 02                 cmp    $0x2,%dl
    421c:    74 22                    je     4240 <GetRecoveryState+0x40>

I.e. the version using 2 as the locked state uses a three byte instruction vs
a two byte instruction before.


*If* we are worried about this, we could

a) Change the representation only for assert enabled builds, but that'd have
   ABI issues again.

b) Instead define the spinlock to have 1 as the unlocked state and 0 as the
   locked state. That makes it a bit harder to understand that initialization
   is missing, compared to a dedicated state, as the first use of the spinlock
   just blocks.


To make 1) b) easier to understand it might be worth changing the slock_t
typedef to be something like

typedef struct slock_t
{
        char is_free;
} slock_t;

which also might help catch some cases of type confusion - the char typedef is
too forgiving imo.

Greetings,

Andres Freund



pgsql-committers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: pgsql: Remove dead generators for cyrillic encoding conversion tables
Next
From: Tom Lane
Date:
Subject: Re: pgsql: Fix double-release of spinlock