Re: valgrind versus pg_atomic_init() - Mailing list pgsql-hackers

From Noah Misch
Subject Re: valgrind versus pg_atomic_init()
Date
Msg-id 20200615015527.GA2850107@rfd.leadboat.com
Whole thread Raw
In response to valgrind versus pg_atomic_init()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: valgrind versus pg_atomic_init()
Re: valgrind versus pg_atomic_init()
List pgsql-hackers
On Sun, Jun 07, 2020 at 12:23:35AM -0400, Tom Lane wrote:
> I experimented with running "make check" on ARM64 under a reasonably
> bleeding-edge valgrind (3.16.0).  One thing I ran into is that
> regress.c's test_atomic_ops fails; valgrind shows the stack trace
> 
>    fun:__aarch64_cas8_acq_rel
>    fun:pg_atomic_compare_exchange_u64_impl
>    fun:pg_atomic_exchange_u64_impl
>    fun:pg_atomic_write_u64_impl
>    fun:pg_atomic_init_u64_impl
>    fun:pg_atomic_init_u64
>    fun:test_atomic_uint64
>    fun:test_atomic_ops
>    fun:ExecInterpExpr
> 
> Now, this is basically the same thing as is already memorialized in
> src/tools/valgrind.supp:
> 
> # Atomic writes to 64bit atomic vars uses compare/exchange to
> # guarantee atomic writes of 64bit variables. pg_atomic_write is used
> # during initialization of the atomic variable; that leads to an
> # initial read of the old, undefined, memory value. But that's just to
> # make sure the swap works correctly.
> {
>     uninitialized_atomic_init_u64
>     Memcheck:Cond
>     fun:pg_atomic_exchange_u64_impl
>     fun:pg_atomic_write_u64_impl
>     fun:pg_atomic_init_u64_impl
> }
> 
> so my first thought was that we just needed an architecture-specific
> variant of that.  But on thinking more about this, it seems like
> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
> misguided.  Why isn't it simply assigning the value with an ordinary
> unlocked write?  By definition, we had better not be using this function
> in any circumstance where there might be conflicting accesses

Does something guarantee the write will be globally-visible by the time the
first concurrent accessor shows up?  (If not, one could (a) do an unlocked
ptr->value=0, then the atomic write, or (b) revert and improve the
suppression.)  I don't doubt it's fine for the ways PostgreSQL uses atomics
today, which generally initialize an atomic before the concurrent-accessor
processes even exist.

> , so I don't
> see why we should need to tolerate a valgrind exception here.  Moreover,
> if a simple assignment *isn't* good enough, then surely the spinlock
> version in atomics.c is 100% broken.

Are you saying it would imply a bug in atomics.c pg_atomic_init_u64_impl(),
pg_atomic_compare_exchange_u64_impl(), or pg_atomic_fetch_add_u64_impl()?  Can
you explain that more?  If you were referring to unlocked "*(lock) = 0", that
is different since it's safe to have a delay in propagation of the change from
locked state to unlocked state.



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Include access method in listTables output
Next
From: Michael Paquier
Date:
Subject: Re: Postgresql13_beta1 (could not rename temporary statistics file)Windows 64bits