Re: WAL Insertion Lock Improvements - Mailing list pgsql-hackers

From: Michael Paquier
Subject: Re: WAL Insertion Lock Improvements
Msg-id: ZLoetVXkyvS5lryz@paquier.xyz
In response to: Re: WAL Insertion Lock Improvements (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses: Re: WAL Insertion Lock Improvements
List: pgsql-hackers
On Thu, Jul 20, 2023 at 02:38:29PM +0530, Bharath Rupireddy wrote:
> On Fri, Jul 14, 2023 at 4:17 AM Andres Freund <andres@anarazel.de> wrote:
>> I think this commit does too many things at once.
>
> I've split the patch into three - 1) Make insertingAt 64-bit atomic.
> 2) Have better commenting on why there's no memory barrier or spinlock
> in and around LWLockWaitForVar call sites. 3) Have a quick exit for
> LWLockUpdateVar.

FWIW, I was kind of already OK with 0001, as it showed most of the
gains observed, while 0003 had a limited impact:
https://www.postgresql.org/message-id/CALj2ACWgeOPEKVY-TEPvno%3DVatyzrb-vEEP8hN7QqrQ%3DyPRupA%40mail.gmail.com

It is kind of a no-brainer to replace the spinlocks with atomic reads
and writes there.

>> I don't think:
>>                          * NB: LWLockConflictsWithVar (which is called from
>>                          * LWLockWaitForVar) relies on the spinlock used above in this
>>                          * function and doesn't use a memory barrier.
>>
>> helps to understand why any of this is safe to a meaningful degree.
>>
>> The existing comments obviously aren't sufficient to explain this, but
>> the reason it's safe today, I think, is that we are only waiting for
>> insertions that started before WaitXLogInsertionsToFinish() was called.  The
>> lack of memory barriers in the loop means that we might see locks as "unused"
>> that have *since* become used, which is fine, because they only can be for
>> later insertions that we wouldn't want to wait on anyway.
>
> Right.

FWIW, I always have a hard time coming back to this code and seeing it
rely on undocumented assumptions: the wait logic lives in lwlock.c
while we need to keep an eye on xlog.c (we take a spinlock there that
the variable-wait logic relies on for ordering @-@).  So the proposal
of getting more documentation in place via 0002 goes in the right
direction.

>> This comment seems off to me. Using atomics doesn't guarantee not needing
>> locking. It just guarantees that we are reading a non-torn value.
>
> Modified the comment.

-       /*
-        * Read value using the lwlock's wait list lock, as we can't generally
-        * rely on atomic 64 bit reads/stores.  TODO: On platforms with a way to
-        * do atomic 64 bit reads/writes the spinlock should be optimized away.
-        */
-       LWLockWaitListLock(lock);
-       value = *valptr;
-       LWLockWaitListUnlock(lock);
+       /* Reading atomically avoids getting a torn value */
+       value = pg_atomic_read_u64(valptr);

Should this comment mention that atomicity matters specifically on
platforms where reading a uint64 could otherwise produce a torn value?
At the risk of sounding picky, I would make it a bit longer, say
something like:
"Reading this value atomically is safe even on platforms where uint64
cannot be read without observing a torn value."

By the way, only xlogprefetcher.c uses the term "torn" for a value, and
there it refers to a write.

>>> @@ -1605,9 +1602,14 @@ LWLockConflictsWithVar(LWLock *lock,
>>>   *
>>>   * Note: this function ignores shared lock holders; if the lock is held
>>>   * in shared mode, returns 'true'.
>>> + *
>>> + * Be careful that LWLockConflictsWithVar() does not include a memory barrier,
>>> + * hence the caller of this function may want to rely on an explicit barrier or
>>> + * a spinlock to avoid memory ordering issues.
>>>   */
>>
>> s/careful/aware/?
>>
>> s/spinlock/implied barrier via spinlock or lwlock/?
>
> Done.

It is okay to mention an LWLock here, even if the sole caller of this
routine relies on a spinlock.

0001 looks OK-ish seen from here.  Thoughts?
--
Michael
