Re: LogwrtResult contended spinlock - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: LogwrtResult contended spinlock |
Date | |
Msg-id | 20210130023011.n545o54j65t4kgxn@alap3.anarazel.de Whole thread Raw |
In response to | Re: LogwrtResult contended spinlock (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: LogwrtResult contended spinlock
(Alvaro Herrera <alvherre@2ndquadrant.com>)
|
List | pgsql-hackers |
Hi, On 2021-01-29 22:47:56 -0300, Alvaro Herrera wrote: > So I tried this, but -- perhaps not suprisingly -- I can't get it to > work properly; the synchronization fails. What do you mean by "synchronization fails"? > Strangely, all tests work for me, but the pg_upgrade one in particular > fails. It's one of the few tests using fsync=on. > +static inline void > +pg_atomic_monotonic_advance_u64(volatile pg_atomic_uint64 *ptr, uint64 target_) > +{ > +#ifndef PG_HAVE_ATOMIC_U64_SIMULATION > + AssertPointerAlignment(ptr, 8); > +#endif > + /* FIXME is this algorithm correct if we have u64 simulation? */ I don't see a problem. > + while (true) > + { > + uint64 currval; > + > + currval = pg_atomic_read_u64(ptr); > + if (currval > target_) > + break; /* already done by somebody else */ > + if (pg_atomic_compare_exchange_u64(ptr, &currval, target_)) > + break; /* successfully done */ > + } > +} I suggest writing this as currval = pg_atomic_read_u64(ptr); while (currval < target_) { if (pg_atomic_compare_exchange_u64(ptr, &currval, target_)) break; } > /* > * Inserting to WAL is protected by a small fixed number of WAL insertion > @@ -596,8 +599,10 @@ typedef struct XLogCtlData > { > XLogCtlInsert Insert; > > + XLogwrtAtomic LogwrtRqst; > + XLogwrtAtomic LogwrtResult; > + > /* Protected by info_lck: */ > - XLogwrtRqst LogwrtRqst; Not sure putting these into the same cacheline is a good idea. > @@ -2166,12 +2163,11 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic) > if (opportunistic) > break; > > - /* Before waiting, get info_lck and update LogwrtResult */ > - SpinLockAcquire(&XLogCtl->info_lck); > - if (XLogCtl->LogwrtRqst.Write < OldPageRqstPtr) > - XLogCtl->LogwrtRqst.Write = OldPageRqstPtr; > - LogwrtResult = XLogCtl->LogwrtResult; > - SpinLockRelease(&XLogCtl->info_lck); > + /* Before waiting, update LogwrtResult */ > + pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, OldPageRqstPtr); > + > + LogwrtResult.Write = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Write); > + LogwrtResult.Flush = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Flush); I don't think it's quite as easy as this. Write/Flush now aren't guaranteed to be coherent with each other - previously they were. And because it's in a global variable used everywhere, we can't just be more careful about update protocols in one place... We also shouldn't re-read a variable that we just did via the pg_atomic_monotonic_advance_u64(). I think we should stop updating both the Write/Flush position at the same time. That way we don't have an expectation of them being coherent with each other. Most places really don't need both anyway. > { > - SpinLockAcquire(&XLogCtl->info_lck); > - XLogCtl->LogwrtResult = LogwrtResult; > - if (XLogCtl->LogwrtRqst.Write < LogwrtResult.Write) > - XLogCtl->LogwrtRqst.Write = LogwrtResult.Write; > - if (XLogCtl->LogwrtRqst.Flush < LogwrtResult.Flush) > - XLogCtl->LogwrtRqst.Flush = LogwrtResult.Flush; > - SpinLockRelease(&XLogCtl->info_lck); > + pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtResult.Write, LogwrtResult.Write); > + pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtResult.Flush, LogwrtResult.Flush); > + > + pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, LogwrtResult.Write); > + pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Flush, LogwrtResult.Flush); Hm. We went from one cheap atomic operation (SpinLockAcquire) to four expensive ones in the happy path. That's not free... I don't think we need to manipulate XLogCtl->LogwrtResult.* using atomic ops - they can only be updated with WALWriteLock held, right? XLogCtl->LogwrtResult was updated with plain assignment before, why did you change it to pg_atomic_monotonic_advance_u64()? > @@ -1169,6 +1169,8 @@ XLogInsertRecord(XLogRecData *rdata, > { > /* advance global request to include new block(s) */ > pg_atomic_monotonic_advance_u64(&XLogCtl->LogwrtRqst.Write, EndPos); > + pg_memory_barrier(); > + That's not really useful - the path that actually updates already implies a barrier. It'd probably be better to add a barrier to a "never executed cmpxchg" fastpath. > @@ -2905,6 +2909,7 @@ XLogFlush(XLogRecPtr record) > WriteRqstPtr = pg_atomic_read_u64(&XLogCtl->LogwrtRqst.Write); > LogwrtResult.Write = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Write); > LogwrtResult.Flush = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Flush); > + pg_read_barrier(); I'm not sure you really can get away with just a read barrier in these places. We can't e.g. have later updates to other shared memory variables be "moved" to before the barrier (which a read barrier allows). Greetings, Andres Freund
pgsql-hackers by date: