Thread: RE: RFC: Lock-free XLog Reservation from WAL

From: "Zhou, Zhiguo"
Date:

This message is a duplicate of PH7PR11MB5796659F654F9BE983F3AD97EF142@PH7PR11MB5796.namprd11.prod.outlook.com. Please
consider dropping this thread and reviewing the original one instead.

Sorry for the inconvenience.

-----Original Message-----
From: Zhou, Zhiguo <zhiguo.zhou@intel.com>
Sent: Thursday, January 2, 2025 5:15 PM
To: pgsql-hackers@lists.postgresql.org
Subject: RFC: Lock-free XLog Reservation from WAL

Hi all,

I am writing to solicit your comments on a proposal for "Lock-free XLog Reservation from WAL." The current WAL
insertion path has to reserve space in the WAL buffer, which involves updating two shared-memory fields in
XLogCtlInsert: CurrBytePos (the start position of the current XLog) and PrevBytePos (the prev-link to the previous
XLog). Today, XLogCtlInsert.insertpos_lck keeps these two fields consistent, but it also introduces lock contention
that limits the parallelism of XLog insertions.
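
For readers less familiar with this code path, here is a minimal, self-contained model of the spinlock-protected
reservation described above. It is only a sketch: InsertModel, insert_model_init and reserve_locked are hypothetical
names, and a C11 atomic_flag stands in for insertpos_lck. It is not the actual PostgreSQL code, but it shows why both
fields have to change inside one critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for XLogCtlInsert (hypothetical, not the real struct). */
typedef struct
{
    atomic_flag insertpos_lck;  /* models the spinlock */
    uint64_t    CurrBytePos;    /* start position of the next reservation */
    uint64_t    PrevBytePos;    /* start position of the previous record */
} InsertModel;

static void
insert_model_init(InsertModel *ins)
{
    atomic_flag_clear(&ins->insertpos_lck);
    ins->CurrBytePos = 0;
    ins->PrevBytePos = 0;
}

/*
 * Reserve 'size' bytes. Both positions are read and advanced under the
 * lock, so every inserter serializes on this one critical section.
 */
static void
reserve_locked(InsertModel *ins, uint64_t size,
               uint64_t *start, uint64_t *end, uint64_t *prev)
{
    assert(size > 0);

    while (atomic_flag_test_and_set_explicit(&ins->insertpos_lck,
                                             memory_order_acquire))
        ;                       /* spin until the lock is free */

    *start = ins->CurrBytePos;
    *end = *start + size;
    *prev = ins->PrevBytePos;
    ins->CurrBytePos = *end;
    ins->PrevBytePos = *start;

    atomic_flag_clear_explicit(&ins->insertpos_lck, memory_order_release);
}
```

Because the new PrevBytePos is the old CurrBytePos, the two updates cannot be replaced by a single atomic add as long
as both fields are kept.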

To address this issue, we propose the following changes:

1. Removal of PrevBytePos: This allows increments of CurrBytePos (a single uint64 field) to be implemented with an
atomic operation (fetch_add).
2. Updating the prev-link of the next XLog: Because the prev-link of the next XLog always points to the head of the
current XLog, we write slightly beyond the reserved memory range of the current XLog to update the prev-link of the
next XLog, regardless of which backend acquires the next memory space. The next XLog inserter waits until its
prev-link has been updated (it is needed for the CRC calculation) before it starts copying its own XLog into the WAL.
3. Breaking the sequential write convention: Each backend updates the prev-link of the next XLog first, then returns
to the header position to insert its current XLog. Compared with strictly sequential writes, this reduces the
dependency of XLog writes on previous ones.
4. Revised GetXLogBuffer: To support #3, we need to update this function to separate the LSN it intends to access from
the LSN it expects to update in the insertingAt field.
5. Increase NUM_XLOGINSERT_LOCKS: With the above changes, increasing NUM_XLOGINSERT_LOCKS, for example to 128, could
effectively enhance the parallelism.
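
To make items 1 to 3 more concrete, here is a toy, self-contained sketch of the idea, written with C11 atomics rather
than PostgreSQL's own atomics layer. Everything in it is an assumption made for illustration and is not taken from the
attached patch: the ToyWal struct, the per-position prev_link array, the PREV_UNSET sentinel, and the toy_* function
names are all hypothetical, and wraparound and page headers are ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_WAL_SIZE 1024           /* toy buffer size, no wraparound */
#define PREV_UNSET   UINT64_MAX     /* hypothetical "not yet published" marker */

typedef struct
{
    _Atomic uint64_t CurrBytePos;               /* the only reservation state */
    _Atomic uint64_t prev_link[TOY_WAL_SIZE];   /* prev-link slot per record head */
} ToyWal;

static void
toy_wal_init(ToyWal *wal)
{
    atomic_init(&wal->CurrBytePos, 0);
    for (int i = 0; i < TOY_WAL_SIZE; i++)
        atomic_init(&wal->prev_link[i], PREV_UNSET);
    /* The first record has no predecessor; publish 0 by convention. */
    atomic_store(&wal->prev_link[0], 0);
}

/* Item 1: reserve space with a single fetch_add, no lock. */
static uint64_t
toy_reserve(ToyWal *wal, uint64_t size, uint64_t *end)
{
    assert(size > 0);
    uint64_t start = atomic_fetch_add(&wal->CurrBytePos, size);

    *end = start + size;
    assert(*end < TOY_WAL_SIZE);
    return start;
}

/* Items 2 and 3: write the next record's prev-link before our own data. */
static void
toy_publish_prev_link(ToyWal *wal, uint64_t start, uint64_t end)
{
    atomic_store_explicit(&wal->prev_link[end], start, memory_order_release);
}

/* Item 2: an inserter waits for its own prev-link before computing the CRC. */
static uint64_t
toy_wait_prev_link(ToyWal *wal, uint64_t start)
{
    uint64_t prev;

    while ((prev = atomic_load_explicit(&wal->prev_link[start],
                                        memory_order_acquire)) == PREV_UNSET)
        ;                           /* spin until the previous inserter publishes */
    return prev;
}
```

In this model a backend would call toy_reserve, then toy_publish_prev_link for its successor, then toy_wait_prev_link
for itself, and only afterwards copy its record. The wait depends only on the immediately preceding inserter having
published one field, not on it having finished copying its record.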

The attached patch passes the regression tests (make check, make check-world). In a performance test of this POC on
SPR (480 vCPU), the optimization helps the TPCC benchmark scale better with the core count; as a result, performance
with all cores enabled improves by 2.04x.

Before we proceed with further patch validation and refinement work, we are eager to hear the community's thoughts and
comments on this optimization so that we can confirm our current work aligns with expectations.