Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Msg-id CABPTF7XLrGChi=TOiqiKitWS2toMyYvzf0q+Ug78wSiONiDrKQ@mail.gmail.com
In response to Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting  (Xuneng Zhou <xunengzhou@gmail.com>)
List pgsql-hackers
Hi,

On Wed, Oct 15, 2025 at 4:43 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Wed, Oct 15, 2025 at 8:31 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi,
> >
> > On Sat, Oct 11, 2025 at 11:02 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > The following is the split patch set. There are certain limitations to
> > > this simplification effort, particularly in patch 2. The
> > > read_local_xlog_page_guts callback demands more functionality from the
> > > facility than the WAIT FOR patch — specifically, it must wait for WAL
> > > flush events, though it does not require timeout handling. In some
> > > sense, parts of patch 3 can be viewed as a superset of the WAIT FOR
> > > patch, since it installs wake-up hooks in more locations. Unlike the
> > > WAIT FOR patch, which only needs wake-ups triggered by replay,
> > > read_local_xlog_page_guts must also handle wake-ups triggered by WAL
> > > flushes.
> > >
> > > Workload characteristics play a key role here. A sorted dlist performs
> > > well when insertions and removals occur in order, achieving O(1)
> > > complexity in the best case. In synchronous replication, insertion
> > > patterns seem generally monotonic with commit LSNs, though not
> > > strictly ordered due to timing variations and contention. When most
> > > insertions remain ordered, a dlist can be efficient. However, as the
> > > number of elements grows and out-of-order insertions become more
> > > frequent, insertions increasingly degrade toward O(n).
> > >
> > > By contrast, a pairing heap maintains stable O(1) insertion for both
> > > ordered and disordered inputs, with amortized O(log n) removals. Since
> > > LSNs in the WAIT FOR command are likely to arrive in a non-sequential
> > > fashion, the pairing heap introduced in v6 provides more predictable
> > > performance under such workloads.
> > >
> > > At this stage (v7), no consolidation between syncrep and xlogwait has
> > > been implemented. This is mainly because the dlist and pairing heap
> > > each works well under different workloads — neither is likely to be
> > > universally optimal. Introducing the facility with a pairing heap
> > > first seems reasonable, as it offers flexibility for future
> > > refactoring: we could later replace dlist with a heap or adopt a
> > > modular design depending on observed workload characteristics.
> > >
> >
> > v8-0002 removed the early fast check before addLSNWaiter in WaitForLSNReplay:
> > the chance that the server state has already changed at that point is
> > small, so the check is not worth the extra branch and added code complexity.
> >
>
> Made minor changes to the #include of xlogwait.h in patch 2 to calm the CF bots down.

Now that the LSN-waiting infrastructure (3b4e53a) and WAL replay
wake-up calls (447aae1) are in place, this patch has been updated to
make use of them.
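
To make the difference in requirements discussed above concrete, here is one
way the distinction could be expressed at the interface level. This is purely
illustrative: WaitLSNEventKind and WaitForLSNEvent are made-up names, not the
API of the committed facility.

/* Illustrative only; names below are hypothetical. */
#include "postgres.h"

#include "access/xlogdefs.h"

typedef enum WaitLSNEventKind
{
    WAIT_LSN_FLUSH,             /* wake once the flush pointer passes the LSN */
    WAIT_LSN_REPLAY             /* wake once the replay pointer passes the LSN */
} WaitLSNEventKind;

/*
 * The WAIT FOR patch only needs WAIT_LSN_REPLAY plus a timeout, whereas
 * read_local_xlog_page_guts needs both kinds of events and no timeout
 * (timeout_ms < 0 meaning "wait forever" in this sketch).
 */
extern bool WaitForLSNEvent(WaitLSNEventKind kind, XLogRecPtr target,
                            long timeout_ms);

In these terms, the WAIT FOR use case is WaitForLSNEvent(WAIT_LSN_REPLAY, lsn,
timeout), while read_local_xlog_page_guts would ask for WAIT_LSN_FLUSH on a
primary and WAIT_LSN_REPLAY on a standby, both without a timeout.
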
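
Here is a minimal sketch of the pairing-heap ordering discussed above, built
on the in-tree lib/pairingheap.h. The struct and comparator names are
hypothetical; only the pairingheap_* pieces are the real API.

#include "postgres.h"

#include "access/xlogdefs.h"
#include "lib/pairingheap.h"

typedef struct WaitLSNWaiter
{
    pairingheap_node heapNode;  /* first member, so the casts below are valid */
    XLogRecPtr  waitLSN;        /* LSN this backend is waiting for */
    int         procno;         /* which backend to wake */
} WaitLSNWaiter;

/*
 * pairingheap keeps the node that compares greatest at the root, so the
 * comparison is inverted to put the smallest waitLSN on top.  Insertion is
 * O(1) regardless of input order and removing the minimum is amortized
 * O(log n), which matches the argument above.
 */
static int
waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
{
    const WaitLSNWaiter *wa = (const WaitLSNWaiter *) a;
    const WaitLSNWaiter *wb = (const WaitLSNWaiter *) b;

    if (wa->waitLSN < wb->waitLSN)
        return 1;
    if (wa->waitLSN > wb->waitLSN)
        return -1;
    return 0;
}

A heap built with pairingheap_allocate(waitlsn_cmp, NULL) then lets the
wake-up side pop waiters with pairingheap_first()/pairingheap_remove_first()
until the root's waitLSN exceeds the LSN that was just flushed or replayed.
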
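
On dropping the early fast-path check: the facility's correctness comes from
registering before checking, roughly as in the sketch below (a paraphrase, not
the patch code). addLSNWaiter is the name mentioned above, but its signature
here is a guess, deleteLSNWaiter and the wait event are stand-ins, and error
handling and promotion checks are omitted.

#include "postgres.h"

#include "access/xlogdefs.h"
#include "access/xlogrecovery.h"
#include "miscadmin.h"
#include "storage/latch.h"
#include "utils/wait_event.h"

/* hypothetical declarations standing in for the facility's internals */
extern void addLSNWaiter(XLogRecPtr lsn);
extern void deleteLSNWaiter(void);

static void
WaitForLSNReplaySketch(XLogRecPtr target)
{
    /*
     * Register first, then (re)check: a wake-up racing with registration
     * cannot be lost.  An extra "already past the target / server state
     * changed" check before addLSNWaiter() only saves work in the rare
     * case, which is why v8-0002 drops it.
     */
    addLSNWaiter(target);

    while (GetXLogReplayRecPtr(NULL) < target)
    {
        (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH,
                         -1L, PG_WAIT_EXTENSION);
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();
    }

    deleteLSNWaiter();
}
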
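
The flush-side wake-up that read_local_xlog_page_guts additionally relies on
could be hooked in roughly like this. WakeupLSNWaiters is a made-up name for
whatever the facility actually exports, and where exactly the call sits in the
flush path is left to the patch.

#include "postgres.h"

#include "access/xlog.h"
#include "access/xlogdefs.h"

/* hypothetical entry point of the LSN-waiting facility */
extern void WakeupLSNWaiters(XLogRecPtr wakeupLSN);

/*
 * To be called from the WAL-flush path once the flush pointer has advanced,
 * mirroring the replay-side wake-up calls (447aae1): every backend whose
 * target LSN is now flushed gets its latch set instead of having to poll.
 */
static void
NotifyFlushWaiters(void)
{
    WakeupLSNWaiters(GetFlushRecPtr(NULL));
}
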
Please take a look.

Best,
Xuneng

