Home > mailing lists

Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers

From	Xuneng Zhou
Subject	Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Date	October 4 10:21:07
Msg-id	CABPTF7Vt-xb-LCZWGFEe-LM1+HLAyc6CDtORmes=9fzEQcuqKw@mail.gmail.com Whole thread Raw
In response to	Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting (Xuneng Zhou <xunengzhou@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

On Sat, Oct 4, 2025 at 10:25 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Fri, Oct 3, 2025 at 9:50 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi Michael,
> >
> > Thanks for your review.
> >
> > On Fri, Oct 3, 2025 at 2:24 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Thu, Oct 02, 2025 at 11:06:14PM +0800, Xuneng Zhou wrote:
> > > > v5-0002 separates the waitlsn_cmp() comparator function into two distinct
> > > > functions (waitlsn_replay_cmp and waitlsn_flush_cmp) for the replay
> > > > and flush heaps, respectively.
> > >
> > > The primary goal that you want to achieve here is a replacement of the
> > > wait/sleep logic of read_local_xlog_page_guts() with a condition
> > > variable, and design a new facility to make the callback more
> > > responsive on polls.  That's a fine idea in itself.  However I would
> > > suggest to implement something that does not depend entirely on WAIT
> > > FOR, which is how your patch is presented.  Instead of having your
> > > patch depend on an entirely different feature, it seems to me that you
> > > should try to extract from this other feature the basics that you are
> > > looking for, and make them shared between the WAIT FOR patch and what
> > > you are trying to achieve here.  You should not need something as
> > > complex as what the other feature needs for a page read callback in
> > > the backend.
> > >
> > > At the end, I suspect that you will reuse a slight portion of it (or
> > > perhaps nothing at all, actually, but I did not look at the full scope
> > > of it).  You should try to present your patch so as it is in a
> > > reviewable state, so as others would be able to read it and understand
> > > it.  WAIT FOR is much more complex than what you want to do here
> > > because it covers larger areas of the code base and needs to worry
> > > about more cases.  So, you should implement things so as the basic
> > > pieces you want to build on top of are simpler, not more complicated.
> > > Simpler means easier to review and easier to catch problems, designed
> > > in a way that depends on how you want to fix your problem, not
> > > designed in a way that depends on how a completely different feature
> > > deals with its own problems.
> >
> > The core infrastructure shared by both this patch and the WAIT FOR
> > command patch is primarily in xlogwait.c, which provides the mechanism
> > for waiting until a given LSN is reached. Other parts of the code in
> > the WAIT FOR patch—covering the SQL command implementation,
> > documentation, and tests—is not relevant for the current patch. What
> > we need is only the infrastructure in xlogwait.c, on which we can
> > implement the optimization for read_local_xlog_page_guts.
> >
> > Regarding complexity: the initial optimization idea was to introduce
> > condition-variable based waiting, as Heikki suggested in his comment:
> >
> > /*
> >  * Loop waiting for xlog to be available if necessary
> >  *
> >  * TODO: The walsender has its own version of this function, which uses a
> >  * condition variable to wake up whenever WAL is flushed. We could use the
> >  * same infrastructure here, instead of the check/sleep/repeat style of
> >  * loop.
> >  */
> >
> > I reviewed the relevant code in WalSndWakeup and WalSndWait. While
> > these mechanisms reduce polling overhead, they don’t prevent false
> > wakeups. Addressing that would likely require a request queue that
> > maps waiters to their target LSNs and issues targeted wakeups—a much
> > more complex design. Given that read_local_xlog_page_guts is not as
> > performance-sensitive as its equivalents, this added complexity may
> > not be justified. So I implemented the initial version of the
> > optimization like  WalSndWakeup and WalSndWait.
> >
> > After this, I came across  the WAIT FOR patch in the mailing list and
> > noticed that the infrastructure in xlogwait.c aligns well with our
> > needs. Based on that, I built the current patch using this shared
> > infra.
> >
> > If we prioritise simplicity and can tolerate occasional false wakeups,
> > then waiting in read_local_xlog_page_guts can be implemented in a much
> > simpler way than the current version. At the same time, the WAIT FOR
> > command seems to be a valuable feature in its own right, and both
> > patches can naturally share the same infrastructure. We could also
> > extract the infra and implement the current patch on it, then Wait for
> > could utilize it as well. Personally, I don’t have a strong preference
> > between the two approaches.
> >
>
> Another potential use for this infra could be static XLogRecPtr
> WalSndWaitForWal(XLogRecPtr loc), I'm planning to hack a version as
> well.
>
> Best,
> Xuneng

v6 refactors and extends the infrastructure from the WAIT FOR command
patch, applying it to read_local_xlog_page_guts. I'm also thinking of
creating a standalone patch/commit for the extended
infra in xlogwait, so it can be reused in different threads.

Best,
Xuneng

Attachment

v6-0001-Replace-inefficient-polling-loops-in-read_local_x.patch

pgsql-hackers by date:

From: Matthias van de Meent
Date: 04 October, 10:05:45
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)

From: Álvaro Herrera
Date: 04 October, 14:57:44
Subject: Re: pg_createsubscriber --dry-run logging concerns

Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers

Attachment

Previous

Next