Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Date
Msg-id CABPTF7UzKNQDrCeomgiPOZTP7-2J4mUYOZ=Vn9rqCyCTqR6rrw@mail.gmail.com
Whole thread Raw
In response to Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting  (Xuneng Zhou <xunengzhou@gmail.com>)
List pgsql-hackers
Hi,

On Fri, Oct 3, 2025 at 9:50 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi Michael,
>
> Thanks for your review.
>
> On Fri, Oct 3, 2025 at 2:24 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Oct 02, 2025 at 11:06:14PM +0800, Xuneng Zhou wrote:
> > > v5-0002 separates the waitlsn_cmp() comparator function into two distinct
> > > functions (waitlsn_replay_cmp and waitlsn_flush_cmp) for the replay
> > > and flush heaps, respectively.
> >
> > The primary goal that you want to achieve here is a replacement of the
> > wait/sleep logic of read_local_xlog_page_guts() with a condition
> > variable, and design a new facility to make the callback more
> > responsive on polls.  That's a fine idea in itself.  However I would
> > suggest to implement something that does not depend entirely on WAIT
> > FOR, which is how your patch is presented.  Instead of having your
> > patch depend on an entirely different feature, it seems to me that you
> > should try to extract from this other feature the basics that you are
> > looking for, and make them shared between the WAIT FOR patch and what
> > you are trying to achieve here.  You should not need something as
> > complex as what the other feature needs for a page read callback in
> > the backend.
> >
> > At the end, I suspect that you will reuse a slight portion of it (or
> > perhaps nothing at all, actually, but I did not look at the full scope
> > of it).  You should try to present your patch so as it is in a
> > reviewable state, so as others would be able to read it and understand
> > it.  WAIT FOR is much more complex than what you want to do here
> > because it covers larger areas of the code base and needs to worry
> > about more cases.  So, you should implement things so as the basic
> > pieces you want to build on top of are simpler, not more complicated.
> > Simpler means easier to review and easier to catch problems, designed
> > in a way that depends on how you want to fix your problem, not
> > designed in a way that depends on how a completely different feature
> > deals with its own problems.
>
> The core infrastructure shared by both this patch and the WAIT FOR
> command patch is primarily in xlogwait.c, which provides the mechanism
> for waiting until a given LSN is reached. Other parts of the code in
> the WAIT FOR patch—covering the SQL command implementation,
> documentation, and tests—is not relevant for the current patch. What
> we need is only the infrastructure in xlogwait.c, on which we can
> implement the optimization for read_local_xlog_page_guts.
>
> Regarding complexity: the initial optimization idea was to introduce
> condition-variable based waiting, as Heikki suggested in his comment:
>
> /*
>  * Loop waiting for xlog to be available if necessary
>  *
>  * TODO: The walsender has its own version of this function, which uses a
>  * condition variable to wake up whenever WAL is flushed. We could use the
>  * same infrastructure here, instead of the check/sleep/repeat style of
>  * loop.
>  */
>
> I reviewed the relevant code in WalSndWakeup and WalSndWait. While
> these mechanisms reduce polling overhead, they don’t prevent false
> wakeups. Addressing that would likely require a request queue that
> maps waiters to their target LSNs and issues targeted wakeups—a much
> more complex design. Given that read_local_xlog_page_guts is not as
> performance-sensitive as its equivalents, this added complexity may
> not be justified. So I implemented the initial version of the
> optimization like  WalSndWakeup and WalSndWait.
>
> After this, I came across  the WAIT FOR patch in the mailing list and
> noticed that the infrastructure in xlogwait.c aligns well with our
> needs. Based on that, I built the current patch using this shared
> infra.
>
> If we prioritise simplicity and can tolerate occasional false wakeups,
> then waiting in read_local_xlog_page_guts can be implemented in a much
> simpler way than the current version. At the same time, the WAIT FOR
> command seems to be a valuable feature in its own right, and both
> patches can naturally share the same infrastructure. We could also
> extract the infra and implement the current patch on it, then Wait for
> could utilize it as well. Personally, I don’t have a strong preference
> between the two approaches.
>

Another potential use for this infra could be static XLogRecPtr
WalSndWaitForWal(XLogRecPtr loc), I'm planning to hack a version as
well.

Best,
Xuneng



pgsql-hackers by date:

Previous
From: Xuneng Zhou
Date:
Subject: Re: Implement waiting for wal lsn replay: reloaded