Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers
From | Xuneng Zhou |
---|---|
Subject | Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting |
Date | |
Msg-id | CABPTF7Vt-xb-LCZWGFEe-LM1+HLAyc6CDtORmes=9fzEQcuqKw@mail.gmail.com Whole thread Raw |
In response to | Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting (Xuneng Zhou <xunengzhou@gmail.com>) |
List | pgsql-hackers |
Hi, On Sat, Oct 4, 2025 at 10:25 AM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > Hi, > > On Fri, Oct 3, 2025 at 9:50 PM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > > > Hi Michael, > > > > Thanks for your review. > > > > On Fri, Oct 3, 2025 at 2:24 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Thu, Oct 02, 2025 at 11:06:14PM +0800, Xuneng Zhou wrote: > > > > v5-0002 separates the waitlsn_cmp() comparator function into two distinct > > > > functions (waitlsn_replay_cmp and waitlsn_flush_cmp) for the replay > > > > and flush heaps, respectively. > > > > > > The primary goal that you want to achieve here is a replacement of the > > > wait/sleep logic of read_local_xlog_page_guts() with a condition > > > variable, and design a new facility to make the callback more > > > responsive on polls. That's a fine idea in itself. However I would > > > suggest to implement something that does not depend entirely on WAIT > > > FOR, which is how your patch is presented. Instead of having your > > > patch depend on an entirely different feature, it seems to me that you > > > should try to extract from this other feature the basics that you are > > > looking for, and make them shared between the WAIT FOR patch and what > > > you are trying to achieve here. You should not need something as > > > complex as what the other feature needs for a page read callback in > > > the backend. > > > > > > At the end, I suspect that you will reuse a slight portion of it (or > > > perhaps nothing at all, actually, but I did not look at the full scope > > > of it). You should try to present your patch so as it is in a > > > reviewable state, so as others would be able to read it and understand > > > it. WAIT FOR is much more complex than what you want to do here > > > because it covers larger areas of the code base and needs to worry > > > about more cases. So, you should implement things so as the basic > > > pieces you want to build on top of are simpler, not more complicated. > > > Simpler means easier to review and easier to catch problems, designed > > > in a way that depends on how you want to fix your problem, not > > > designed in a way that depends on how a completely different feature > > > deals with its own problems. > > > > The core infrastructure shared by both this patch and the WAIT FOR > > command patch is primarily in xlogwait.c, which provides the mechanism > > for waiting until a given LSN is reached. Other parts of the code in > > the WAIT FOR patch—covering the SQL command implementation, > > documentation, and tests—is not relevant for the current patch. What > > we need is only the infrastructure in xlogwait.c, on which we can > > implement the optimization for read_local_xlog_page_guts. > > > > Regarding complexity: the initial optimization idea was to introduce > > condition-variable based waiting, as Heikki suggested in his comment: > > > > /* > > * Loop waiting for xlog to be available if necessary > > * > > * TODO: The walsender has its own version of this function, which uses a > > * condition variable to wake up whenever WAL is flushed. We could use the > > * same infrastructure here, instead of the check/sleep/repeat style of > > * loop. > > */ > > > > I reviewed the relevant code in WalSndWakeup and WalSndWait. While > > these mechanisms reduce polling overhead, they don’t prevent false > > wakeups. Addressing that would likely require a request queue that > > maps waiters to their target LSNs and issues targeted wakeups—a much > > more complex design. Given that read_local_xlog_page_guts is not as > > performance-sensitive as its equivalents, this added complexity may > > not be justified. So I implemented the initial version of the > > optimization like WalSndWakeup and WalSndWait. > > > > After this, I came across the WAIT FOR patch in the mailing list and > > noticed that the infrastructure in xlogwait.c aligns well with our > > needs. Based on that, I built the current patch using this shared > > infra. > > > > If we prioritise simplicity and can tolerate occasional false wakeups, > > then waiting in read_local_xlog_page_guts can be implemented in a much > > simpler way than the current version. At the same time, the WAIT FOR > > command seems to be a valuable feature in its own right, and both > > patches can naturally share the same infrastructure. We could also > > extract the infra and implement the current patch on it, then Wait for > > could utilize it as well. Personally, I don’t have a strong preference > > between the two approaches. > > > > Another potential use for this infra could be static XLogRecPtr > WalSndWaitForWal(XLogRecPtr loc), I'm planning to hack a version as > well. > > Best, > Xuneng v6 refactors and extends the infrastructure from the WAIT FOR command patch, applying it to read_local_xlog_page_guts. I'm also thinking of creating a standalone patch/commit for the extended infra in xlogwait, so it can be reused in different threads. Best, Xuneng
Attachment
pgsql-hackers by date: