Re: Implement waiting for wal lsn replay: reloaded - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Implement waiting for wal lsn replay: reloaded
Date
Msg-id CABPTF7U7_iYEXn3xeV29-xwFWqTc+2N4=CdL7qEzPPg=hv8eqA@mail.gmail.com
In response to Re: Implement waiting for wal lsn replay: reloaded  (Álvaro Herrera <alvherre@kurilemu.de>)
Responses Re: Implement waiting for wal lsn replay: reloaded
List pgsql-hackers
Hi,

Thanks for working on this.

I’ve just come across this thread and haven’t had a chance to dig into
the patch yet, but I’m keen to review it soon. In the meantime, I have
a quick question: is WAIT FOR REPLAY intended mainly for user-defined
functions, or can internal code invoke it as well?

During a recent performance run [1] I noticed heavy polling in
read_local_xlog_page_guts(). Heikki’s comment from a few months ago
also hints that we could replace this check–sleep–repeat loop with the
condition-variable (CV) infrastructure used by walsender:

/*
 * Loop waiting for xlog to be available if necessary
 *
 * TODO: The walsender has its own version of this function, which uses a
 * condition variable to wake up whenever WAL is flushed. We could use the
 * same infrastructure here, instead of the check/sleep/repeat style of
 * loop.
 */

Because read_local_xlog_page_guts() waits for a specific flush or
replay LSN, polling becomes inefficient when the wait is long. I built
a POC patch that swaps polling for CVs, but a single global CV (or
even separate “flush” and “replay” CVs) isn’t ideal:

The wake-up routines don’t know which LSN each waiter cares about, so
they’d have to broadcast on every flush/replay. Caching the minimum
outstanding LSN could reduce spurious wake-ups, yet wouldn’t eliminate
them, because multiple backends might wait for different LSNs
simultaneously. A more precise solution would require a request queue
that maps waiters to target LSNs and issues targeted wake-ups, adding
complexity.
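
To make the request-queue idea concrete, here is a minimal sketch in
plain C. It is not taken from the POC patch or from PostgreSQL itself:
LsnWaitQueue, lsn_queue_add(), lsn_queue_wake() and MAX_WAITERS are
hypothetical names, XLogRecPtr is only a stand-in typedef, and a real
version would need shared memory, locking, and a latch or CV wake-up
where this sketch merely returns waiter IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr; assumed to be a 64-bit LSN. */
typedef uint64_t XLogRecPtr;

#define MAX_WAITERS 64          /* hypothetical fixed capacity */

typedef struct LsnWaiter
{
    int         waiter_id;      /* hypothetical handle for a sleeping backend */
    XLogRecPtr  target_lsn;     /* LSN this waiter needs flushed/replayed */
} LsnWaiter;

typedef struct LsnWaitQueue
{
    /* Kept sorted by target_lsn in descending order: smallest at the tail. */
    LsnWaiter   items[MAX_WAITERS];
    size_t      count;
} LsnWaitQueue;

void
lsn_queue_init(LsnWaitQueue *q)
{
    assert(q != NULL);
    q->count = 0;
}

/* Register a waiter. Returns 0 on success, -1 if the queue is full. */
int
lsn_queue_add(LsnWaitQueue *q, int waiter_id, XLogRecPtr target_lsn)
{
    size_t      i;

    if (q->count >= MAX_WAITERS)
        return -1;

    /* Shift entries with smaller targets one slot toward the tail. */
    i = q->count;
    while (i > 0 && q->items[i - 1].target_lsn < target_lsn)
    {
        q->items[i] = q->items[i - 1];
        i--;
    }
    q->items[i].waiter_id = waiter_id;
    q->items[i].target_lsn = target_lsn;
    q->count++;
    return 0;
}

/* Smallest outstanding target, or UINT64_MAX when nobody is waiting. */
XLogRecPtr
lsn_queue_min_target(const LsnWaitQueue *q)
{
    return q->count > 0 ? q->items[q->count - 1].target_lsn : UINT64_MAX;
}

/*
 * Called after flush/replay has advanced to current_lsn. Removes only the
 * waiters whose target has been reached and stores their IDs in woken[];
 * a real implementation would wake each of them here. Returns how many
 * waiters were removed (at most max_woken per call).
 */
size_t
lsn_queue_wake(LsnWaitQueue *q, XLogRecPtr current_lsn,
               int *woken, size_t max_woken)
{
    size_t      n = 0;

    while (q->count > 0 && n < max_woken &&
           q->items[q->count - 1].target_lsn <= current_lsn)
    {
        woken[n++] = q->items[q->count - 1].waiter_id;
        q->count--;
    }
    return n;
}
```

With waiters registered for LSNs 300, 100 and 200, advancing to 200
removes only the waiters for 100 and 200 and leaves the one for 300
asleep. The cached-minimum idea corresponds to checking
lsn_queue_min_target() first and skipping the wake-up pass entirely
when the new LSN is still below it.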

Walsender accepts the potential broadcast overhead by using two CVs
for different waiters, so it might be acceptable for
read_local_xlog_page_guts() as well. However, if WAIT FOR REPLAY
becomes available to backend code, we might leverage it to eliminate
the polling while waiting for replay in read_local_xlog_page_guts()
without introducing a bespoke dispatcher. I’d appreciate any thoughts
on whether that use case is in scope.

Best,
Xuneng

[1] https://www.postgresql.org/message-id/CABPTF7VuFYm9TtA9vY8ZtS77qsT+yL_HtSDxUFnW3XsdB5b9ew@mail.gmail.com


