On Mon, Sep 22, 2025 at 8:56 PM Jeff Davis <pgsql@j-davis.com> wrote:
On Sat, 2025-09-13 at 22:04 -0700, Bharath Rupireddy wrote:
> Thanks for looking at this. Yes, the WAL writers can zero out flushed
> buffers before WALReadFromBuffers gets to them. However,
> WALReadFromBuffers was intentionally designed as an opportunistic
> optimization - it's a "try this first, quickly" approach before
> falling back to reading from WAL files.
IIRC, one motivation (perhaps the primary motivation?) was to make it possible to read buffers before they are flushed. It was always possible to read already-flushed buffers.
The benefit of reading unflushed buffers is that we can replicate the WAL sooner (though it can't be replayed until the primary flushes it). Is that right?
I'm not certain about the primary motivation, but as it stands, WALReadFromBuffers only reads WAL records present in buffers up to the flush pointer. This is because XLogSendPhysical currently sends records only up to the flush pointer, not beyond.
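To make the current behavior concrete, here is a rough toy model of the "try buffers first, then fall back to files" pattern with the flush-pointer limit. This is only a sketch, not PostgreSQL code: ToyWal, toy_read_from_buffers and toy_wal_read are hypothetical names, LSNs are reduced to plain byte offsets, and page boundaries, locking and buffer recycling races are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_WAL_SIZE 256

/* Hypothetical toy model: LSNs are plain offsets into a tiny WAL stream. */
typedef struct ToyWal
{
    char   file[TOY_WAL_SIZE];    /* stand-in for WAL segment files (flushed data) */
    char   buffers[TOY_WAL_SIZE]; /* stand-in for in-memory WAL buffers */
    size_t buf_start;             /* oldest offset still held in buffers */
    size_t flush_ptr;             /* bytes before this offset are flushed */
    size_t insert_ptr;            /* bytes before this offset are inserted */
} ToyWal;

/* Opportunistic read: may return fewer bytes than asked for, including zero. */
static size_t
toy_read_from_buffers(const ToyWal *wal, char *dst, size_t start, size_t count)
{
    size_t avail;

    /* Miss: data already recycled, or at/after the flush pointer. */
    if (start < wal->buf_start || start >= wal->flush_ptr)
        return 0;

    /* Never hand out bytes beyond the flush pointer. */
    avail = wal->flush_ptr - start;
    if (count > avail)
        count = avail;

    memcpy(dst, wal->buffers + start, count);
    return count;
}

/* Try the buffers first; read whatever is still missing from the "file". */
static size_t
toy_wal_read(const ToyWal *wal, char *dst, size_t start, size_t count,
             size_t *from_buffers)
{
    size_t nread;

    /* Like a sender that ships only flushed WAL: never ask past flush_ptr. */
    assert(start + count <= wal->flush_ptr);

    nread = toy_read_from_buffers(wal, dst, start, count);
    *from_buffers = nread;

    if (nread < count)
        memcpy(dst + nread, wal->file + start + nread, count - nread);

    return count;
}
```

In this model, sending unflushed WAL would amount to relaxing the flush_ptr checks to insert_ptr. The file fallback could then no longer serve those bytes, so a buffer miss would have to be handled some other way, such as waiting or retrying.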
I am currently testing a patch developed by Melih Mutlu that implements the functionality you described, sending unflushed buffers during physical replication. After some tuning, the patch has shown a 5 percent improvement in TPS for synchronous replication with remote_write. I am working on further improving the patch before sharing it on the hackers mailing list.