Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Minimal logical decoding on standbys
Msg-id 20210407203218.pestfxu7watrnog4@alap3.anarazel.de
In response to Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
Responses Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2021-04-07 10:09:54 -0700, Andres Freund wrote:
> There's also no test for a recovery conflict due to row removal. Despite
> that being a substantial part of the patchset.

Another aspect that wasn't tested *at all*: Whether logical decoding
actually produces useful and correct results.


> I'm tempted to throw out 024 - all of its tests seem fragile and prove
> little. And then add a few more tests to 025 (and renaming it).

While working on this I found a somewhat substantial issue:

When the primary is idle, logical decoding via walsender on the standby
will typically not process the records until further WAL writes come in
from the primary, or until 10s have elapsed.

The problem is that WalSndWaitForWal() waits for the *replay* LSN to
increase, but gets woken up by walreceiver when new WAL has been
flushed. That means walsenders will typically get woken up at the same
time as the startup process, so by the time the logical walsender
checks GetXLogReplayRecPtr() it's unlikely that the startup process has
already replayed the record and updated XLogCtl->lastReplayedEndRecPtr.
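
To make the ordering problem concrete, here is a toy model of a single
flush on an otherwise idle primary. It is only a sketch, not PostgreSQL
code: Standby, decode_delay and the walsender_runs_first flag are
hypothetical names made up for illustration, and the 10s constant is
simply the fallback delay described above.

```python
from dataclasses import dataclass

WALSENDER_TIMEOUT_S = 10  # assumed fallback wake-up, per the 10s noted above


@dataclass
class Standby:
    flush_lsn: int = 0   # WAL flushed to disk by walreceiver
    replay_lsn: int = 0  # stand-in for XLogCtl->lastReplayedEndRecPtr


def decode_delay(standby: Standby, new_lsn: int, walsender_runs_first: bool) -> int:
    """Seconds until the walsender sees new_lsn as replayed (toy model).

    One flush happens, both the startup process and the walsender are
    woken by it, and no further wake-up arrives before the timeout.
    """
    # walreceiver flushes the record and wakes both processes.
    standby.flush_lsn = new_lsn

    if not walsender_runs_first:
        # Startup process wins the race and replays before the check.
        standby.replay_lsn = standby.flush_lsn

    # The walsender's check, analogous to comparing GetXLogReplayRecPtr()
    # against the LSN it is waiting for.
    if standby.replay_lsn >= new_lsn:
        return 0

    # The walsender saw a stale replay LSN and went back to sleep. The
    # startup process replays the record afterwards, but nothing wakes
    # the walsender for that, so it only notices at the timeout.
    standby.replay_lsn = standby.flush_lsn
    return WALSENDER_TIMEOUT_S
```

In this model decode_delay(Standby(), 100, walsender_runs_first=True)
yields the full timeout, while the same call with
walsender_runs_first=False yields no delay. The real scheduling is of
course not a boolean, but the first case is the one that typically
happens.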

I think fixing this would require changes that are too invasive at this
point. I think we might be able to live with the 10s delay for one
release, but it sure is ugly :(.

Greetings,

Andres Freund


