On Mon, Apr 13, 2020 at 09:45:16PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > This seems to have made the following race condition easier to hit:
> > https://www.postgresql.org/message-id/flat/20200206074552.GB3326097%40rfd.leadboat.com
> > https://www.postgresql.org/message-id/flat/21519.1585272409%40sss.pgh.pa.us
>
> Yeah, I just came to the same guess in the other thread.
>
> > While I don't think that indicates anything wrong with the fix for $SUBJECT,
> > creating more buildfarm noise is itself bad. I am inclined to revert the fix
> > after a week. Not immediately, in case it uncovers lower-probability bugs.
> > I'd then re-commit it after one of those threads fixes the other bug. Would
> > anyone like to argue for a revert earlier, later, or not at all?
>
> I don't think you should revert. Those failures are (just) often enough
> to be annoying but I do not think that a proper fix is very far away.
That works for me, but an actual defect may trigger a revert. Fujii Masao
reported high walsender CPU usage after this patch. The patch caused idle
physical walsenders to use 100% CPU. When caught up, the
WalSndSendDataCallback for logical replication, XLogSendLogical(), sleeps
until more WAL is available. XLogSendPhysical() just returns when caught up.
No amount of WAL is too small for physical replication to dispatch, but
logical replication needs the full xl_tot_len of a record. Some options:
1. Make XLogSendPhysical() more like XLogSendLogical(), calling
WalSndWaitForWal() when no WAL is available. A quick version of this
passes tests, but I'll need to audit WalSndWaitForWal() for things that are
wrong for physical replication.
2. Make XLogSendLogical() more like XLogSendPhysical(), returning when
insufficient WAL is available. This complicates the xlogreader.h API to
pass back "wait for this XLogRecPtr", and we'd then persist enough state to
resume decoding. This doesn't have any advantages to make up for those.
3. Don't avoid waiting in WalSndLoop(); instead, fix the stall by copying the
WalSndKeepalive() call from WalSndWaitForWal() to WalSndLoop(). This risks
further drift between the two wait sites; on the other hand, one could
refactor later to help avoid that.
4. Keep the WalSndLoop() wait, but condition it on !logical. This is the
minimal fix, but it crudely punches through the abstraction between
WalSndLoop() and its WalSndSendDataCallback.
I'm favoring (1). Other preferences?