Re: 001_rep_changes.pl stalls - Mailing list pgsql-hackers

From Noah Misch
Subject Re: 001_rep_changes.pl stalls
Date
Msg-id 20200418070142.GA1075445@rfd.leadboat.com
Whole thread Raw
In response to Re: 001_rep_changes.pl stalls  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: 001_rep_changes.pl stalls  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Re: 001_rep_changes.pl stalls  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Fri, Apr 17, 2020 at 05:06:29PM +0900, Kyotaro Horiguchi wrote:
> At Fri, 17 Apr 2020 17:00:15 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > By the way, if latch is consumed in WalSndLoop, succeeding call to
> > WalSndWaitForWal cannot be woke-up by the latch-set.  Doesn't that
> > cause missing wakeups? (in other words, overlooking of wakeup latch).
> 
> - Since the only source other than timeout of walsender wakeup is latch,
> - we should avoid wasteful consuming of latch. (It is the same issue
> - with [1]).
> 
> + Since walsender is wokeup by LSN advancement via latch, we should
> + avoid wasteful consuming of latch. (It is the same issue with [1]).
> 
> 
> > If wakeup signal is not remembered on walsender (like
> > InterruptPending), WalSndPhysical cannot enter a sleep with
> > confidence.

No; per latch.h, "What must be avoided is placing any checks for asynchronous
events after WaitLatch and before ResetLatch, as that creates a race
condition."  In other words, the thing to avoid is calling ResetLatch()
without next examining all pending work that a latch would signal.  Each
walsender.c WaitLatch call does follow the rules.

On Sat, Apr 18, 2020 at 12:29:58AM +0900, Fujii Masao wrote:
> On 2020/04/17 14:41, Noah Misch wrote:
> >1. Make XLogSendPhysical() more like XLogSendLogical(), calling
> >    WalSndWaitForWal() when no WAL is available.  A quick version of this
> >    passes tests, but I'll need to audit WalSndWaitForWal() for things that are
> >    wrong for physical replication.
> 
> (1) makes even physical replication walsender sleep in two places and
> which seems to make the code for physical replication complicated
> more than necessary. I'd like to avoid (1) if possible.

Good point.

> >2. Make XLogSendLogical() more like XLogSendPhysical(), returning when
> >    insufficient WAL is available.  This complicates the xlogreader.h API to
> >    pass back "wait for this XLogRecPtr", and we'd then persist enough state to
> >    resume decoding.  This doesn't have any advantages to make up for those.
> >
> >3. Don't avoid waiting in WalSndLoop(); instead, fix the stall by copying the
> >    WalSndKeepalive() call from WalSndWaitForWal() to WalSndLoop().  This risks
> >    further drift between the two wait sites; on the other hand, one could
> >    refactor later to help avoid that.
> 
> Since the additional call of WalSndKeepalive() is necessary only for
> logical replication, it should be copied to, e.g., XLogSendLogical(),
> instead of WalSndLoop()? For example, when XLogSendLogical() sets
> WalSndCaughtUp to true, it should call WalSndKeepalive()?

We'd send a keepalive even when pq_flush_if_writable() can't empty the output
buffer.  That could be acceptable, but it's not ideal.

> The root problem seems that when WAL record that's no-opes for
> logical rep is processed, keep alive message has not sent immediately,
> in spite of that we want pg_stat_replication to be updated promptly.

The degree of promptness should be predictable, at least.  If we removed the
WalSndKeepalive() from WalSndWaitForWal(), pg_stat_replication updates would
not be prompt, but they would be predictable.  I do, however, think prompt
updates are worthwhile.

> (3) seems to try to address this problem straightly and looks better to me.
> 
> >4. Keep the WalSndLoop() wait, but condition it on !logical.  This is the
> >    minimal fix, but it crudely punches through the abstraction between
> >    WalSndLoop() and its WalSndSendDataCallback.
> 
> (4) also looks good because it's simple, if we can redesign those
> functions in good shape.

Let's do that.  I'm attaching the replacement implementation and the revert of
v1.

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: fixing old_snapshot_threshold's time->xid mapping
Next
From: Erik Rijkers
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions