Re: 001_rep_changes.pl stalls - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: 001_rep_changes.pl stalls
Date
Msg-id fba5fc85-205a-3aeb-9bf6-a124f762aa27@oss.nttdata.com
Whole thread Raw
In response to Re: 001_rep_changes.pl stalls  (Noah Misch <noah@leadboat.com>)
Responses Re: 001_rep_changes.pl stalls
List pgsql-hackers

On 2020/04/17 14:41, Noah Misch wrote:
> On Mon, Apr 13, 2020 at 09:45:16PM -0400, Tom Lane wrote:
>> Noah Misch <noah@leadboat.com> writes:
>>> This seems to have made the following race condition easier to hit:
>>> https://www.postgresql.org/message-id/flat/20200206074552.GB3326097%40rfd.leadboat.com
>>> https://www.postgresql.org/message-id/flat/21519.1585272409%40sss.pgh.pa.us
>>
>> Yeah, I just came to the same guess in the other thread.
>>
>>> While I don't think that indicates anything wrong with the fix for $SUBJECT,
>>> creating more buildfarm noise is itself bad.  I am inclined to revert the fix
>>> after a week.  Not immediately, in case it uncovers lower-probability bugs.
>>> I'd then re-commit it after one of those threads fixes the other bug.  Would
>>> anyone like to argue for a revert earlier, later, or not at all?
>>
>> I don't think you should revert.  Those failures are (just) often enough
>> to be annoying but I do not think that a proper fix is very far away.
> 
> That works for me, but an actual defect may trigger a revert.  Fujii Masao
> reported high walsender CPU usage after this patch.  The patch caused idle
> physical walsenders to use 100% CPU.  When caught up, the
> WalSndSendDataCallback for logical replication, XLogSendLogical(), sleeps
> until more WAL is available.  XLogSendPhysical() just returns when caught up.
> No amount of WAL is too small for physical replication to dispatch, but
> logical replication needs the full xl_tot_len of a record.  Some options:
> 
> 1. Make XLogSendPhysical() more like XLogSendLogical(), calling
>     WalSndWaitForWal() when no WAL is available.  A quick version of this
>     passes tests, but I'll need to audit WalSndWaitForWal() for things that are
>     wrong for physical replication.

(1) makes even physical replication walsender sleep in two places and
which seems to make the code for physical replication complicated
more than necessary. I'd like to avoid (1) if possible.

> 
> 2. Make XLogSendLogical() more like XLogSendPhysical(), returning when
>     insufficient WAL is available.  This complicates the xlogreader.h API to
>     pass back "wait for this XLogRecPtr", and we'd then persist enough state to
>     resume decoding.  This doesn't have any advantages to make up for those.
> 
> 3. Don't avoid waiting in WalSndLoop(); instead, fix the stall by copying the
>     WalSndKeepalive() call from WalSndWaitForWal() to WalSndLoop().  This risks
>     further drift between the two wait sites; on the other hand, one could
>     refactor later to help avoid that.

Since the additional call of WalSndKeepalive() is necessary only for
logical replication, it should be copied to, e.g., XLogSendLogical(),
instead of WalSndLoop()? For example, when XLogSendLogical() sets
WalSndCaughtUp to true, it should call WalSndKeepalive()?

The root problem seems that when WAL record that's no-opes for
logical rep is processed, keep alive message has not sent immediately,
in spite of that we want pg_stat_replication to be updated promptly.
(3) seems to try to address this problem straightly and looks better to me.

> 4. Keep the WalSndLoop() wait, but condition it on !logical.  This is the
>     minimal fix, but it crudely punches through the abstraction between
>     WalSndLoop() and its WalSndSendDataCallback.

(4) also looks good because it's simple, if we can redesign those
functions in good shape.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: adding partitioned tables to publications
Next
From: Tom Lane
Date:
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority