001_rep_changes.pl stalls - Mailing list pgsql-hackers

From Noah Misch
Subject 001_rep_changes.pl stalls
Date
Msg-id 20200406063649.GA3738151@rfd.leadboat.com
Whole thread Raw
Responses Re: 001_rep_changes.pl stalls
List pgsql-hackers
Executive summary: the "MyWalSnd->write < sentPtr" in WalSndWaitForWal() is
important for promptly updating pg_stat_replication.  When caught up, we
should impose that logic before every sleep.  The one-line fix is to sleep in
WalSndLoop() only when pq_is_send_pending(), not when caught up.


On my regular development machine, src/test/subscription/t/001_rep_changes.pl
stalls for ~10s at this wait_for_catchup:

  $node_publisher->safe_psql('postgres', "DELETE FROM tab_rep");

  # Restart the publisher and check the state of the subscriber which
  # should be in a streaming state after catching up.
  $node_publisher->stop('fast');
  $node_publisher->start;

  $node_publisher->wait_for_catchup('tap_sub');

That snippet emits three notable physical WAL records.  There's a
Transaction/COMMIT at the end of the DELETE, an XLOG/CHECKPOINT_SHUTDOWN, and
an XLOG/FPI_FOR_HINT.

The buildfarm has stalled there, but it happens probably less than half the
time.  Examples[1] showing the stall:


https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mandrill&dt=2020-03-20%2017%3A09%3A53&stg=subscription-check

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=thorntail&dt=2020-03-22%2019%3A51%3A38&stg=subscription-check

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hoverfly&dt=2020-03-19%2003%3A35%3A01&stg=subscription-check

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hoverfly&dt=2020-03-20%2015%3A15%3A01&stg=subscription-check

Here's the most-relevant walsender call tree:

WalSndLoop
  XLogSendLogical (caller invokes once per loop iteration, via send_data callback)
    XLogReadRecord (caller invokes once)
      ReadPageInternal (caller invokes twice in this test; more calls are possible)
        logical_read_xlog_page (caller skips when page is same as last call, else invokes 1-2 times via
state->read_page()callback, registered in StartLogicalReplication)
 
          WalSndWaitForWal (caller invokes once; has fast path)


The cause is a race involving the flow of reply messages (send_feedback()
messages) from logical apply worker to walsender.  Here are two sequencing
patterns; the more-indented parts are what differ.  Stalling pattern:

  sender reads Transaction/COMMIT and sends the changes
  receiver applies the changes
  receiver send_feedback() reports progress up to Transaction/COMMIT
    sender accepts the report
    sender reads XLOG/CHECKPOINT_SHUTDOWN and/or XLOG/FPI_FOR_HINT, which are no-ops for logical rep
    sender WalSndCaughtUp becomes true; sender sleeps in WalSndLoop()
    receiver wal_receiver_status_interval elapses; receiver reports progress up to Transaction/COMMIT
  sender wakes up, accepts the report
  sender calls WalSndWaitForWal(), which sends a keepalive due to "MyWalSnd->write < sentPtr"
  receiver gets keepalive, send_feedback() reports progress up to XLOG/FPI_FOR_HINT

Non-stalling pattern (more prevalent with lower machine performance):

  sender reads Transaction/COMMIT and sends the changes
    sender reads XLOG/CHECKPOINT_SHUTDOWN and/or XLOG/FPI_FOR_HINT, which are no-ops for logical rep
    sender WalSndCaughtUp becomes true; sender sleeps in WalSndLoop()
  receiver applies the changes
  receiver send_feedback() reports progress up to Transaction/COMMIT
  sender wakes up, accepts the report
  sender calls WalSndWaitForWal(), which sends a keepalive due to "MyWalSnd->write < sentPtr"
  receiver gets keepalive, send_feedback() reports progress up to XLOG/FPI_FOR_HINT


The fix is to test "MyWalSnd->write < sentPtr" before more sleeps.  The test
is unnecessary when sleeping due to pq_is_send_pending(); in that case, the
receiver is not idle and will reply before idling.  I changed WalSndLoop() to
sleep only for pq_is_send_pending().  For all other sleep reasons, the sleep
will happen in WalSndWaitForWal().  Attached.  I don't know whether this is
important outside of testing scenarios.  I lean against back-patching, but I
will back-patch if someone thinks this qualifies as a performance bug.

Thanks,
nm

[1] I spot-checked only my animals, since I wanted to experiment on an
affected animal.

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)
Next
From: Michael Paquier
Date:
Subject: Re: Protocol problem with GSSAPI encryption?