Re: Logical replication timeout problem - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Logical replication timeout problem
Msg-id CAHut+PsTpZSjnSjjNi0C-Yt+8QQFpHc=Oa1LRBboE2aVsJNLDg@mail.gmail.com
In response to RE: Logical replication timeout problem  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Responses Re: Logical replication timeout problem  (Amit Kapila <amit.kapila16@gmail.com>)
RE: Logical replication timeout problem  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
Here are my review comments for patch v4-0001

======
General

1.

It makes no real difference, but I was wondering about:
"update txn progress" versus "update progress txn"

I think the first way sounds more natural. YMMV.

If you change this then it impacts the typedef, function name,
comment, and member name:

ReorderBufferUpdateProgressTxnCB --> ReorderBufferUpdateTxnProgressCB

"/* update progress txn callback */" --> "/* update txn progress callback */"

update_progress_txn_cb_wrapper --> update_txn_progress_cb_wrapper

update_progress_txn --> update_txn_progress
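
For example, the renamed callback typedef might end up looking like
this (just a sketch, reusing the argument list of the wrapper from the
v4 patch):

/* update txn progress callback */
typedef void (*ReorderBufferUpdateTxnProgressCB) (ReorderBuffer *rb,
                                                  ReorderBufferTXN *txn,
                                                  ReorderBufferChange *change);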

======
Commit message

2.

The problem is when there is a DDL in a transaction that generates lots of
temporary data due to rewrite rules, these temporary data will not be processed
by the pgoutput plugin. The previous commit (f95d53e) only fixed timeouts
caused by filtering out changes in pgoutput. Therefore, the previous fix for
DML had no impact on this case.

~

IMO this still needs some rewording to say up-front what the actual
problem is -- i.e. an avoidable timeout occurring.

SUGGESTION (or something like this...)

When there is a DDL in a transaction that generates lots of temporary
data due to rewrite rules, this temporary data will not be processed
by the pgoutput plugin. This means it is possible for a timeout to
occur if a sufficiently long time elapses since the last pgoutput
message. A previous commit (f95d53e) fixed a similar scenario in this
area, but that only fixed timeouts for DML going through pgoutput, so
it did not address this DDL timeout case.

======
src/backend/replication/logical/logical.c

3. update_progress_txn_cb_wrapper

+/*
+ * Update progress callback while processing a transaction.
+ *
+ * Try to update progress and send a keepalive message during sending data of a
+ * transaction (and its subtransactions) to the output plugin.
+ *
+ * For a large transaction, if we don't send any change to the downstream for a
+ * long time (exceeds the wal_receiver_timeout of standby) then it can timeout.
+ * This can happen when all or most of the changes are either not published or
+ * got filtered out.
+ */
+static void
+update_progress_txn_cb_wrapper(ReorderBuffer *cache, ReorderBufferTXN *txn,
+    ReorderBufferChange *change)

Simplify the "Try to..." paragraph. The other part should also mention
DDL.

SUGGESTION

Try to send a keepalive message during transaction processing.

This is done because if we don't send any change to the downstream for
a long time (exceeds the wal_receiver_timeout of standby), then it can
timeout. This can happen for large DDL, or for large transactions when
all or most of the changes are either not published or got filtered
out.
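
Putting that together, the whole header comment might read like this
(just a sketch -- I've kept the patch's first sentence and the
function signature as-is):

/*
 * Update progress callback while processing a transaction.
 *
 * Try to send a keepalive message during transaction processing.
 *
 * This is done because if we don't send any change to the downstream for
 * a long time (exceeds the wal_receiver_timeout of standby), then it can
 * timeout. This can happen for large DDL, or for large transactions when
 * all or most of the changes are either not published or got filtered
 * out.
 */
static void
update_progress_txn_cb_wrapper(ReorderBuffer *cache, ReorderBufferTXN *txn,
                               ReorderBufferChange *change)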

======
.../replication/logical/reorderbuffer.c

4. ReorderBufferProcessTXN

@@ -2105,6 +2105,19 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,

  PG_TRY();
  {
+ /*
+  * Static variable used to accumulate the number of changes while
+  * processing txn.
+  */
+ static int changes_count = 0;
+
+ /*
+  * Sending keepalive messages after every change has some overhead, but
+  * testing showed there is no noticeable overhead if keepalive is only
+  * sent after every ~100 changes.
+  */
+#define CHANGES_THRESHOLD 100
+

IMO these can be relocated to be declared/defined inside the "while"
loop -- i.e. closer to where they are being used.
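
For example (just a sketch -- the loop shown is the existing
change-iteration loop of ReorderBufferProcessTXN):

while ((change = ReorderBufferIterTXNNext(rb, iterstate)) != NULL)
{
    /*
     * Static variable used to accumulate the number of changes while
     * processing txn.
     */
    static int changes_count = 0;

    /*
     * Sending keepalive messages after every change has some overhead,
     * but testing showed there is no noticeable overhead if keepalive
     * is only sent after every ~100 changes.
     */
#define CHANGES_THRESHOLD 100

    ...
}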

~~~

5.

+ if (++changes_count >= CHANGES_THRESHOLD)
+ {
+     rb->update_progress_txn(rb, txn, change);
+     changes_count = 0;
+ }

When there is no update_progress_txn callback, this code still incurs
some small overhead by incrementing and testing against the THRESHOLD
every time, and by needlessly calling into the wrapper every 100
changes. That overhead could be avoided with a simpler up-front check,
as shown below. OTOH, maybe the overhead is insignificant enough that
just leaving the current code is neater?

LogicalDecodingContext *ctx = rb->private_data;
...
if (ctx->update_progress_txn && (++changes_count >= CHANGES_THRESHOLD))
{
    rb->update_progress_txn(rb, txn, change);
    changes_count = 0;
}
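
(FWIW, rb->private_data here is the LogicalDecodingContext -- that is
how the callback wrappers in logical.c get hold of it too -- and
ctx->update_progress_txn is assumed to be wherever the patch exposes
the callback pointer for this kind of test.)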

------
Kind Regards,
Peter Smith.
Fujitsu Australia


