Re: Logical replication timeout problem - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Logical replication timeout problem |
Date | |
Msg-id | CAHut+PvraNWTMnhCruQKF+ZxOnv+ZyD_GPN7pAwdEt9o7GYDag@mail.gmail.com Whole thread Raw |
In response to | RE: Logical replication timeout problem ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>) |
Responses |
Re: Logical replication timeout problem
RE: Logical replication timeout problem |
List | pgsql-hackers |
Here are some review comments for patch v3-0001. ====== Commit message 1. The problem is when there is a DDL in a transaction that generates lots of temporary data due to rewrite rules, these temporary data will not be processed by the pgoutput - plugin. Therefore, the previous fix (f95d53e) for DML had no impact on this case. ~ 1a. IMO this comment needs to give a bit of background about the original problem here, rather than just starting with "The problem is" which is describing the flaws of the previous fix. ~ 1b. "pgoutput - plugin" -> "pgoutput plugin" ?? ~~~ 2. To fix this, we introduced a new ReorderBuffer callback - 'ReorderBufferUpdateProgressCB'. This callback is called to try to update the process after each change has been processed during sending data of a transaction (and its subtransactions) to the output plugin. IIUC it's not really "after each change" - shouldn't this comment mention something about the CHANGES_THRESHOLD 100? ====== src/backend/replication/logical/logical.c 3. forward declaration +/* update progress callback */ +static void update_progress_cb_wrapper(ReorderBuffer *cache, + ReorderBufferTXN *txn, + ReorderBufferChange *change); I felt this function wrapper name was a bit misleading... AFAIK every other wrapper really does just wrap their respective functions. But this one seems a bit different because it calls the wrapped function ONLY if some threshold is exceeded. IMO maybe this function could have some name that conveys this better: e.g. update_progress_cb_wrapper_with_threshold ~~~ 4. update_progress_cb_wrapper +/* + * Update progress callback + * + * Try to update progress and send a keepalive message if too many changes were + * processed when processing txn. + * + * For a large transaction, if we don't send any change to the downstream for a + * long time (exceeds the wal_receiver_timeout of standby) then it can timeout. + * This can happen when all or most of the changes are either not published or + * got filtered out. + */ SUGGESTION (instead of the "Try to update" sentence) Send a keepalive message whenever more than <CHANGES_THRESHOLD> changes are encountered while processing a transaction. ~~~ 5. +static void +update_progress_cb_wrapper(ReorderBuffer *cache, ReorderBufferTXN *txn, + ReorderBufferChange *change) +{ + LogicalDecodingContext *ctx = cache->private_data; + LogicalErrorCallbackState state; + ErrorContextCallback errcallback; + static int changes_count = 0; /* Static variable used to accumulate + * the number of changes while + * processing txn. */ + IMO this may be more readable if the static 'changes_count' local var was declared first and separated from the other vars by a blank line. ~~~ 6. + /* + * We don't want to try sending a keepalive message after processing each + * change as that can have overhead. Tests revealed that there is no + * noticeable overhead in doing it after continuously processing 100 or so + * changes. + */ +#define CHANGES_THRESHOLD 100 6a. I think it might be better to define this right at the top of the function adjacent to the 'changes_count' variable (e.g. a bit like the original HEAD code looked) ~ 6b. SUGGESTION (for the comment) Sending keepalive messages after every change has some overhead, but testing showed there is no noticeable overhead if keepalive is only sent after every ~100 changes. ~~~ 7. + + /* + * After continuously processing CHANGES_THRESHOLD changes, we + * try to send a keepalive message if required. + */ + if (++changes_count >= CHANGES_THRESHOLD) + { + ctx->update_progress(ctx, ctx->write_location, ctx->write_xid, false); + changes_count = 0; + } + 7a. SUGGESTION (for comment) Send a keepalive message after every CHANGES_THRESHOLD changes. ~ 7b. Would it be neater to just call OutputPluginUpdateProgress here instead? e.g. BEFORE ctx->update_progress(ctx, ctx->write_location, ctx->write_xid, false); AFTER OutputPluginUpdateProgress(ctx, false); ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: