RE: Logical replication timeout problem - Mailing list pgsql-hackers

From wangw.fnst@fujitsu.com
Subject RE: Logical replication timeout problem
Date
Msg-id OS3PR01MB62750D04D4A84A3FA27D7B099E039@OS3PR01MB6275.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Logical replication timeout problem  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
Responses Re: Logical replication timeout problem
RE: Logical replication timeout problem
List pgsql-hackers
On Mon, Feb 28, 2022 at 6:58 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> Dear Wang,
> 
> > Attached a new patch that addresses following improvements I have got
> > so far as
> > comments:
> > 1. Consider other changes that need to be skipped(truncate, DDL and
> > function calls pg_logical_emit_message). [suggestion by Ajin, Amit]
> > (Add a new function SendKeepaliveIfNecessary for trying to send
> > keepalive
> > message.)
> > 2. Set the threshold conservatively to a static value of
> > 10000.[suggestion by Amit, Kuroda-San] 3. Reset sendTime in function
> > WalSndUpdateProgress when send_keep_alive is false. [suggestion by
> > Amit]
> 
> Thank you for giving a good patch! I'll check more detail later, but it can be
> applied my codes and passed check world.
> I put some minor comments:
Thanks for your comments.

> ```
> + * Try to send keepalive message
> ```
> 
> Maybe missing "a"?
Fixed. Add missing "a".

> ```
> +       /*
> +       * After continuously skipping SKIPPED_CHANGES_THRESHOLD changes, try
> to send a
> +       * keepalive message.
> +       */
> ```
> 
> This comments does not follow preferred style:
> https://www.postgresql.org/docs/devel/source-format.html
Fixed. Correct wrong comment style.

> ```
> @@ -683,12 +683,12 @@ OutputPluginWrite(struct LogicalDecodingContext *ctx,
> bool last_write)
>   * Update progress tracking (if supported).
>   */
>  void
> -OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx)
> +OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx, bool
> +send_keep_alive)
> ```
> 
> This function is no longer doing just tracking.
> Could you update the code comment above?
Fixed. Update the comment above function OutputPluginUpdateProgress.

> ```
>     if (!is_publishable_relation(relation))
>         return;
> ```
> 
> I'm not sure but it seems that the function exits immediately if relation is a
> sequence, view, temporary table and so on. Is it OK? Does it never happen?
I did some checks to confirm this. After my confirmation, there are several
situations that can cause a timeout. For example, if I insert many date into
table sql_features in a long transaction, subscriber-side will time out.
Although I think users should not modify these tables arbitrarily, it could
happen. To be conservative, I think this use case should be addressed as well.
Fixed. Invoke function SendKeepaliveIfNecessary before return.

> ```
> +       SendKeepaliveIfNecessary(ctx, false);
> ```
> 
> I think a comment is needed above which clarifies sending a keepalive message.
Fixed. Before invoking function SendKeepaliveIfNecessary, add the corresponding
comment.

Attach the new patch. [suggestion by Kuroda-San]
1. Fix the typo.
2. Improve comment style.
3. Fix missing consideration.
4. Add comments to clarifies above functions and function calls.

Regards,
Wang wei

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Design of pg_stat_subscription_workers vs pgstats
Next
From: Amit Kapila
Date:
Subject: Re: pg_stat_get_replication_slot and pg_stat_get_subscription_worker incorrectly marked as proretset