RE: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From shiy.fnst@fujitsu.com
Subject RE: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id OSZPR01MB631039F6CA720C3BCE465308FDD19@OSZPR01MB6310.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com>)
Responses Re: Time delayed LR (WAS Re: logical replication restrictions)
RE: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
On Mon, Jan 30, 2023 6:05 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
>
> On Saturday, January 28, 2023 1:28 PM I wrote:
> > Attached the updated patch v24.
> Hi,
>
>
> I've conducted the rebase affected by the commit(1e8b61735c)
> by renaming the GUC to logical_replication_mode accordingly,
> because it's utilized in the TAP test of this time-delayed LR feature.
> There is no other change for this version.
>
> Kindly have a look at the attached v25.
>

Thanks for your patch. Here are some comments.

1.
+    /*
+     * The min_apply_delay parameter is ignored until all tablesync workers
+     * have reached READY state. This is because if we allowed the delay
+     * during the catchup phase, then once we reached the limit of tablesync
+     * workers it would impose a delay for each subsequent worker. That would
+     * cause initial table synchronization completion to take a long time.
+     */
+    if (!AllTablesyncsReady())
+        return;

I saw that the new parameter becomes effective after all tables are in ready
state, because the apply worker can't set the state to catchup during the delay.
But can we call process_syncing_tables() in the while-loop of
maybe_apply_delay()? Then the tablesync can finish without delay. If we can't do
so, it might be better to add some comments for it.

2.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");

I think the last parameter should be 500.
Besides, I am not sure it's a stable test to check the log. Is it possible that
there's no such log on a slow machine? I modified the code to sleep 1s at the
beginning of apply_dispatch(), then the new added test failed because the server
log cannot match.

Regards,
Shi yu



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: transition tables and UPDATE
Next
From: Jakub Wartak
Date:
Subject: Re: Syncrep and improving latency due to WAL throttling