Re: speed up a logical replica setup - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: speed up a logical replica setup
Msg-id CANhcyEV6q1Vhd37i1axUeScLi0UAGVxta1LDa0BV0Eh--TcPMg@mail.gmail.com
In response to Re: speed up a logical replica setup  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: speed up a logical replica setup
List pgsql-hackers
Hi,
>
> I was trying to test this utility when 'sync_replication_slots' is on
> and it gets in an ERROR loop [1] and never finishes. Please find the
> postgresql.auto used on the standby attached. I think if the standby
> has enabled sync_slots, you need to pass dbname in
> GenerateRecoveryConfig(). I couldn't test it further but I wonder if
> there are already synced slots on the standby (either due to
> 'sync_replication_slots' or users have used
> pg_sync_replication_slots() before invoking pg_createsubscriber),
> those would be retained as it is on new subscriber and lead to
> unnecessary WAL retention and dead rows.
>
> [1]
> 2024-04-30 11:50:43.239 IST [12536] LOG:  slot sync worker started
> 2024-04-30 11:50:43.247 IST [12536] ERROR:  slot synchronization
> requires dbname to be specified in primary_conninfo

Hi,

I tested the scenario posted by Amit in [1], in which retaining synced
slots leads to unnecessary WAL retention and an ERROR. This was raised as
the second open point in [2].
The steps to reproduce the issue:
(1) Set up physical replication with the slot sync feature turned on,
either by setting sync_replication_slots = 'true' or by calling
pg_sync_replication_slots() on the standby node.
For the physical replication setup, run pg_basebackup with the -R and
-d options.
(2) Create a logical replication slot on the primary node with the
failover option set to true. A corresponding slot is created on the
standby by the slot sync feature.
(3) Run pg_createsubscriber on standby node.
(4) On checking the replication slots on the standby node, I noticed
that the logical slots created in step 2 are retained.
I have attached the script to reproduce the issue.
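
For readers without the attachment, the four steps can be sketched
roughly as below. This is not the attached script, only an illustration
under assumptions: the ports, data directory, slot names (standby_slot,
failover_slot) and the run/reproduce helpers are all hypothetical, the
primary is assumed to run with wal_level = logical, and the slot sync
worker may need some time (and some activity on the primary) before the
synced slot shows up on the standby. By default it only prints the
commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Reproduction sketch. All ports, paths, and slot names are assumptions.
set -eu

PRIMARY_PORT="${PRIMARY_PORT:-5432}"
STANDBY_PORT="${STANDBY_PORT:-5433}"
STANDBY_DATA="${STANDBY_DATA:-/tmp/standby_data}"
DRY_RUN="${DRY_RUN:-1}"    # 1 prints the commands; 0 executes them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reproduce() {
    # (1) Base backup with -R and -d so primary_conninfo carries a dbname;
    #     -C -S creates a physical slot and records primary_slot_name.
    run pg_basebackup -D "$STANDBY_DATA" -R -C -S standby_slot \
        -d "host=localhost port=$PRIMARY_PORT dbname=postgres"
    # Enable slot synchronization on the standby, then start it.
    run sh -c "printf '%s\n' 'port = $STANDBY_PORT' 'hot_standby_feedback = on' 'sync_replication_slots = on' >> '$STANDBY_DATA/postgresql.conf'"
    run pg_ctl -D "$STANDBY_DATA" -w start

    # (2) Create a failover-enabled logical slot on the primary.
    run psql -p "$PRIMARY_PORT" -d postgres -c \
        "SELECT pg_create_logical_replication_slot('failover_slot', 'pgoutput', failover => true)"

    # (3) pg_createsubscriber expects the standby to be stopped.
    run pg_ctl -D "$STANDBY_DATA" -w stop
    run pg_createsubscriber -D "$STANDBY_DATA" -d postgres \
        -P "host=localhost port=$PRIMARY_PORT dbname=postgres"

    # (4) Start the new subscriber and list its slots; a leftover row
    #     with synced = true is the retained slot described above.
    run pg_ctl -D "$STANDBY_DATA" -w start
    run psql -p "$STANDBY_PORT" -d postgres -c \
        "SELECT slot_name, slot_type, failover, synced FROM pg_replication_slots"
}

reproduce
```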

Kuroda-san and I worked to resolve the open points. Here are patches to
solve the second and third points in [2].
The patches proposed by Euler are also attached just in case, but they
were not modified.

v2-0001: not changed
v2-0002: not changed
v2-0003: ensures the slot sync is disabled during the conversion. This
resolves the second point.
v2-0004: drops synced slots which may be retained after running
pg_createsubscriber. This resolves the second point.
v2-0005: removes misleading output messages in dry-run. This resolves
the third point.

[1]: https://www.postgresql.org/message-id/CAA4eK1KdCb%2B5sjYu6qCMXXdCX1y_ihr8kFzMozq0%3DP%3DauYxgog%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1J22UEfrqx222h5j9DQ7nxGrTbAa_BC%2B%3DmQXdXs-RCsew%40mail.gmail.com

Thanks and Regards,
Shlok Kyal
