RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Synchronizing slots from primary to standby
Msg-id TYAPR01MB58660162840428B320AE4087F525A@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby
List pgsql-hackers
Dear Drouvot,

Hi, I'm also interested in this feature. The following are my high-level comments.
I have not commented on details because the patch is not at that stage yet.
I'm also sorry that I could not follow the whole discussion.

1. I think we should not reuse the logical replication launcher for another purpose.
   A background worker should have only one task. I would like to hear other people's opinions on this...
2. I want to confirm why a new replication command is added. IIUC the
   launcher connects to the primary by using the primary_conninfo connection string, but
   that establishes a physical replication connection, so no SQL can be executed.
   Is that right? Another approach, which would avoid the new command, is to specify the
   target database via a GUC, although that is not elegant. What do you think?
3. You chose the per-db worker approach; however, it is difficult to extend that
   approach to support physical slots. This may be problematic. Was there any
   reason for the choice? I suspect that ReplicationSlotCreate() or the advance functions
   cannot be used from other databases and that this may be the reason, but I'm not sure.
   If these operations can be done without connecting to a specific database, I think
   the architecture can be changed.
4. Currently the launcher establishes the connection every time. Isn't it better
   to reuse the same one instead?

The following comments assume this configuration, which is probably the most straightforward one:

primary->standby
   |->subscriber

5. After constructing the system, I dropped the subscription on the subscriber.
   In this case the logical slot on the primary was removed, but the removal was not
   replicated to the standby server. Is this case supposed to be supported?

```
$ psql -U postgres -p $port_sub -c "DROP SUBSCRIPTION sub"
NOTICE:  dropped replication slot "sub" on publisher
DROP SUBSCRIPTION

$ psql -U postgres -p $port_primary -c "SELECT * FROM pg_replication_slots"
 slot_name |  plugin  | slot_type | datoid | database |...
-----------+----------+-----------+--------+----------+...
(0 rows)

$ psql -U postgres -p $port_standby -c "SELECT * FROM pg_replication_slots"
 slot_name |  plugin  | slot_type | datoid | database |...
-----------+----------+-----------+--------+----------+...
 sub       | pgoutput | logical   |      5 | postgres |...
(1 row)

```
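
For reference, a mismatch like the one above could be detected with a small script. This is only a rough sketch: stale_slots is a hypothetical helper name of mine, not something in the patch. It compares two whitespace-separated lists of slot names and prints the names that exist only in the second list.

```shell
#!/bin/sh
# stale_slots PRIMARY_SLOTS STANDBY_SLOTS
# Hypothetical helper (not part of the patch): prints every slot name that
# is present in STANDBY_SLOTS but missing from PRIMARY_SLOTS, one per line.
# Both arguments are whitespace-separated lists of slot names.
stale_slots() {
    primary_slots=$1
    standby_slots=$2
    for slot in $standby_slots; do
        found=no
        for p in $primary_slots; do
            if [ "$slot" = "$p" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "$slot"
        fi
    done
}

# State shown above: no slots on the primary, "sub" still on the standby.
stale_slots "" "sub"    # prints: sub
```

On a real system the two lists would presumably come from running `psql -At -c "SELECT slot_name FROM pg_replication_slots WHERE slot_type = 'logical'"` against each node.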

6. The current approach may delay the start point of synchronization.

Assume that the physical replication system is created first, and then the
subscriber connects to the publisher node. In this case the launcher connects to
the primary earlier than the apply worker does and reads the slots. At that time there are
no slots on the primary, so the launcher disconnects and waits for a time period (up to 3 min).
Even if the apply worker then creates the slot on the publisher, the launcher on the standby
cannot notice it until the wait ends. The synchronization may start as much as 3 min later.
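
To make the timing concrete, here is a toy calculation. It is only a sketch under my own assumptions: the launcher is assumed to re-check the primary at a fixed interval (I only know the nap time is "up to 3 min"), and sync_delay is a hypothetical name. All times are in seconds.

```shell
#!/bin/sh
# sync_delay FIRST_CHECK SLOT_CREATED NAP
# Hypothetical helper (a simplified model, not the patch's logic): the
# launcher checks the primary at FIRST_CHECK and then every NAP seconds.
# Prints how many seconds pass between the slot being created and the
# first check that can see it.
sync_delay() {
    first_check=$1
    slot_created=$2
    nap=$3
    if [ "$slot_created" -le "$first_check" ]; then
        # Slot already exists when the launcher first looks.
        echo $(( first_check - slot_created ))
        return 0
    fi
    elapsed=$(( slot_created - first_check ))
    remainder=$(( elapsed % nap ))
    if [ "$remainder" -eq 0 ]; then
        echo 0
    else
        echo $(( nap - remainder ))
    fi
}

# Launcher checks at t=0, the apply worker creates the slot at t=1,
# and the nap time is 180 seconds.
sync_delay 0 1 180    # prints: 179
```

So in this model, a slot created just after a check waits almost the whole nap time before it is noticed.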

I'm not sure how to fix this, or whether it is acceptable. Thoughts?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

