Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | afe4ab6c-dde3-48ea-acd8-6f6052c7b8fd@gmail.com |
In response to | Re: Synchronizing slots from primary to standby (shveta malik <shveta.malik@gmail.com>) |
Responses | Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
Hi,

On 10/27/23 11:56 AM, shveta malik wrote:
> On Wed, Oct 25, 2023 at 3:15 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> Hi,
>>
>> On 10/25/23 5:00 AM, shveta malik wrote:
>>> On Tue, Oct 24, 2023 at 11:54 AM Drouvot, Bertrand
>>> <bertranddrouvot.pg@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 10/23/23 2:56 PM, shveta malik wrote:
>>>>> On Mon, Oct 23, 2023 at 5:52 PM Drouvot, Bertrand
>>>>> <bertranddrouvot.pg@gmail.com> wrote:
>>>>
>>>>>> We are waiting for DEFAULT_NAPTIME_PER_CYCLE (3 minutes) before checking if there
>>>>>> is new synced slot(s) to be created on the standby. Do we want to keep this behavior
>>>>>> for V1?
>>>>>>
>>>>>
>>>>> I think for the slotsync workers case, we should reduce the naptime in
>>>>> the launcher to say 30sec and retain the default one of 3mins for
>>>>> subscription apply workers. Thoughts?
>>>>>
>>>>
>>>> Another option could be to keep DEFAULT_NAPTIME_PER_CYCLE and create a new
>>>> API on the standby that would refresh the list of sync slot at wish, thoughts?
>>>>
>>>
>>> Do you mean API to refresh list of DBIDs rather than sync-slots?
>>> As per current design, launcher gets DBID lists for all the failover
>>> slots from the primary at intervals of DEFAULT_NAPTIME_PER_CYCLE.
>>
>> I mean an API to get a newly created slot on the primary being created/synced on
>> the standby at wish.
>>
>> Also let's imagine this scenario:
>>
>> - create logical_slot1 on the primary (and don't start using it)
>>
>> Then on the standby we'll get things like:
>>
>> 2023-10-25 08:33:36.897 UTC [740298] LOG: waiting for remote slot "logical_slot1" LSN (0/C00316A0) and catalog xmin (752) to pass local slot LSN (0/C0049530) and and catalog xmin (754)
>>
>> That's expected and due to the fact that ReplicationSlotReserveWal() does set the slot
>> restart_lsn to a value < at the corresponding restart_lsn slot on the primary.
>>
>> - create logical_slot2 on the primary (and start using it)
>>
>> Then logical_slot2 won't be created/synced on the standby until there is activity on logical_slot1 on the primary
>> that would produce things like:
>> 2023-10-25 08:41:35.508 UTC [740298] LOG: wait over for remote slot "logical_slot1" as its LSN (0/C005FFD8) and catalog xmin (756) has now passed local slot LSN (0/C0049530) and catalog xmin (754)
>
>
> Slight correction to above. As soon as we start activity on
> logical_slot2, it will impact all the slots on primary, as the WALs
> are consumed by all the slots. So even if there is activity on
> logical_slot2, logical_slot1 creation on standby will be unblocked and
> it will then move to logical_slot2 creation. eg:
>
> --on standby:
> 2023-10-27 15:15:46.069 IST [696884] LOG: waiting for remote slot
> "mysubnew1_1" LSN (0/3C97970) and catalog xmin (756) to pass local
> slot LSN (0/3C979A8) and and catalog xmin (756)
>
> on primary:
> newdb1=# select now();
>                now
> ----------------------------------
>  2023-10-27 15:15:51.504835+05:30
> (1 row)
>
> --activity on mysubnew1_3
> newdb1=# insert into tab1_3 values(1);
> INSERT 0 1
> newdb1=# select now();
>                now
> ----------------------------------
>  2023-10-27 15:15:54.651406+05:30
>
>
> --on standby, mysubnew1_1 is unblocked.
> 2023-10-27 15:15:56.223 IST [696884] LOG: wait over for remote slot
> "mysubnew1_1" as its LSN (0/3C97A18) and catalog xmin (757) has now
> passed local slot LSN (0/3C979A8) and catalog xmin (756)
>
> My Setup:
> mysubnew1_1 -->mypubnew1_1 -->tab1_1
> mysubnew1_3 -->mypubnew1_3-->tab1_3
>

Agree with your test case, but in my case I was not using pub/sub.

I was not clear, so when I said:

>> - create logical_slot1 on the primary (and don't start using it)

I meant don't start decoding from it (like using pg_recvlogical() or
pg_logical_slot_get_changes()).

By using pub/sub the "don't start using it" is not satisfied.
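
For reference, the wait condition reported in the standby log lines quoted above can be sketched as follows. This is only an illustration, not the patch's code: the helper names parse_lsn and remote_has_caught_up are hypothetical, the use of >= rather than > is an assumption, and catalog xmin is compared as a plain integer, ignoring transaction ID wraparound.

```python
# Sketch only: models the wait condition seen in the standby log lines.
# parse_lsn and remote_has_caught_up are hypothetical names, not PostgreSQL APIs.


def parse_lsn(text: str) -> int:
    """Convert an LSN printed as 'X/Y' (two hex halves) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def remote_has_caught_up(remote_lsn: str, remote_xmin: int,
                         local_lsn: str, local_xmin: int) -> bool:
    """Assumed rule: the remote LSN and catalog xmin must both reach the local values.

    xmin is compared as a plain integer; real transaction IDs wrap around,
    which this sketch deliberately ignores.
    """
    return (parse_lsn(remote_lsn) >= parse_lsn(local_lsn)
            and remote_xmin >= local_xmin)


# Values copied from the logical_slot1 log lines quoted in this thread.
print(remote_has_caught_up("0/C00316A0", 752, "0/C0049530", 754))  # prints False (still waiting)
print(remote_has_caught_up("0/C005FFD8", 756, "0/C0049530", 754))  # prints True (wait over)
```

With the mysubnew1_1 values from the same logs, the first line (0/3C97970, 756 against 0/3C979A8, 756) is still waiting because only the xmin has reached the local value, and the second (0/3C97A18, 757) satisfies both parts.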

My test case is:

"
SELECT * FROM pg_create_logical_replication_slot('logical_slot1', 'test_decoding', false, true, true);
SELECT * FROM pg_create_logical_replication_slot('logical_slot2', 'test_decoding', false, true, true);
pg_recvlogical -d postgres -S logical_slot2 --no-loop --start -f -
"

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com