Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | 1c4691b6-787c-4b02-adf3-d5865b12820f@gmail.com |
In response to | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | RE: Synchronizing slots from primary to standby |
List | pgsql-hackers |
Hi,

On 11/23/23 6:13 AM, Amit Kapila wrote:
> On Tue, Nov 21, 2023 at 4:35 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> On 11/21/23 10:32 AM, shveta malik wrote:
>>> On Tue, Nov 21, 2023 at 2:02 PM shveta malik <shveta.malik@gmail.com> wrote:
>>>>
>>
>>> v37 fails to apply to HEAD due to a recent commit e83aa9f92fdd,
>>> rebased the patches. PFA v37_2 patches.
>>
>> Thanks!
>>
>> Regarding the promotion flow: If the primary is available and reachable I don't
>> think we currently try to ensure that slots are in sync. I think we'd miss the
>> activity since the last sync and the promotion request or am I missing something?
>>
>> If the primary is available and reachable shouldn't we launch a last round of
>> synchronization (skipping all the slots that are not in 'r' state)?
>>
>
> We may miss the last round but there is no guarantee that we can
> ensure to sync of everything if the primary is available. Because
> after our last sync, there could probably be some more activity.

I don't think so, thanks to the fact that we ensure that logical walsenders on
the primary wait for the physical standby. Indeed, that should prevent any
decoding activity on the primary while the promotion is in progress on the
standby (at least as soon as the walreceiver is shut down).

So I think that a promotion flow like:

- walreceiver shutdown
- last round of sync
- sync-worker shutdown

should ensure that slots are in sync (as logical slots on the primary should
not be able to advance as soon as the walreceiver is shut down during the
promotion).

> I think it is the user's responsibility to promote a new primary when
> the old one is not required for some reason.

Do you mean they should ensure something like the following?

1. no more activity on the primary
2. check that the slots are in sync with the primary
3. promote

But then they could also (without the new feature we're building):

1. create and advance slots manually (pg_replication_slot_advance) on the
   standby to sync them up at regular intervals

and then, before promotion:

2. ensure no more activity on the primary
3. do a last round of advancing the slots manually
4. promote

I think that ensuring the slots are in sync during promotion (should the
primary be available) would provide added value as compared to the above
scenarios.

> It is not only slots that
> can be out of sync but even we can miss fetching some of the data. I
> think this is quite similar to what we do for WAL where on finding the
> promotion signal, we shut down Walreceiver and just replay any WAL
> that was already received by walreceiver.

> Also, the promotion
> shouldn't create any problem w.r.t subscribers connecting to the new
> primary because the slot's position is slightly behind what could be
> requested by subscribers which means the corresponding data will be
> available on the new primary.
>

Right.

> Do you have something in mind that can create any problem if we don't
> attempt additional fetching round after the promotion signal is
> received?

It's not a "real" problem per se, but in the case of a non-synced slot, I can
see 2 cases:

- publisher/subscriber case: I don't see any problem here, since after an
  "alter subscription XXX connection '<new_primary>'" logical replication
  should start from the right place thanks to the replication origin
  associated with the subscription.

- non publisher/subscriber case (say pg_recvlogical, which does not make use
  of a replication origin), then:

  a) data between the last sync and the promotion would be decoded again
     unless b) or c)
  b) user manually advances the slot on the standby after promotion
  c) user restarts the decoding with an appropriate --startpos option

It's for this non publisher/subscriber case that I think it would be
beneficial to try to ensure that the slots are in sync during the promotion.
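To make the ordering argument concrete, here is a toy Python model of the promotion flow described above. It is only a sketch and not PostgreSQL code: LSNs are plain integers, `Slot`, `sync_slots` and `promote` are hypothetical names, and the state 'i' stands in for any state other than 'r'. The one assumption it encodes is the one argued above, that primary slots cannot advance once the walreceiver is shut down. The returned gap per slot is the range that a consumer without a replication origin (the pg_recvlogical case) would decode again.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    lsn: int           # slot position, modelled as a plain integer
    state: str = "r"   # 'r' = eligible for sync; any other value is skipped


def sync_slots(primary, standby):
    """One sync round: copy positions for standby slots in 'r' state."""
    for name, slot in standby.items():
        if slot.state != "r" or name not in primary:
            continue
        # Never move a slot backwards.
        slot.lsn = max(slot.lsn, primary[name].lsn)


def promote(primary, standby, primary_reachable, final_sync):
    """Return, per slot, how far the standby slot trails the primary."""
    # Step 1: walreceiver shutdown. Assumption from the discussion: logical
    # walsenders on the primary now wait for the standby, so the primary
    # positions are frozen (this model simply never changes them).
    # Step 2: last round of sync, only possible if the primary is reachable.
    if primary_reachable and final_sync:
        sync_slots(primary, standby)
    # Step 3: sync-worker shutdown (nothing to model here).
    return {
        name: primary[name].lsn - slot.lsn
        for name, slot in standby.items()
        if name in primary
    }


if __name__ == "__main__":
    primary = {"a": Slot("a", 100), "b": Slot("b", 80)}
    standby = {"a": Slot("a", 90), "b": Slot("b", 50, state="i")}
    print(promote(primary, standby, primary_reachable=True, final_sync=True))
    # prints {'a': 0, 'b': 30}
```

Without the final round (`final_sync=False`), slot "a" would keep a gap of 10, which is the data that would be decoded again unless the slot is advanced manually or decoding is restarted with an appropriate --startpos.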
Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com