Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | d05a8944-a287-485c-85bd-5a694420d870@gmail.com Whole thread Raw |
In response to | RE: Synchronizing slots from primary to standby ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses |
Re: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
Hi, On 11/17/23 2:46 AM, Zhijie Hou (Fujitsu) wrote: > On Tuesday, November 14, 2023 10:27 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: >> On 11/13/23 2:57 PM, Zhijie Hou (Fujitsu) wrote: >>> On Friday, November 10, 2023 4:16 PM Drouvot, Bertrand >> <bertranddrouvot.pg@gmail.com> wrote: >>>> Yeah good point, agree to just error out in all the case then (if we >>>> discard the sync_ reserved wording proposal, which seems to be the >>>> case as probably not worth the extra work). >>> >>> Thanks for the discussion! >>> >>> Here is the V33 patch set which includes the following changes: >> >> Thanks for working on it! >> >>> >>> 1) Drop slots with state 'i' in promotion flow after we shut down WalReceiver. >> >> @@ -3557,10 +3558,15 @@ WaitForWALToBecomeAvailable(XLogRecPtr >> RecPtr, bool randAccess, >> * this only after failure, so when you promote, we still >> * finish replaying as much as we can from archive and >> * pg_wal before failover. >> + * >> + * Drop the slots for which sync is initiated but not yet >> + * completed i.e. they are still waiting for the primary >> + * server to catch up. >> */ >> if (StandbyMode && CheckForStandbyTrigger()) >> { >> XLogShutdownWalRcv(); >> + slotsync_drop_initiated_slots(); >> return XLREAD_FAIL; >> } >> >> I had a closer look and it seems this is not located at the right place. >> >> Indeed, it's added here: >> >> switch (currentSource) >> { >> case XLOG_FROM_ARCHIVE: >> case XLOG_FROM_PG_WAL: >> >> While in our case we are in >> >> case XLOG_FROM_STREAM: >> >> So I think we should move slotsync_drop_initiated_slots() in the >> XLOG_FROM_STREAM case. Maybe before shutting down the sync slot worker? >> (the TODO item number 2 you mentioned up-thread) > > Thanks for the comment. > > I feel the WaitForWALToBecomeAvailable may not be the best place to shutdown > slotsync worker and drop slots. There could be other reasons(other than > promotion) as mentioned in comments in case XLOG_FROM_STREAM to reach the code > there. I thought if the intention is to stop slotsync workers on promotion, > maybe FinishWalRecovery() is a better place to do it as it's indicating the end > of recovery and XLogShutdownWalRcv is also called in it. I can see that slotsync_drop_initiated_slots() has been moved in FinishWalRecovery() in v35. That looks ok. > > And I feel we'd better drop the slots after shutting down the slotsync workers, > because otherwise the slotsync workers could create the dropped slot again in > rare cases. Yeah, agree and I can see that it's done that way in v35. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: