Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAA4eK1KC9tb0AUKKnkqyuXXc-B89KQDk5NEWxhm15-dPG2pt3g@mail.gmail.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
RE: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
On Thu, Feb 29, 2024 at 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Few additional comments on the latest patch: > ================================= > 1. > static XLogRecPtr > WalSndWaitForWal(XLogRecPtr loc) > { > ... > + if (!XLogRecPtrIsInvalid(RecentFlushPtr) && loc <= RecentFlushPtr && > + (!replication_active || StandbyConfirmedFlush(loc, WARNING))) > + { > + /* > + * Fast path to avoid acquiring the spinlock in case we already know > + * we have enough WAL available and all the standby servers have > + * confirmed receipt of WAL up to RecentFlushPtr. This is particularly > + * interesting if we're far behind. > + */ > return RecentFlushPtr; > + } > ... > ... > + * Wait for WALs to be flushed to disk only if the postmaster has not > + * asked us to stop. > + */ > + if (loc > RecentFlushPtr && !got_STOPPING) > + wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL; > + > + /* > + * Check if the standby slots have caught up to the flushed position. > + * It is good to wait up to RecentFlushPtr and then let it send the > + * changes to logical subscribers one by one which are already covered > + * in RecentFlushPtr without needing to wait on every change for > + * standby confirmation. Note that after receiving the shutdown signal, > + * an ERROR is reported if any slots are dropped, invalidated, or > + * inactive. This measure is taken to prevent the walsender from > + * waiting indefinitely. > + */ > + else if (replication_active && > + !StandbyConfirmedFlush(RecentFlushPtr, WARNING)) > + { > + wait_event = WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION; > + wait_for_standby = true; > + } > + else > + { > + /* Already caught up and doesn't need to wait for standby_slots. */ > break; > + } > ... > } > > Can we try to move these checks into a separate function that returns > a boolean and has an out parameter as wait_event? > > 2. How about naming StandbyConfirmedFlush() as StandbySlotsAreCaughtup? > Some more comments: ================== 1. + foreach_ptr(char, name, elemlist) + { + ReplicationSlot *slot; + + slot = SearchNamedReplicationSlot(name, true); + + if (!slot) + { + GUC_check_errdetail("replication slot \"%s\" does not exist", + name); + ok = false; + break; + } + + if (!SlotIsPhysical(slot)) + { + GUC_check_errdetail("\"%s\" is not a physical replication slot", + name); + ok = false; + break; + } + } Won't the second check need protection via ReplicationSlotControlLock? 2. In WaitForStandbyConfirmation(), we are anyway calling StandbyConfirmedFlush() before the actual sleep in the loop, so does calling it at the beginning of the function will serve any purpose? If so, it is better to add some comments explaining the same. 3. Also do we need to perform the validation checks done in StandbyConfirmedFlush() repeatedly when it is invoked in a loop? We can probably separate those checks and perform them just once. 4. + * + * XXX: If needed, this can be attempted in future. Remove this part of the comment. -- With Regards, Amit Kapila.
pgsql-hackers by date: