Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAHut+PvyfdHsKcHa7ZA9Wo7G3d4aJ+1voxRnEs8Yq7-WHes+7g@mail.gmail.com Whole thread Raw |
In response to | RE: Synchronizing slots from primary to standby ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses |
Re: Synchronizing slots from primary to standby
RE: Synchronizing slots from primary to standby |
List | pgsql-hackers |
On Sun, Mar 3, 2024 at 6:51 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Sunday, March 3, 2024 7:47 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > 3. > > + <note> > > + <para> > > + Value <literal>*</literal> is not accepted as it is inappropriate to > > + block logical replication for physical slots that either lack > > + associated standbys or have standbys associated that are not > > enabled > > + for replication slot synchronization. (see > > + <xref > > linkend="logicaldecoding-replication-slots-synchronization"/>). > > + </para> > > + </note> > > > > Why does the document need to provide an excuse/reason for the rule? > > You could just say something like: > > > > SUGGESTION > > The slots must be named explicitly. For example, specifying wildcard values like > > <literal>*</literal> is not permitted. > > As suggested by Amit, I moved this to code comments. Was the total removal of this note deliberate? I only suggested removing the *reason* for the rule, not the entire rule. Otherwise, the user won't know to avoid doing this until they try it and get an error. > > > > 9. NeedToWaitForWal > > > > + /* > > + * Check if the standby slots have caught up to the flushed position. > > + It > > + * is good to wait up to flushed position and then let it send the > > + changes > > + * to logical subscribers one by one which are already covered in > > + flushed > > + * position without needing to wait on every change for standby > > + * confirmation. Note that after receiving the shutdown signal, an > > + ERROR > > + * is reported if any slots are dropped, invalidated, or inactive. This > > + * measure is taken to prevent the walsender from waiting indefinitely. > > + */ > > + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event)) return > > + true; > > > > I felt it was confusing things for this function to also call to the other one -- it > > seems an overlapping/muddling of the purpose of these. > > I think it will be easier to understand if the calling code just calls to one or both > > of these functions as required. > > Same as Amit, I didn't change this. AFAICT my previous review comment was misinterpreted. Please see [1] for more details. ~~~~ Here are some more review comments for v104-00001 ====== Commit message 1. Additionally, The SQL functions pg_logical_slot_get_changes, pg_logical_slot_peek_changes and pg_replication_slot_advance are modified to wait for the replication slots specified in 'standby_slot_names' to catch up before returning. ~ Maybe that should be expressed using similar wording as the docs... SUGGESTION Additionally, The SQL functions ... are modified. Now, when used with failover-enabled logical slots, these functions will block until all physical slots specified in 'standby_slot_names' have confirmed WAL receipt. ====== doc/src/sgml/config.sgml 2. + <function>pg_logical_slot_peek_changes</function></link>, + when used with failover enabled logical slots, will block until all + physical slots specified in <varname>standby_slot_names</varname> have + confirmed WAL receipt. /failover enabled logical slots/failover-enabled logical slots/ ====== doc/src/sgml/func.sgml 3. + The function may be blocked if the specified slot is a failover enabled + slot and <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link> + is configured. </para></entry> /a failover enabled slot//a failover-enabled slot/ ~~~ 4. + slot may return to an earlier position. The function may be blocked if + the specified slot is a failover enabled slot and + <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link> + is configured. /a failover enabled slot//a failover-enabled slot/ ====== src/backend/replication/slot.c 5. +/* + * Wait for physical standbys to confirm receiving the given lsn. + * + * Used by logical decoding SQL functions that acquired failover enabled slot. + * It waits for physical standbys corresponding to the physical slots specified + * in the standby_slot_names GUC. + */ +void +WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) /failover enabled slot/failover-enabled slot/ ~~~ 6. + /* + * Don't need to wait for the standby to catch up if the current acquired + * slot is not a failover enabled slot, or there is no value in + * standby_slot_names. + */ /failover enabled slot/failover-enabled slot/ ====== src/backend/replication/slotfuncs.c 7. + + /* + * Wake up logical walsenders holding failover enabled slots after + * updating the restart_lsn of the physical slot. + */ + PhysicalWakeupLogicalWalSnd(); /failover enabled slots/failover-enabled slots/ ====== src/backend/replication/walsender.c 8. +/* + * Wake up the logical walsender processes with failover enabled slots if the + * currently acquired physical slot is specified in standby_slot_names GUC. + */ +void +PhysicalWakeupLogicalWalSnd(void) /failover enabled slots/failover-enabled slots/ 9. +/* + * Returns true if not all standbys have caught up to the flushed position + * (flushed_lsn) when the current acquired slot is a failover enabled logical + * slot and we are streaming; otherwise, returns false. + * + * If returning true, the function sets the appropriate wait event in + * wait_event; otherwise, wait_event is set to 0. + */ +static bool +NeedToWaitForStandby(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn, + uint32 *wait_event) 9a. /failover enabled logical slot/failover-enabled logical slot/ ~ 9b. Probably that function name should be plural. /NeedToWaitForStandby/NeedToWaitForStandbys/ ~~~ 10. +/* + * Returns true if we need to wait for WALs to be flushed to disk, or if not + * all standbys have caught up to the flushed position (flushed_lsn) when the + * current acquired slot is a failover enabled logical slot and we are + * streaming; otherwise, returns false. + * + * If returning true, the function sets the appropriate wait event in + * wait_event; otherwise, wait_event is set to 0. + */ +static bool +NeedToWaitForWal(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn, + uint32 *wait_event) /failover enabled logical slot/failover-enabled logical slot/ ~~~ 11. WalSndWaitForWal + /* + * Within the loop, we wait for the necessary WALs to be flushed to + * disk first, followed by waiting for standbys to catch up if there + * are enought WALs or upon receiving the shutdown signal. To avoid + * the scenario where standbys need to catch up to a newer WAL + * location in each iteration, we update our idea of the currently + * flushed position only if we are not waiting for standbys to catch + * up. + */ typo /enought/enough/ ---------- [1] https://www.postgresql.org/message-id/CAHut%2BPteoyDki-XdygDgoaZJLmasutzRquQepYx0raNs0RSMvg%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Austalia
pgsql-hackers by date: