Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAHut+PvyfdHsKcHa7ZA9Wo7G3d4aJ+1voxRnEs8Yq7-WHes+7g@mail.gmail.com
Whole thread Raw
In response to RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses Re: Synchronizing slots from primary to standby
RE: Synchronizing slots from primary to standby
List pgsql-hackers
On Sun, Mar 3, 2024 at 6:51 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Sunday, March 3, 2024 7:47 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >

> > 3.
> > +       <note>
> > +        <para>
> > +         Value <literal>*</literal> is not accepted as it is inappropriate to
> > +         block logical replication for physical slots that either lack
> > +         associated standbys or have standbys associated that are not
> > enabled
> > +         for replication slot synchronization. (see
> > +         <xref
> > linkend="logicaldecoding-replication-slots-synchronization"/>).
> > +        </para>
> > +       </note>
> >
> > Why does the document need to provide an excuse/reason for the rule?
> > You could just say something like:
> >
> > SUGGESTION
> > The slots must be named explicitly. For example, specifying wildcard values like
> > <literal>*</literal> is not permitted.
>
> As suggested by Amit, I moved this to code comments.

Was the total removal of this note deliberate? I only suggested
removing the *reason* for the rule, not the entire rule. Otherwise,
the user won't know to avoid doing this until they try it and get an
error.


> >
> > 9. NeedToWaitForWal
> >
> > + /*
> > + * Check if the standby slots have caught up to the flushed position.
> > + It
> > + * is good to wait up to flushed position and then let it send the
> > + changes
> > + * to logical subscribers one by one which are already covered in
> > + flushed
> > + * position without needing to wait on every change for standby
> > + * confirmation. Note that after receiving the shutdown signal, an
> > + ERROR
> > + * is reported if any slots are dropped, invalidated, or inactive. This
> > + * measure is taken to prevent the walsender from waiting indefinitely.
> > + */
> > + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event)) return
> > + true;
> >
> > I felt it was confusing things for this function to also call to the other one -- it
> > seems an overlapping/muddling of the purpose of these.
> > I think it will be easier to understand if the calling code just calls to one or both
> > of these functions as required.
>
> Same as Amit, I didn't change this.

AFAICT my previous review comment was misinterpreted. Please see [1]
for more details.

~~~~

Here are some more review comments for v104-00001

======
Commit message

1.
Additionally, The SQL functions pg_logical_slot_get_changes,
pg_logical_slot_peek_changes and pg_replication_slot_advance are modified
to wait for the replication slots specified in 'standby_slot_names' to
catch up before returning.

~

Maybe that should be expressed using similar wording as the docs...

SUGGESTION
Additionally, The SQL functions ... are modified. Now, when used with
failover-enabled logical slots, these functions will block until all
physical slots specified in 'standby_slot_names' have confirmed WAL
receipt.

======
doc/src/sgml/config.sgml

2.
+        <function>pg_logical_slot_peek_changes</function></link>,
+        when used with failover enabled logical slots, will block until all
+        physical slots specified in <varname>standby_slot_names</varname> have
+        confirmed WAL receipt.

/failover enabled logical slots/failover-enabled logical slots/

======
doc/src/sgml/func.sgml

3.
+        The function may be blocked if the specified slot is a failover enabled
+        slot and <link
linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
+        is configured.
        </para></entry>

/a failover enabled slot//a failover-enabled slot/

~~~

4.
+        slot may return to an earlier position. The function may be blocked if
+        the specified slot is a failover enabled slot and
+        <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
+        is configured.

/a failover enabled slot//a failover-enabled slot/

======
src/backend/replication/slot.c

5.
+/*
+ * Wait for physical standbys to confirm receiving the given lsn.
+ *
+ * Used by logical decoding SQL functions that acquired failover enabled slot.
+ * It waits for physical standbys corresponding to the physical slots specified
+ * in the standby_slot_names GUC.
+ */
+void
+WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)

/failover enabled slot/failover-enabled slot/

~~~

6.
+ /*
+ * Don't need to wait for the standby to catch up if the current acquired
+ * slot is not a failover enabled slot, or there is no value in
+ * standby_slot_names.
+ */

/failover enabled slot/failover-enabled slot/

======
src/backend/replication/slotfuncs.c

7.
+
+ /*
+ * Wake up logical walsenders holding failover enabled slots after
+ * updating the restart_lsn of the physical slot.
+ */
+ PhysicalWakeupLogicalWalSnd();

/failover enabled slots/failover-enabled slots/

======
src/backend/replication/walsender.c

8.
+/*
+ * Wake up the logical walsender processes with failover enabled slots if the
+ * currently acquired physical slot is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)

/failover enabled slots/failover-enabled slots/

9.
+/*
+ * Returns true if not all standbys have caught up to the flushed position
+ * (flushed_lsn) when the current acquired slot is a failover enabled logical
+ * slot and we are streaming; otherwise, returns false.
+ *
+ * If returning true, the function sets the appropriate wait event in
+ * wait_event; otherwise, wait_event is set to 0.
+ */
+static bool
+NeedToWaitForStandby(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn,
+ uint32 *wait_event)

9a.
/failover enabled logical slot/failover-enabled logical slot/

~

9b.
Probably that function name should be plural.

/NeedToWaitForStandby/NeedToWaitForStandbys/

~~~

10.
+/*
+ * Returns true if we need to wait for WALs to be flushed to disk, or if not
+ * all standbys have caught up to the flushed position (flushed_lsn) when the
+ * current acquired slot is a failover enabled logical slot and we are
+ * streaming; otherwise, returns false.
+ *
+ * If returning true, the function sets the appropriate wait event in
+ * wait_event; otherwise, wait_event is set to 0.
+ */
+static bool
+NeedToWaitForWal(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn,
+ uint32 *wait_event)

/failover enabled logical slot/failover-enabled logical slot/

~~~

11. WalSndWaitForWal

+ /*
+ * Within the loop, we wait for the necessary WALs to be flushed to
+ * disk first, followed by waiting for standbys to catch up if there
+ * are enought WALs or upon receiving the shutdown signal. To avoid
+ * the scenario where standbys need to catch up to a newer WAL
+ * location in each iteration, we update our idea of the currently
+ * flushed position only if we are not waiting for standbys to catch
+ * up.
+ */

typo

/enought/enough/

----------
[1] https://www.postgresql.org/message-id/CAHut%2BPteoyDki-XdygDgoaZJLmasutzRquQepYx0raNs0RSMvg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Austalia



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Support boolcol IS [NOT] iso-8859-1 in partition pruning
Next
From: jian he
Date:
Subject: Re: POC, WIP: OR-clause support for indexes