Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAD21AoDqEEu=ELFk1+hOR86PpKAahTia=EBPHE7E4sAu2ORQ4A@mail.gmail.com
In response to RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Tue, Mar 5, 2024 at 4:21 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, March 5, 2024 2:35 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Mar 5, 2024 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Mar 5, 2024 at 6:10 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > ======
> > > > src/backend/replication/walsender.c
> > > >
> > > > 5. NeedToWaitForWal
> > > >
> > > > + /*
> > > > + * Check if the standby slots have caught up to the flushed
> > > > + position. It
> > > > + * is good to wait up to the flushed position and then let the
> > > > + WalSender
> > > > + * send the changes to logical subscribers one by one which are
> > > > + already
> > > > + * covered by the flushed position without needing to wait on every
> > > > + change
> > > > + * for standby confirmation.
> > > > + */
> > > > + if (NeedToWaitForStandbys(flushed_lsn, wait_event)) return true;
> > > > +
> > > > + *wait_event = 0;
> > > > + return false;
> > > > +}
> > > > +
> > > >
> > > > 5a.
> > > > The comment (or part of it?) seems misplaced because it is talking
> > > > WalSender sending changes, but that is not happening in this function.
> > > >
> > >
> > > I don't think so. This is invoked only by walsender and a static
> > > function. I don't see any other better place to mention this.
> > >
> > > > Also, isn't what this is saying already described by the other
> > > > comment in the caller? e.g.:
> > > >
> > >
> > > Oh no, here we are explaining the wait order.
> >
> > I think there is a scope of improvement here. The comment inside
> > NeedToWaitForWal() which states that we need to wait here for standbys on
> > flush-position(and not on each change) should be outside of this function. It is
> > too embedded. And the comment which states the order of wait (first flush and
> > then standbys confirmation) should be outside the for-loop in
> > WalSndWaitForWal(), but yes we do need both the comments. Attached a
> > patch (.txt) for comments improvement, please merge if appropriate.
>
> Thanks, I have slightly modified the top-up patch and merged it.
>
> Attach the V106 patch which addressed above and Peter's comments[1].
>

I have one question about PhysicalWakeupLogicalWalSnd():

+/*
+ * Wake up the logical walsender processes with logical failover slots if the
+ * currently acquired physical slot is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)
+{
+        List      *standby_slots;
+
+        Assert(MyReplicationSlot && SlotIsPhysical(MyReplicationSlot));
+
+        standby_slots = GetStandbySlotList();
+
+        foreach_ptr(char, name, standby_slots)
+        {
+                if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0)
+                {
+                        ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
+                        return;
+                }
+        }
+}

IIUC the walsender calls this function every time it updates the
slot's restart_lsn, which could be very frequent. I'm concerned that
doing a linear search over the standby_slot_names list on every call
could be expensive. Is it possible to cache this information locally in
the walsender somehow? For example, the walsender could set a flag in
WalSnd after processing the config file if its slot name is present in
standby_slot_names. That way, it can wake up the logical walsenders, if
eligible, after updating the slot's restart_lsn without checking the
standby_slot_names value each time.
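
A minimal standalone sketch of the caching idea above, not a patch against
the tree. Everything in it is hypothetical and only for illustration: the
WalSndSketch struct stands in for WalSnd, the wakes_logical field is the
proposed flag, and a counter stands in for ConditionVariableBroadcast().
The linear search runs only when the flag is recomputed (for example after
a config reload), and the per-update path just tests the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-walsender state; not the real WalSnd. */
typedef struct WalSndSketch
{
    const char *slot_name;      /* name of the acquired physical slot */
    bool        wakes_logical;  /* cached: slot is listed in standby_slot_names */
} WalSndSketch;

/* Counts simulated wakeups; stands in for ConditionVariableBroadcast(). */
static int broadcast_count = 0;

/*
 * Recompute the cached flag. Assumed to run only when the configuration is
 * (re)processed or the slot is acquired, so the linear search stays off the
 * hot path.
 */
static void
sketch_recompute_flag(WalSndSketch *ws, const char **standby_slots,
                      size_t nslots)
{
    assert(ws != NULL && ws->slot_name != NULL);

    ws->wakes_logical = false;
    for (size_t i = 0; i < nslots; i++)
    {
        if (strcmp(standby_slots[i], ws->slot_name) == 0)
        {
            ws->wakes_logical = true;
            break;
        }
    }
}

/* Hot path: called after each restart_lsn update; only tests the flag. */
static void
sketch_wakeup_logical(const WalSndSketch *ws)
{
    if (ws->wakes_logical)
        broadcast_count++;
}
```

With this shape, the cost per restart_lsn update is a single boolean test,
while the cost of the list walk is paid once per configuration change. The
sketch does not address how the flag would be kept in sync when the slot is
released or when standby_slot_names changes mid-session.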

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


