Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAD21AoDqEEu=ELFk1+hOR86PpKAahTia=EBPHE7E4sAu2ORQ4A@mail.gmail.com |
In response to | RE: Synchronizing slots from primary to standby ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses | Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
On Tue, Mar 5, 2024 at 4:21 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, March 5, 2024 2:35 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Mar 5, 2024 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Mar 5, 2024 at 6:10 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > ======
> > > > src/backend/replication/walsender.c
> > > >
> > > > 5. NeedToWaitForWal
> > > >
> > > > + /*
> > > > + * Check if the standby slots have caught up to the flushed position. It
> > > > + * is good to wait up to the flushed position and then let the WalSender
> > > > + * send the changes to logical subscribers one by one which are already
> > > > + * covered by the flushed position without needing to wait on every change
> > > > + * for standby confirmation.
> > > > + */
> > > > + if (NeedToWaitForStandbys(flushed_lsn, wait_event))
> > > > +     return true;
> > > > +
> > > > + *wait_event = 0;
> > > > + return false;
> > > > +}
> > > > +
> > > >
> > > > 5a.
> > > > The comment (or part of it?) seems misplaced because it is talking
> > > > WalSender sending changes, but that is not happening in this function.
> > > >
> > >
> > > I don't think so. This is invoked only by walsender and a static
> > > function. I don't see any other better place to mention this.
> > >
> > > > Also, isn't what this is saying already described by the other
> > > > comment in the caller? e.g.:
> > > >
> > >
> > > Oh no, here we are explaining the wait order.
> >
> > I think there is a scope of improvement here. The comment inside
> > NeedToWaitForWal() which states that we need to wait here for standbys on
> > flush-position(and not on each change) should be outside of this function. It is
> > too embedded. And the comment which states the order of wait (first flush and
> > then standbys confirmation) should be outside the for-loop in
> > WalSndWaitForWal(), but yes we do need both the comments. Attached a
> > patch (.txt) for comments improvement, please merge if appropriate.
>
> Thanks, I have slightly modified the top-up patch and merged it.
>
> Attach the V106 patch which addressed above and Peter's comments[1].
>

I have one question about PhysicalWakeupLogicalWalSnd():

+/*
+ * Wake up the logical walsender processes with logical failover slots if the
+ * currently acquired physical slot is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)
+{
+    List *standby_slots;
+
+    Assert(MyReplicationSlot && SlotIsPhysical(MyReplicationSlot));
+
+    standby_slots = GetStandbySlotList();
+
+    foreach_ptr(char, name, standby_slots)
+    {
+        if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0)
+        {
+            ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
+            return;
+        }
+    }
+}

IIUC, the walsender calls this function every time it updates the
slot's restart_lsn, which could be very frequent. I'm concerned that
it could be expensive to do a linear search on the standby_slot_names
list every time. Is it possible to cache this information locally in
the walsender somehow? For example, the walsender could set a flag in
WalSnd after processing the config file if its slot name is present
in standby_slot_names. That way, it can wake up logical walsenders, if
eligible, after updating the slot's restart_lsn without checking the
standby_slot_names value.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
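
The caching idea raised in the message above could look roughly like the following standalone sketch. It is not PostgreSQL code and not taken from any patch version in this thread: DemoWalSnd, demo_list_contains(), demo_on_config_reload() and demo_after_restart_lsn_update() are hypothetical names, standby_slot_names is modeled as a plain comma-separated string without whitespace, and a counter stands in for ConditionVariableBroadcast(). The point is only that the linear search runs once per configuration reload, while the frequent restart_lsn path tests a cached boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-walsender state; NOT PostgreSQL's WalSnd. */
typedef struct DemoWalSnd
{
    const char *slot_name;          /* name of the acquired physical slot */
    bool        in_standby_slots;   /* cached: is slot_name in the list? */
    int         wakeups_sent;       /* counts simulated broadcasts */
} DemoWalSnd;

/*
 * Return true if 'name' is an element of the comma-separated 'list'.
 * Assumption: elements contain no surrounding whitespace or quoting.
 */
static bool
demo_list_contains(const char *list, const char *name)
{
    size_t      namelen = strlen(name);
    const char *p = list;

    while (*p != '\0')
    {
        const char *comma = strchr(p, ',');
        size_t      len = comma ? (size_t) (comma - p) : strlen(p);

        if (len == namelen && strncmp(p, name, len) == 0)
            return true;
        if (comma == NULL)
            break;
        p = comma + 1;
    }
    return false;
}

/* Run once per (simulated) config reload: the linear search happens here. */
static void
demo_on_config_reload(DemoWalSnd *ws, const char *standby_slot_names)
{
    assert(ws != NULL && ws->slot_name != NULL);
    ws->in_standby_slots = demo_list_contains(standby_slot_names,
                                              ws->slot_name);
}

/* Run after every restart_lsn update: only the cached flag is checked. */
static void
demo_after_restart_lsn_update(DemoWalSnd *ws)
{
    if (ws->in_standby_slots)
        ws->wakeups_sent++;     /* stands in for the condition variable broadcast */
}
```

In this sketch a walsender would call demo_on_config_reload() whenever the setting changes and demo_after_restart_lsn_update() on the hot path. The trade-off is that the flag has to be refreshed on every reload, and presumably also whenever the walsender acquires a different slot, or it goes stale.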