RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Synchronizing slots from primary to standby |
Date | |
Msg-id | OS0PR01MB57166472DB3D01F7407259179496A@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | RE: Synchronizing slots from primary to standby ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses |
RE: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
On Wednesday, December 20, 2023 4:03 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote: Hi, > > Dear Amit, Shveta, > > > > walsender.c > > > > > > 01. WalSndWaitForStandbyConfirmation > > > > > > ``` > > > + sleeptime = > WalSndComputeSleeptime(GetCurrentTimestamp()); > > > ``` > > > > > > It works well, but I'm not sure whether we should use > > WalSndComputeSleeptime() > > > because the function won't be called by walsender. > > > > > > > I don't think it is correct to use this function because it is > > walsender specific, for example, it uses 'last_reply_timestamp' which > > won't be even initialized in the backend environment. We need to > > probably use a different logic for sleep here or need to use a > > hard-coded value. > > Oh, you are right. I haven't look until the func. > > > I think we should change the name of functions like > > WalSndWaitForStandbyConfirmation() as they are no longer used by > > walsender. IIRC, earlier, we had a common logic to wait from both > > walsender and SQL APIs which led to this naming but that is no longer > > true with the latest patch. > > How about "WaitForStandbyConfirmation", which is simpler? There are some > functions like "WaitForParallelWorkersToFinish", "WaitForProcSignalBarrier" > and so on. Thanks for the comments. I think WaitForStandbyConfirmation is OK. And I removed the WalSnd prefix for these functions and move them to slot.c where the standby_slot_names is declared. > > > > 02.WalSndWaitForStandbyConfirmation > > > > > > ``` > > > + ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, > > sleeptime, > > > + > > WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION) > > > ``` > > > > > > Hmm, is it OK to use the same event as WalSndWaitForWal()? IIUC it > > > should be > > avoided. > > > > > > > Agreed. So, how about using > > WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION > > so that we can use it both from the backend and walsender? > > Seems right. Note again that a description of .txt file must be also fixed. Changed. > > Anyway, further comments on v50-0001. > > ~~~~~ > protocol.sgml > > 01. create_replication_slot > > ``` > + <varlistentry> > + <term><literal>FAILOVER { 'true' | 'false' }</literal></term> > + <listitem> > + <para> > + If true, the slot is enabled to be synced to the physical > + standbys so that logical replication can be resumed after failover. > + </para> > + </listitem> > + </varlistentry> > ``` > > IIUC, the true/false is optional. libpqwalreceiver does not add the boolean. > Also you can follow the notation of `TWO_PHASE`. Changed. > > 02. alter_replication_slot > > ``` > + <variablelist> > + <varlistentry> > + <term><literal>FAILOVER { 'true' | 'false' }</literal></term> > + <listitem> > + <para> > + If true, the slot is enabled to be synced to the physical > + standbys so that logical replication can be resumed after failover. > + </para> > + </listitem> > + </varlistentry> > + </variablelist> > ``` > > Apart from above, this boolean is mandatory, right? > But you can follow other notation. > Right, changed it to optional to be consistent with others. > > ~~~~~~~ > slot.c > > 03. validate_standby_slots > > ``` > + /* Need a modifiable copy of string. */ > ... > + /* Verify syntax and parse string into a list of identifiers. */ > ``` > > Unnecessary comma? You mean comma or period ? I think the current style is OK. > > 04. validate_standby_slots > > ``` > + if (!ok || !ReplicationSlotCtl) > + { > + pfree(rawname); > + list_free(elemlist); > + return ok; > + } > ``` > > It may be more efficient to exit earlier when ReplicationSlotCtl is NULL. I think even if ReplicationSlotCtl is NULL, we still need to check the syntax of the slot names. > > ~~~~~~~ > walsender.c > > 05. PhysicalWakeupLogicalWalSnd > > ``` > +/* > + * Wake up the logical walsender processes with failover-enabled slots > +if the > + * physical slot of the current walsender is specified in > +standby_slot_names > + * GUC. > + */ > +void > +PhysicalWakeupLogicalWalSnd(void) > ``` > > The function can be called from backend processes, but you said "the current > walsender" > in the comment. Changed the words. > > 06. WalSndRereadConfigAndReInitSlotList > > ``` > + char *pre_standby_slot_names; > + > + ProcessConfigFile(PGC_SIGHUP); > + > + /* > + * If we are running on a standby, there is no need to reload > + * standby_slot_names since we do not support syncing slots to > cascading > + * standbys. > + */ > + if (RecoveryInProgress()) > + return; > + > + pre_standby_slot_names = pstrdup(standby_slot_names); > ``` > > I felt that we must preserve pre_standby_slot_names before calling > ProcessConfigFile(). > Good catch. Fixed. > > 07. WalSndFilterStandbySlots > > I felt the prefix "WalSnd" may not be needed because both backend processes > and walsender will call the function. Right, renamed. Attach the V51 patch set which addressed Kuroda-san's comments. I also tried to improve the test in 0003 to make it stable. Best Regards, Hou zj
Attachment
pgsql-hackers by date: