RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Synchronizing slots from primary to standby |
Date | |
Msg-id | OS0PR01MB5716A5430FD4B2AED9754ED194522@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (shveta malik <shveta.malik@gmail.com>) |
Responses |
Re: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
On Friday, February 16, 2024 6:41 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Feb 15, 2024 at 10:48 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Looking at v88_0001, random comments: > > Thanks for the feedback. > > > > > 1 === > > > > Commit message "Be enabling slot synchronization" > > > > Typo? s:Be/By > > Modified. > > > 2 === > > > > + It enables a physical standby to synchronize logical failover slots > > + from the primary server so that logical subscribers are not blocked > > + after failover. > > > > Not sure "not blocked" is the right wording. > > "can be resumed from the new primary" maybe? (was discussed in [1]) > > Modified. > > > 3 === > > > > +#define SlotSyncWorkerAllowed() \ > > + (sync_replication_slots && pmState == PM_HOT_STANDBY && \ > > + SlotSyncWorkerCanRestart()) > > > > Maybe add a comment above the macro explaining the logic? > > Done. > > > 4 === > > > > +#include "replication/walreceiver.h" > > #include "replication/slotsync.h" > > > > should be reverse order? > > Removed walreceiver.h inclusion as it was not needed. > > > 5 === > > > > + if (SlotSyncWorker->syncing) > > { > > - SpinLockRelease(&SlotSyncCtx->mutex); > > + SpinLockRelease(&SlotSyncWorker->mutex); > > ereport(ERROR, > > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > errmsg("cannot synchronize replication > slots concurrently")); > > } > > > > worth to add a test in 040_standby_failover_slots_sync.pl for it? > > It will be very difficult to stabilize this test as we have to make sure that the > concurrent users (SQL function(s) and/or worker(s)) are in that target function at > the same time to hit it. > > > > > 6 === > > > > +static void > > +slotsync_reread_config(bool restart) > > +{ > > > > worth to add test(s) in 040_standby_failover_slots_sync.pl for it? > > Added test. > > Please find v89 patch set. The other changes are: Thanks for the patch. Here are few comments: 1. +static char * +get_dbname_from_conninfo(const char *conninfo) +{ + static char *dbname; + + if (dbname) + return dbname; + else + dbname = walrcv_get_dbname_from_conninfo(conninfo); + + return dbname; +} I think it's not necessary to have a static variable here, because the guc check doesn't seem performance sensitive. Additionaly, it does not work well with slotsync SQL functions, because the dbname will be allocated in temp memory context when reaching here via SQL function, so it's not safe to access this static variable in next function call. 2. +static bool +validate_remote_info(WalReceiverConn *wrconn, int elevel) ... + + return (!remote_in_recovery && primary_slot_valid); The primary_slot_valid could be uninitialized in this function. Best Regards, Hou zj
pgsql-hackers by date: