RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Synchronizing slots from primary to standby
Date
Msg-id OS0PR01MB5716A5430FD4B2AED9754ED194522@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Friday, February 16, 2024 6:41 PM shveta malik <shveta.malik@gmail.com> wrote:
> 
> On Thu, Feb 15, 2024 at 10:48 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Looking at v88_0001, random comments:
> 
> Thanks for the feedback.
> 
> >
> > 1 ===
> >
> > Commit message "Be enabling slot synchronization"
> >
> > Typo? s:Be/By
> 
> Modified.
> 
> > 2 ===
> >
> > +        It enables a physical standby to synchronize logical failover slots
> > +        from the primary server so that logical subscribers are not blocked
> > +        after failover.
> >
> > Not sure "not blocked" is the right wording.
> > "can be resumed from the new primary" maybe? (was discussed in [1])
> 
> Modified.
> 
> > 3 ===
> >
> > +#define SlotSyncWorkerAllowed()        \
> > +       (sync_replication_slots && pmState == PM_HOT_STANDBY && \
> > +        SlotSyncWorkerCanRestart())
> >
> > Maybe add a comment above the macro explaining the logic?
> 
> Done.
> 
> > 4 ===
> >
> > +#include "replication/walreceiver.h"
> >  #include "replication/slotsync.h"
> >
> > should be reverse order?
> 
> Removed walreceiver.h inclusion as it was not needed.
> 
> > 5 ===
> >
> > +       if (SlotSyncWorker->syncing)
> >         {
> > -               SpinLockRelease(&SlotSyncCtx->mutex);
> > +               SpinLockRelease(&SlotSyncWorker->mutex);
> >                 ereport(ERROR,
> >
> errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> >                                 errmsg("cannot synchronize replication
> slots concurrently"));
> >         }
> >
> > worth to add a test in 040_standby_failover_slots_sync.pl for it?
> 
> It will be very difficult to stabilize this test as we have to make sure that the
> concurrent users (SQL function(s) and/or worker(s)) are in that target function at
> the same time to hit it.
> 
> >
> > 6 ===
> >
> > +static void
> > +slotsync_reread_config(bool restart)
> > +{
> >
> > worth to add test(s) in 040_standby_failover_slots_sync.pl for it?
> 
> Added test.
> 
> Please find v89  patch set. The other changes are:

Thanks for the patch. Here are few comments:

1.

+static char *
+get_dbname_from_conninfo(const char *conninfo)
+{
+    static char *dbname;
+
+    if (dbname)
+        return dbname;
+    else
+        dbname = walrcv_get_dbname_from_conninfo(conninfo);
+
+    return dbname;
+}


I think it's not necessary to have a static variable here, because the guc
check doesn't seem performance sensitive. Additionaly, it does not work well
with slotsync SQL functions, because the dbname will be allocated in temp
memory context when reaching here via SQL function, so it's not safe to access
this static variable in next function call.

2.

+static bool
+validate_remote_info(WalReceiverConn *wrconn, int elevel)
...
+
+    return (!remote_in_recovery && primary_slot_valid);

The primary_slot_valid could be uninitialized in this function.

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: PGC_SIGHUP shared_buffers?
Next
From: Alexander Korotkov
Date:
Subject: Re: Removing unneeded self joins