RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Synchronizing slots from primary to standby
Date
Msg-id OS0PR01MB57167608D8CB5152F25EA1DD9483A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (Ajin Cherian <itsajin@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Thursday, November 23, 2023 6:06 PM Ajin Cherian <itsajin@gmail.com> wrote:
> 
> On Tue, Nov 21, 2023 at 8:32 PM shveta malik <shveta.malik@gmail.com>
> wrote:
> >
> > v37 fails to apply to HEAD due to a recent commit e83aa9f92fdd,
> > rebased the patches.  PFA v37_2 patches.
> 
> Thanks for the patch. Some comments:

Thanks for the comments.

> subscriptioncmds.c:
> CreateSubscription()
> and tablesync.c:
> process_syncing_tables_for_apply()
>                  walrcv_create_slot(wrconn, opts.slot_name, false,
> twophase_enabled,
> -                                   CRS_NOEXPORT_SNAPSHOT, NULL);
> -
> -                if (twophase_enabled)
> -                    UpdateTwoPhaseState(subid,
> LOGICALREP_TWOPHASE_STATE_ENABLED);
> -
> +                                   failover_enabled,
> CRS_NOEXPORT_SNAPSHOT, NULL);
> 
> either here or in libpqrcv_create_slot(), shouldn't you check the remote server
> version if it supports the failover flag?

I think we expect the create slot to fail if the server doesn't support failover.
The same is true for two_phase option.

> 
> 
> +
> +            /*
> +             * If only the slot_name is specified, it is possible
> that the user intends to
> +             * use an existing slot on the publisher, so here we
> enable failover for the
> +             * slot if requested.
> +             */
> +            else if (opts.slot_name && failover_enabled)
> +            {
> +                walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
> +                ereport(NOTICE,
> +                        (errmsg("enabled failover for replication
> slot \"%s\" on publisher",
> +                                opts.slot_name)));
> +            }
> 
> Here, the code only alters the slot if failover = true. You could use "else if
> (opts.slot_name && IsSet(opts.specified_opts, SUBOPT_FAILOVER)" to check if
> the failover flag is specified and alter for failover=false as well. 

Adjusted.

> Also, shouldn't
> you check for the server version if the command ALTER_REPLICATION_SLOT is
> supported?

Similar to create_slot, we expect the command fail if the target server doesn't
support failover the user specified failover = true.

> 
> slot.c:
> ReplicationSlotAlter()
> 
> +void
> +ReplicationSlotAlter(const char *name, bool failover) {
> +    Assert(MyReplicationSlot == NULL);
> +
> +    ReplicationSlotAcquire(name, true);
> +
> +    if (SlotIsPhysical(MyReplicationSlot))
> +        ereport(ERROR,
> +                errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                errmsg("cannot use %s with a physical replication slot",
> +                       "ALTER_REPLICATION_SLOT"));
> 
> shouldn't you release the slot by calling ReplicationSlotRelease before
> erroring out?

No, I think the release of the replication slot will be handled by either
WalSndErrorCleanup, ReplicationSlotShmemExit, or the ReplicationSlotRelease
call in PostgresMain().

> 
> slot.c:
> +/*
> + * A helper function to validate slots specified in standby_slot_names GUCs.
> + */
> +static bool
> +validate_standby_slots(char **newval)
> +{
> +    char       *rawname;
> +    List       *elemlist;
> +    ListCell   *lc;
> +
> +    /* Need a modifiable copy of string */
> +    rawname = pstrdup(*newval);
> 
> rawname is not always freed.

The code has been changed due to other comments.

> 
> launcher.c:
> 
> +    SlotSyncWorker->hdr.proc = MyProc;
> +
> +    before_shmem_exit(slotsync_worker_detach, (Datum) 0);
> +
> +    LWLockRelease(SlotSyncWorkerLock);
> +}
> 
> before_shmem_exit() can error out leaving the lock acquired. Maybe you
> should release the lock prior to calling before_shmem_exit() because you don't
> need the lock there.

This has been fixed.

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Synchronizing slots from primary to standby
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node