Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAA4eK1++T5K1SKE_=y8PPZa-VxkFrRPA8XjbttUCx9VpVXS-yw@mail.gmail.com Whole thread Raw |
In response to | RE: Synchronizing slots from primary to standby ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses |
RE: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
On Thu, Nov 2, 2023 at 2:35 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > Here is the new version patch set(V29) which addressed Peter comments[1][2] and > fixed one doc compile error. > Few comments: ============== 1. + <varlistentry id="sql-createsubscription-params-with-failover"> + <term><literal>failover</literal> (<type>boolean</type>)</term> + <listitem> + <para> + Specifies whether the replication slot assocaited with the subscription + is enabled to be synced to the physical standbys so that logical + replication can be resumed from the new primary after failover. + The default is <literal>true</literal>. Why do you think it is a good idea to keep the default value as true? I think the user needs to enable standby for syncing slots which is not a default feature, so by default, the failover property should also be false. AFAICS, it is false for create_slot SQL API as per the below change; so that way also keeping default true for a subscription doesn't make sense. @@ -479,6 +479,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot( IN slot_name name, IN plugin name, IN temporary boolean DEFAULT false, IN twophase boolean DEFAULT false, + IN failover boolean DEFAULT false, OUT slot_name name, OUT lsn pg_lsn) BTW, the below change indicates that the code treats default as false; so, it seems to be a documentation error. @@ -157,6 +158,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options, opts->runasowner = false; if (IsSet(supported_opts, SUBOPT_ORIGIN)) opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY); + if (IsSet(supported_opts, SUBOPT_FAILOVER)) + opts->failover = false; 2. - /* * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands. * Spurious line removal. 3. + else if (opts.slot_name && failover_enabled) + { + walrcv_alter_slot(wrconn, opts.slot_name, opts.failover); + ereport(NOTICE, + (errmsg("altered replication slot \"%s\" on publisher", + opts.slot_name))); + } I think we can add a comment to describe why it makes sense to enable the failover property of the slot in this case. Can we change the notice message to: "enabled failover for replication slot \"%s\" on publisher" 4. libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname, - bool temporary, bool two_phase, CRSSnapshotAction snapshot_action, - XLogRecPtr *lsn) + bool temporary, bool two_phase, bool failover, + CRSSnapshotAction snapshot_action, XLogRecPtr *lsn) { PGresult *res; StringInfoData cmd; @@ -913,7 +917,14 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname, else appendStringInfoChar(&cmd, ' '); } - + if (failover) + { + appendStringInfoString(&cmd, "FAILOVER"); + if (use_new_options_syntax) + appendStringInfoString(&cmd, ", "); + else + appendStringInfoChar(&cmd, ' '); + } I don't see a corresponding change in repl_gram.y. I think the following part of the code needs to be changed: /* CREATE_REPLICATION_SLOT slot [TEMPORARY] LOGICAL plugin [options] */ | K_CREATE_REPLICATION_SLOT IDENT opt_temporary K_LOGICAL IDENT create_slot_options You also need to update the docs for the same. See [1]. 5. @@ -228,6 +230,28 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin NameStr(MyReplicationSlot->data.plugin), format_procedure(fcinfo->flinfo->fn_oid)))); .. + if (XLogRecPtrIsInvalid(upto_lsn)) + wal_to_wait = end_of_wal; + else + wal_to_wait = Min(upto_lsn, end_of_wal); + + /* Initialize standby_slot_names_list */ + SlotSyncInitConfig(); + + /* + * Wait for specified streaming replication standby servers (if any) + * to confirm receipt of WAL upto wal_to_wait. + */ + WalSndWaitForStandbyConfirmation(wal_to_wait); + + /* + * The memory context used to allocate standby_slot_names_list will be + * freed at the end of this call. So free and nullify the list in + * order to avoid usage of freed list in the next call to this + * function. + */ + SlotSyncFreeConfig(); What if there is an error in WalSndWaitForStandbyConfirmation() before calling SlotSyncFreeConfig()? I think the problem you are trying to avoid by freeing it here can occur. I think it is better to do this in a logical decoding context and free the list along with it as we are doing in commit c7256e6564(see PG15). Also, it is better to allocate this list somewhere in WalSndWaitForStandbyConfirmation(), probably in WalSndGetStandbySlots, that will make the code look neat and also avoid allocating this list when failover is not enabled for the slot. 6. +/* ALTER_REPLICATION_SLOT slot */ +alter_replication_slot: + K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')' I think you need to update the docs for this new command. See existing docs [1]. [1] - https://www.postgresql.org/docs/devel/protocol-replication.html -- With Regards, Amit Kapila.
pgsql-hackers by date: