Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From shveta malik
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAJpy0uAqCqVfygd3oV0bFFbrPu8b9yMMiM4JGqihz9iejs0n5w@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> ---
> +static char *
> +wait_for_valid_params_and_get_dbname(void)
> +{
> +   char       *dbname;
> +   int         rc;
> +
> +   /* Sanity check. */
> +   Assert(enable_syncslot);
> +
> +   for (;;)
> +   {
> +       if (validate_parameters_and_get_dbname(&dbname))
> +           break;
> +       ereport(LOG, errmsg("skipping slot synchronization"));
> +
> +       ProcessSlotSyncInterrupts(NULL);
>
> When reading this function, I expected that the slotsync worker would
> resume working once the parameters became valid, but it was not
> correct. For example, if I changed hot_standby_feedback from off to
> on, the slotsync worker reads the config file, exits, and then
> restarts. Given that the slotsync worker ends up exiting on parameter
> changes anyway, why do we want to have it wait for parameters to
> become valid? IIUC even if the slotsync worker exits when a parameter
> is not valid, it restarts at some intervals.

Thanks for the feedback Changed this functionality in v75. Now we do
not exit in wait_for_valid_params_and_get_dbname() on GUC change. We
re-validate the new values and if found valid, carry on with
slot-syncing else continue waiting.

> ---
> +bool
> +SlotSyncWorkerCanRestart(void)
> +{
> +#define SLOTSYNC_RESTART_INTERVAL_SEC 10
> +
>
> IIUC depending on how busy the postmaster is and the timing, the user
> could wait for 1 min to re-launch the slotsync worker. But I think the
> user might want to re-launch the slotsync worker more quickly for
> example when the slotsync worker restarts due to parameter changes.
> IIUC SloSyncWorkerCanRestart() doesn't consider the fact that the
> slotsync worker previously exited with 0 or 1.

Modified this in v75. As you suggested in [1], we reset
last_start_time on GUC change before proc_exit, so that the postmaster
restarts worker immediately without waiting.

> ---
> +       /* We are a normal standby */
> +       valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> +       Assert(!isnull);
>
> What do you mean by "normal standby"?
>
> ---
> +   appendStringInfo(&cmd,
> +                    "SELECT pg_is_in_recovery(), count(*) = 1"
> +                    " FROM pg_replication_slots"
> +                    " WHERE slot_type='physical' AND slot_name=%s",
> +                    quote_literal_cstr(PrimarySlotName));
>
> I think we need to make "pg_replication_slots" schema-qualified.

Modified.

> ---
> +                   errdetail("The primary server slot \"%s\" specified by"
> +                             " \"%s\" is not valid.",
> +                             PrimarySlotName, "primary_slot_name"));
>
> and
>
> +               errmsg("slot sync worker will shutdown because"
> +                      " %s is disabled", "enable_syncslot"));
>
> It's better to write it in one line for better greppability.

Modified.

[1]: https://www.postgresql.org/message-id/CAD21AoAv6FwZ6UPNTj6%3D7A%2B3O2m4utzfL8ZGS6X1EGexikG66A%40mail.gmail.com

thanks
Shveta



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Shinya Kato
Date:
Subject: Re: Fix bugs not to discard statistics when changing stats_fetch_consistency