On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> ---
> +static char *
> +wait_for_valid_params_and_get_dbname(void)
> +{
> + char *dbname;
> + int rc;
> +
> + /* Sanity check. */
> + Assert(enable_syncslot);
> +
> + for (;;)
> + {
> + if (validate_parameters_and_get_dbname(&dbname))
> + break;
> + ereport(LOG, errmsg("skipping slot synchronization"));
> +
> + ProcessSlotSyncInterrupts(NULL);
>
> When reading this function, I expected that the slotsync worker would
> resume working once the parameters became valid, but it was not
> correct. For example, if I changed hot_standby_feedback from off to
> on, the slotsync worker reads the config file, exits, and then
> restarts. Given that the slotsync worker ends up exiting on parameter
> changes anyway, why do we want to have it wait for parameters to
> become valid? IIUC even if the slotsync worker exits when a parameter
> is not valid, it restarts at some intervals.
Thanks for the feedback Changed this functionality in v75. Now we do
not exit in wait_for_valid_params_and_get_dbname() on GUC change. We
re-validate the new values and if found valid, carry on with
slot-syncing else continue waiting.
> ---
> +bool
> +SlotSyncWorkerCanRestart(void)
> +{
> +#define SLOTSYNC_RESTART_INTERVAL_SEC 10
> +
>
> IIUC depending on how busy the postmaster is and the timing, the user
> could wait for 1 min to re-launch the slotsync worker. But I think the
> user might want to re-launch the slotsync worker more quickly for
> example when the slotsync worker restarts due to parameter changes.
> IIUC SloSyncWorkerCanRestart() doesn't consider the fact that the
> slotsync worker previously exited with 0 or 1.
Modified this in v75. As you suggested in [1], we reset
last_start_time on GUC change before proc_exit, so that the postmaster
restarts worker immediately without waiting.
> ---
> + /* We are a normal standby */
> + valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> + Assert(!isnull);
>
> What do you mean by "normal standby"?
>
> ---
> + appendStringInfo(&cmd,
> + "SELECT pg_is_in_recovery(), count(*) = 1"
> + " FROM pg_replication_slots"
> + " WHERE slot_type='physical' AND slot_name=%s",
> + quote_literal_cstr(PrimarySlotName));
>
> I think we need to make "pg_replication_slots" schema-qualified.
Modified.
> ---
> + errdetail("The primary server slot \"%s\" specified by"
> + " \"%s\" is not valid.",
> + PrimarySlotName, "primary_slot_name"));
>
> and
>
> + errmsg("slot sync worker will shutdown because"
> + " %s is disabled", "enable_syncslot"));
>
> It's better to write it in one line for better greppability.
Modified.
[1]: https://www.postgresql.org/message-id/CAD21AoAv6FwZ6UPNTj6%3D7A%2B3O2m4utzfL8ZGS6X1EGexikG66A%40mail.gmail.com
thanks
Shveta