Thread: RE: crash with synchronized_standby_slots

From: "Zhijie Hou (Fujitsu)"
Date:
On Thursday, November 28, 2024 8:16 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> 
> Gabriele just reported a crash when changing synchronized_standby_slots
> under SIGHUP and logging collector working.  The problem apparently is
> that validate_sync_standby_slots is run for the GUC check routine, and
> it requires acquiring an LWLock; but syslogger doesn't have a PGPROC so
> that doesn't work.
> 
> Apparently we already have a hack in place against the same problem in
> postmaster (and elsewhere?), by testing that ReplicationSlotCtl is not
> null.  But that hack seems to be incomplete, as the crash here attests.
> 
> To reproduce, simply start with no synchronized_standby_slots setting
> and logging_collector=on, then change the value in postgresql.conf and
> reload.
> 
> One option to fix this might be as attached.  The point here is that
> processes without a PGPROC don't need or care to have a correct setting,
> so we just skip verifying it on those.  AFAICS this is more general than
> the test for ReplicationSlotCtl, so I just removed that one.
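
The approach described in the quoted text can be illustrated with a small
standalone sketch. This is not the attached patch and not PostgreSQL source:
PGPROC and MyProc here are simplified stand-ins, and check_slots_setting,
validate_slots_under_lock, and lock_acquisitions are hypothetical names used
only for illustration. It shows a check routine that returns early in a
process without a PGPROC, so the lock-taking validation is never reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for PostgreSQL's PGPROC (the real struct is much larger). */
typedef struct PGPROC
{
    int         pgprocno;
} PGPROC;

/* Stand-in for the per-process pointer; NULL models a process such as syslogger. */
static PGPROC *MyProc = NULL;

/* Hypothetical counter, only here to make the lock path observable. */
static int  lock_acquisitions = 0;

/*
 * Hypothetical placeholder for the validation step that needs an LWLock.
 * The assert models the original problem: taking the lock without a PGPROC
 * cannot work.
 */
static bool
validate_slots_under_lock(const char *newval)
{
    assert(MyProc != NULL);
    lock_acquisitions++;
    return newval[0] != '\0';
}

/* Hypothetical check routine following the idea in the quoted description. */
static bool
check_slots_setting(const char *newval)
{
    /* An empty setting needs no validation. */
    if (newval == NULL || newval[0] == '\0')
        return true;

    /* Processes without a PGPROC do not need a verified value: skip. */
    if (MyProc == NULL)
        return true;

    return validate_slots_under_lock(newval);
}
```

With MyProc left as NULL, check_slots_setting("standby_1") returns true without
touching the lock path; once MyProc points at a PGPROC, the same call goes
through validate_slots_under_lock.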

Thanks for the fix! It looks good to me.

I was also able to reproduce this bug, and I confirmed that it is fixed after
applying the patch. In addition to running the regression tests, I manually tested
the behavior of the postmaster, walsender, and user backend after reloading the
configuration, and they all work as expected.

Best Regards,
Hou zj