On Thu, Nov 28, 2024 at 5:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Gabriele just reported a crash when changing synchronized_standby_slots
> under SIGHUP and logging collector working. The problem apparently is
> that validate_sync_standby_slots is run for the GUC check routine, and
> it requires acquiring an LWLock; but syslogger doesn't have a PGPROC so
> that doesn't work.
>
> Apparently we already have a hack in place against the same problem in
> postmaster (and elsewhere?), by testing that ReplicationSlotCtl is not
> null. But that hack seems to be incomplete, as the crash here attests.
>
I tried it on my Windows machine and noticed that ReplicationSlotCtl
is NULL for syslogger, so the problem doesn't occur. The reason is
that we don't attach to shared memory in syslogger, so ideally
ReplicationSlotCtl should be NULL. Because we inherit everything
through the fork for Linux systems and then later for processes that
don't want to attach to shared memory, we call PGSharedMemoryDetach()
from postmaster_child_launch(). The PGSharedMemoryDetach won't
reinitialize the memory pointed to by ReplicationSlotCtl, so, it would
be an invalid memory.
> To reproduce, simply start with no synchronized_standby_slots setting
> and logging_collector=on, then change the value in postgresql.conf and
> reload.
>
> One option to fix this might be as attached. The point here is that
> processes without a PGPROC don't need or care to have a correct setting,
> so we just skip verifying it on those. AFAICS this is more general than
> the test for ReplicationSlotCtl, so I just removed that one.
>
Yeah, I can't think of a better fix for this.
--
With Regards,
Amit Kapila.