At Fri, 12 Apr 2024 09:10:35 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> > The test doesn't fail because pg_terminate_backend actually achieves
> > its goal: autovac is killed. But while dying, autovac also gets a
> > segfault. That's because of injection points:
> >
> > (gdb) bt
> > #0 0x000056361c3379ea in tas (lock=0x7fbcb9632224 <error: Cannot
> > access memory at address 0x7fbcb9632224>) at
> > ../../../../src/include/storage/s_lock.h:228
> > #1 ConditionVariableCancelSleep () at condition_variable.c:238
...
> > #3 0x000056361c330a40 in CleanupProcSignalState (status=<optimized
> > out>, arg=<optimized out>) at procsignal.c:240
> > #4 0x000056361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
> > #9 0x000056361c3378d7 in ConditionVariableTimedSleep
> > (cv=0x7fbcb9632224, timeout=timeout@entry=-1,
...
> I can see this stack trace as well. Capturing a bit more than your
> own stack, this is crashing in the autovacuum worker while waiting on
> a condition variable when processing a ProcessInterrupts().
>
> That may point to a legit bug with condition variables in this
> context, actually? From what I can see, issuing a signal on a backend
> process waiting with a condition variable is able to process the
> interrupt correctly.

ProcSignalInit registers CleanupProcSignalState to be called via
on_shmem_exit. If the CV is allocated in a DSM segment, shmem_exit
has already detached that segment by the time the on_shmem_exit
callbacks run, so the CV is no longer accessible at that point. The
CV cleanup code should instead be invoked via before_shmem_exit.
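
To illustrate the ordering, here is a toy model. It is not PostgreSQL
code, and every toy_* name is made up for this mail. It assumes that
shmem_exit runs the before_shmem_exit callbacks first, then detaches
DSM segments, and only then runs the on_shmem_exit callbacks:

```c
/*
 * Toy model of the callback ordering in shmem_exit().
 * Not PostgreSQL code; all toy_* names are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_callback) (void);

static bool toy_dsm_attached = true;   /* stands in for the mapped DSM segment */
static bool toy_cv_sleeping = true;    /* backend is still on the CV wait list */
static int  toy_bad_accesses = 0;      /* touches of the CV after detach */

/* Stand-in for CV sleep cancellation: it has to touch the CV's memory. */
static void
toy_cancel_sleep(void)
{
    if (!toy_cv_sleeping)
        return;                 /* nothing to cancel, CV is not touched */
    if (!toy_dsm_attached)
        toy_bad_accesses++;     /* the real backend would segfault here */
    toy_cv_sleeping = false;
}

/*
 * Three phases in the assumed order: early callbacks, DSM detach, late
 * callbacks.  Returns how many times the CV was touched after detach.
 */
static int
toy_shmem_exit(toy_callback before_cb, toy_callback on_cb)
{
    /* reset the toy state so the function can be called repeatedly */
    toy_dsm_attached = true;
    toy_cv_sleeping = true;
    toy_bad_accesses = 0;

    if (before_cb != NULL)
        before_cb();            /* before_shmem_exit phase */
    toy_dsm_attached = false;   /* DSM segments are detached here */
    if (on_cb != NULL)
        on_cb();                /* on_shmem_exit phase */

    assert(!toy_dsm_attached);
    return toy_bad_accesses;
}
```

In this model, toy_shmem_exit(NULL, toy_cancel_sleep) reports one
access to the detached CV, while toy_shmem_exit(toy_cancel_sleep, NULL)
reports none, which is the reason for moving the CV cleanup to
before_shmem_exit.
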
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center