Re: Postmaster crashed during start - Mailing list pgsql-hackers

From Srinath Reddy
Subject Re: Postmaster crashed during start
Date
Msg-id CAFC+b6o3--zb9GXyh12W0-uM1vN4LgU+sZofdZ3rsA9tVNO6eA@mail.gmail.com
In response to Postmaster crashed during start  (Srinath Reddy <srinath2133@gmail.com>)
List pgsql-hackers


On Wed, Feb 26, 2025 at 9:50 AM Srinath Reddy <srinath2133@gmail.com> wrote:


On Wed, Feb 26, 2025 at 9:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Srinath Reddy <srinath2133@gmail.com> writes:
> when we kill postmaster using kill -9 and start immediately it crashes with
> FATAL:  pre-existing shared memory block (key 2495405, ID 360501) is still
> in use

"Doctor, it hurts when I do this!"

"So don't do that!"

This is not a supported way of shutting down the postmaster, and it
never will be.  Use SIGINT, or SIGQUIT if you are in a desperate
hurry and are willing to have the next startup take longer.

I was actually trying to recreate a power-outage scenario using node->kill9() followed by node->start() in a custom TAP test; that is how I found this crash.

I think the specific reason you are seeing this is that it takes
nonzero time for the postmaster's orphaned child processes to
notice that the postmaster is gone and terminate.  As long as
any of those children remain, the shared memory block will have
a nonzero reference count.  The new postmaster sees that and
refuses to start, for the very sound reason that it risks
data corruption if it brings up a new set of worker processes
while any of the old ones are still running.

                        regards, tom lane

I am guessing that by "reference count" of the shared memory block you mean shm_nattch, right? As I understand it, shm_nattch is incremented by 1 when a process attaches to the shmem segment using shmat(). In the Postgres case it is the postmaster that attaches, when it creates the shmem segment, and it detaches when its on_shmem_exit callbacks run. Those callbacks run only if the postmaster exits properly, not if it dies suddenly (as with kill -9), and shm_nattch is decremented by 1 only at detach.

AFAIK the child processes get to use the shmem segment but never attach or detach themselves, so they do not affect shm_nattch. So, since shm_nattch is not 0, PGSharedMemoryAttach thinks the segment is still attached and in use.
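
One way to check the guess about child processes is a small standalone C sketch (not PostgreSQL code) that reads shm_nattch through shmctl(IPC_STAT) around a fork(). The names attach_count, trace_nattch and struct nattch_trace are made up for this illustration. It assumes a Linux-like System V shared memory implementation; shmat(2) documents that a forked child inherits the parent's attached segments, so the counter is expected to include the child even though the child never calls shmat() itself. The parent's shmdt() here is only a stand-in for the postmaster going away.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

/* Attach counts observed at each step of the experiment. */
struct nattch_trace {
    long after_attach;         /* parent has called shmat() */
    long after_fork;           /* child exists, has not called shmat() */
    long after_parent_detach;  /* parent detached, child still alive */
    long after_child_exit;     /* child killed and reaped */
};

/* Return the kernel's shm_nattch for the segment, or -1 on error. */
long attach_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    return (long) ds.shm_nattch;
}

/* Run the experiment; returns 0 on success, -1 if a system call failed. */
int trace_nattch(struct nattch_trace *t)
{
    int   shmid;
    void *addr;
    pid_t child;

    assert(t != NULL);

    shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    t->after_attach = attach_count(shmid);

    child = fork();
    if (child < 0) {
        shmdt(addr);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (child == 0) {
        /* Child: never calls shmat(); just stays alive for a while. */
        sleep(30);
        _exit(0);
    }
    t->after_fork = attach_count(shmid);

    /* Stand-in for the parent disappearing: drop its attachment. */
    shmdt(addr);
    t->after_parent_detach = attach_count(shmid);

    /* Once the child is gone, nothing should remain attached. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    t->after_child_exit = attach_count(shmid);

    /* Remove the segment only after the last reading. */
    shmctl(shmid, IPC_RMID, NULL);
    return 0;
}
```

Calling trace_nattch() from a small main() and printing the four fields should show whether the count stays nonzero while the child is alive. If fork() inheritance works as shmat(2) describes, that would be consistent with the explanation quoted above that orphaned children keep the count above zero until they exit.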

