Weaker shmem interlock w/o postmaster.pid - Mailing list pgsql-hackers

From Noah Misch
Subject Weaker shmem interlock w/o postmaster.pid
Msg-id 20130911033341.GD225735@tornado.leadboat.com
List pgsql-hackers

If a starting postmaster's CreateLockFile() finds an existing postmaster.pid,
it subjects the shared memory segment named therein to the careful scrutiny of
PGSharedMemoryIsInUse().  If that segment matches the current data directory
and has any attached processes, we bail with the "pre-existing shared memory
block ... is still in use" error.  When the postmaster.pid file is missing,
there's inherently less we can do to reliably detect this situation; in
particular, an old postmaster could have chosen an unusual key due to the
usual 1+(port*1000) key being in use.  That being said, PGSharedMemoryCreate()
typically will stumble upon the old segment, and it (its sysv variant, anyway)
applies checks much weaker than those of PGSharedMemoryIsInUse().  If the
segment has a PGShmemHeader and the postmaster PID named in that header is not
alive, PGSharedMemoryCreate() will delete the segment and proceed.  Shouldn't
it instead check the same things as PGSharedMemoryIsInUse()?
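
To make the contrast concrete, here is a minimal C sketch of the two kinds of test discussed above: the weaker "is the creator PID alive?" probe and the stronger "does anything remain attached?" probe. It is an illustration under stated assumptions, not the PostgreSQL source. The helper names (default_shmem_key, creator_pid_is_alive, segment_attach_count, safe_to_recycle) are hypothetical, and the sketch leaves out the data-directory match that PGSharedMemoryIsInUse() also performs.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>

/* First key a postmaster usually tries, per the message: 1 + (port * 1000). */
static long default_shmem_key(int port)
{
    return 1L + (long) port * 1000L;
}

/*
 * Weaker test: is the PID recorded in the segment header still alive?
 * Signal 0 delivers nothing; it only probes for existence.  Any failure
 * other than ESRCH (e.g. EPERM) is treated as "alive" to stay conservative.
 */
static int creator_pid_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno != ESRCH;
}

/*
 * Stronger test: how many processes are attached to the segment?
 * Returns the attach count, or -1 if the segment cannot be examined.
 */
static long segment_attach_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    return (long) ds.shm_nattch;
}

/*
 * Hypothetical policy: recycle a segment only when its creator is gone AND
 * no process remains attached.  An unexaminable segment (-1) is not recycled.
 */
static int safe_to_recycle(pid_t creator, int shmid)
{
    return !creator_pid_is_alive(creator) && segment_attach_count(shmid) == 0;
}
```

A backend that outlives its postmaster keeps the segment attached, so the attach count stays above zero even after the creator PID is gone. That is the case the PID-only test misses.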

The concrete situation in which I encountered this involved PostgreSQL 9.2 and
an immediate shutdown with a backend that had blocked SIGQUIT.  The backend
survived the immediate shutdown as one would expect.  The postmaster
nonetheless removed postmaster.pid before exiting, and I could immediately
restart PostgreSQL despite the survival of the SIGQUIT-blocked backend.  If I
instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
backends before restarting.  The postmaster should not remove postmaster.pid
unless it has verified that its children have exited.  Concretely, that means
not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
consistent with the rough nature of an immediate shutdown, anyway.

I'm inclined to preserve postmaster.pid at immediate shutdown in all released
versions, but I'm less sure about back-patching a change to make
PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
with backends still active in the same data directory is a corruption hazard.
On the other hand, it could break weird shutdown/restart patterns that permit
trivial lifespan overlap between backends of different postmasters.  Opinions?

Thanks,
nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com


