Re: Weaker shmem interlock w/o postmaster.pid - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Weaker shmem interlock w/o postmaster.pid
Date
Msg-id 20130911153200.GD2706@tamriel.snowman.net
In response to Weaker shmem interlock w/o postmaster.pid  (Noah Misch <noah@leadboat.com>)
Responses Re: Weaker shmem interlock w/o postmaster.pid
List pgsql-hackers
* Noah Misch (noah@leadboat.com) wrote:
> Shouldn't it instead check the same things as PGSharedMemoryIsInUse()?

Offhand, I tend to agree that we should be very careful about checking
whether an existing segment is still in use.
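
For illustration, here is a minimal sketch of what an "is this segment still in use" test can look like with the System V shared memory API. It is not PostgreSQL's PGSharedMemoryIsInUse(); the helper name segment_in_use() is hypothetical, and treating unexpected shmctl() failures as "in use" is an assumption made here to err on the safe side.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper, not PostgreSQL code: report whether the System V
 * shared memory segment identified by shmid still has processes attached.
 */
static bool
segment_in_use(int shmid)
{
    struct shmid_ds stat;

    if (shmctl(shmid, IPC_STAT, &stat) < 0)
    {
        /* EINVAL or EIDRM: the segment no longer exists, so it is unused. */
        if (errno == EINVAL || errno == EIDRM)
            return false;

        /* Assumption: on any other error, conservatively report "in use". */
        return true;
    }

    /* shm_nattch counts the processes currently attached to the segment. */
    return stat.shm_nattch > 0;
}
```

A startup path built on a check like this would refuse to proceed while the attach count is nonzero, which is the case a surviving backend would produce.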

> The concrete situation in which I encountered this involved PostgreSQL 9.2 and
> an immediate shutdown with a backend that had blocked SIGQUIT.  The backend
> survived the immediate shutdown as one would expect.

Well..  We expect this now because of the analysis you did in the
adjacent thread showing how it can happen.

> The postmaster
> nonetheless removed postmaster.pid before exiting, and I could immediately
> restart PostgreSQL despite the survival of the SIGQUIT-blocked backend.  If I
> instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
> backends before restarting.  The postmaster should not remove postmaster.pid
> unless it has verified that its children have exited.

This makes sense, however..

> Concretely, that means
> not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
> consistent with the rough nature of an immediate shutdown, anyway.

I don't like leaving the postmaster.pid file around, even on an
immediate shutdown.  I don't have any great suggestions regarding what
to do, given what we try to do wrt 'immediate', so perhaps it's
acceptable for future releases.

> I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> versions, but I'm less sure about back-patching a change to make
> PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> with backends still active in the same data directory is a corruption hazard.

The corruption risk, imv anyway, is sufficient to backpatch the change
and overrides the concerns around very fast shutdown/restarts.

Thanks,
    Stephen
