Re: CreateLockFile() race condition - Mailing list pgsql-hackers

From Tom Lane
Subject Re: CreateLockFile() race condition
Date
Msg-id 18481.1344009540@sss.pgh.pa.us
In response to CreateLockFile() race condition  (Noah Misch <noah@leadboat.com>)
Responses Re: CreateLockFile() race condition
Re: CreateLockFile() race condition
List pgsql-hackers

Noah Misch <noah@leadboat.com> writes:
> The problem here is a race between concluding the assessment of a PID file as
> defunct and unlinking it; during that period, another postmaster may have
> replaced the PID file and proceeded.  As far as I've been able to figure, this
> flaw is fundamental to any PID file invalidation algorithm relying solely on
> atomic filesystem operations like unlink(2), link(2), rename(2) and small
> write(2) for mutual exclusion.  Do any of you see a way to remove the race?

Nasty.  Still, the issue only exists for two postmasters launched at
just about exactly the same time, which is an unlikely case.
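To make the window concrete, here is a minimal single-process sketch of the interleaving Noah describes. It is not the CreateLockFile() code; the helper names (read_pid, write_pid, pidfile_is_stale, demo_stale_unlink_race) are hypothetical, and the two "postmasters" are simulated by sequencing their steps by hand rather than by running two real processes.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the PID recorded in a lock file; returns -1 if it cannot be read. */
static long read_pid(const char *path)
{
    FILE *f = fopen(path, "r");
    long pid = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &pid) != 1)
        pid = -1;
    fclose(f);
    return pid;
}

/* Create the lock file exclusively and record a PID in it. */
static int write_pid(const char *path, long pid)
{
    char buf[32];
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    int len, ok;

    if (fd < 0)
        return -1;
    len = snprintf(buf, sizeof buf, "%ld\n", pid);
    ok = (write(fd, buf, (size_t) len) == len);
    close(fd);
    return ok ? 0 : -1;
}

/* Assessment step: does the file name a process that no longer exists? */
static int pidfile_is_stale(const char *path)
{
    long pid = read_pid(path);

    if (pid <= 0)
        return 1;
    return kill((pid_t) pid, 0) != 0 && errno == ESRCH;
}

/* Returns 1 if "postmaster B"'s freshly created lock file was destroyed. */
int demo_stale_unlink_race(const char *path)
{
    pid_t dead;
    int a_thinks_stale;

    /* Leave behind a file naming a process that has already exited. */
    unlink(path);
    dead = fork();
    if (dead == 0)
        _exit(0);
    if (dead > 0)
        waitpid(dead, NULL, 0);
    write_pid(path, (long) dead);

    /* Postmaster A: assess the file as defunct ... */
    a_thinks_stale = pidfile_is_stale(path);

    /* ... postmaster B runs in the window: same assessment, replaces file. */
    if (pidfile_is_stale(path))
    {
        unlink(path);
        write_pid(path, (long) getpid());
    }

    /* Postmaster A: act on the now-outdated conclusion. */
    if (a_thinks_stale)
        unlink(path);

    /* B believes it holds the lock file, but the file is gone. */
    return access(path, F_OK) != 0;
}
```

Calling demo_stale_unlink_race() on a scratch path should return 1: A's unlink() removes B's live file, because nothing ties the unlink to the file contents A actually examined.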

> I think we should instead implement postmaster mutual exclusion by way of
> fcntl(F_SETLK) on Unix and CreateFile(..., FILE_SHARE_READ, ...) on Windows.

I'm a bit worried about what new problems this solution is going to open
up.  The cure may well be worse than the disease.  Having locking that
actually works on (some) NFS setups would be nice, but ...
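For reference, the Unix half of the proposal could look roughly like the sketch below: take a non-blocking exclusive fcntl() lock on the PID file and keep the descriptor open for the life of the process. The function names (lock_pidfile, other_process_can_lock) are hypothetical and error handling is minimal; this illustrates the mechanism only, not a proposed patch.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Open (creating if needed) the file and take an exclusive advisory lock
 * on the whole file without blocking.  Returns the descriptor that must
 * stay open while the lock is needed, or -1 if the lock is unavailable.
 */
int lock_pidfile(const char *path)
{
    struct flock fl = {
        .l_type = F_WRLCK,      /* exclusive lock */
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0              /* 0 means "to end of file" */
    };
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETLK, &fl) == -1)
    {
        /* EACCES or EAGAIN: some other process holds a conflicting lock */
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Fork a child and report whether it could take the lock:
 * 1 = yes, 0 = no, -1 = fork/wait failure.
 */
int other_process_can_lock(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(lock_pidfile(path) >= 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

While the first process keeps the descriptor from lock_pidfile() open, other_process_can_lock() should report 0; once that descriptor is closed or the process exits, the kernel releases the lock and a second process can take it. Unlike the unlink-based scheme, there is no separate "decide it is stale, then remove it" step to race against.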

> The hazard[4] keeping fcntl locking from replacing the PGSharedMemoryIsInUse()
> check does not apply here, because the postmaster itself does not run
> arbitrary code that might reopen postmaster.pid.

False.  See shared_preload_libraries.
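A sketch of why loaded code matters, assuming the hazard in question is the POSIX rule that a process loses all of its fcntl() locks on a file as soon as it closes any descriptor for that file. The function hypothetical_library_peek stands in for whatever a preloaded library might do; it and the other names here are made up for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take a non-blocking exclusive lock on the whole file behind fd. */
static int set_write_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0
    };

    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Fork a child and report whether it can lock the file (1 yes, 0 no). */
static int child_can_lock(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        int fd = open(path, O_RDWR);

        _exit(fd >= 0 && set_write_lock(fd) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Stand-in for library code that merely opens and closes the file. */
static void hypothetical_library_peek(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd >= 0)
        close(fd);      /* this close drops the process's locks on the file */
}

/* Returns 1 if the lock held before the peek was silently lost after it. */
int demo_lock_lost_on_close(const char *path)
{
    int before, after;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0 || set_write_lock(fd) != 0)
        return -1;

    before = child_can_lock(path);      /* lock is held: child is refused */
    hypothetical_library_peek(path);    /* unrelated open()/close() pair */
    after = child_can_lock(path);       /* lock is gone: child succeeds */

    close(fd);
    return before == 0 && after == 1;
}
```

The original descriptor is still open after the peek, yet the lock is gone, and nothing tells the lock holder that this happened.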
        regards, tom lane

