Re: fcntl(SETLK) [was Re: 2nd update on TOAST] - Mailing list pgsql-hackers

From Tom Lane
Subject Re: fcntl(SETLK) [was Re: 2nd update on TOAST]
Msg-id 25302.963072829@sss.pgh.pa.us
In response to Re: fcntl(SETLK) [was Re: 2nd update on TOAST]  (Mike Mascari <mascarm@mascari.com>)
List pgsql-hackers
Mike Mascari <mascarm@mascari.com> writes:
> I don't get this. Isn't there a race condition here?

Strictly speaking, there is, but the race window is only a couple
of kernel calls wide, and as Bruce pointed out we do not need something
that is absolutely gold-plated bulletproof.  We are just trying to
prevent dbadmins from accidentally starting two postmasters on the
same port number.

The way this would work is that pqcomm.c would do something like
    if (socketFileAlreadyExists) {
        try to open connection to existing postmaster;
        if (successful) {
            report port conflict and die;
        }
        delete existing socket file;
    }
    bind(socket);  // kernel creates new socket file here
    listen();

The race condition here is that if newly-started postmaster A has
executed bind() but not yet listen(), then newly-started postmaster B
could come along, observe the existing socket file, try to open
connection, fail, delete socket file, proceed.  AFAIK B will be allowed
to bind() and create a new socket file, and A ends up listening to a
port that's lost in hyperspace --- no one else can ever connect to it
because it has no visible representative in the filesystem.

But as soon as A has executed listen() it's safe --- even though it's
not really ready to accept connections yet, the attempted connect from
B will wait till it does.  (We should, therefore, use a plain vanilla
connect attempt for the probe --- no non-blocking connect or anything
fancy.)

The bind-to-listen delay in pqcomm.c is currently several lines long,
but there's no reason the two couldn't be successive kernel calls with
nothing but a test for bind() failure between them.
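
For concreteness, here is a rough sketch of the probe-then-bind sequence
in C. It is an illustration under stated assumptions, not the code in
pqcomm.c: the function names (fill_addr, socket_is_live, claim_socket),
the return codes, and the listen backlog of 5 are all invented for this
example. It uses a plain blocking connect() for the probe and calls
listen() immediately after bind(), so the race window stays a couple of
kernel calls wide. The window is still there; the sketch only narrows it.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a Unix-domain address; returns -1 if the path does not fit. */
static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(path != NULL);
    if (strlen(path) >= sizeof(addr->sun_path))
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Probe with a plain blocking connect(): 1 = someone is listening,
 * 0 = connect failed (probably a stale file), -1 = could not even try. */
static int socket_is_live(const char *path)
{
    struct sockaddr_un addr;
    int fd, rc;

    if (fill_addr(&addr, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    rc = connect(fd, (struct sockaddr *) &addr, sizeof(addr));
    close(fd);
    return (rc == 0) ? 1 : 0;
}

/* Hypothetical entry point.  Returns a listening fd on success,
 * -2 if a live server already owns the socket file, -1 on other errors. */
int claim_socket(const char *path)
{
    struct sockaddr_un addr;
    struct stat st;
    int fd;

    if (fill_addr(&addr, path) < 0)
        return -1;

    if (stat(path, &st) == 0) {
        if (socket_is_live(path) == 1)
            return -2;          /* port conflict: caller should report and die */
        unlink(path);           /* nobody answered: treat the file as stale */
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* bind() creates the socket file; listen() follows with nothing
     * in between except the test for bind() failure. */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would treat -2 as "another postmaster is already running on this
port" and exit, and anything else negative as an ordinary startup failure.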

That strikes me as plenty close enough...
        regards, tom lane

