Re: fcntl(SETLK) [was Re: 2nd update on TOAST] - Mailing list pgsql-hackers

From Tom Lane
Subject Re: fcntl(SETLK) [was Re: 2nd update on TOAST]
Date
Msg-id 25483.963078821@sss.pgh.pa.us
In response to Re: fcntl(SETLK) [was Re: 2nd update on TOAST]  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Peter Eisentraut <peter_e@gmx.net> writes:
> It seems that that would completely reverse the assumption of risk.
> Currently, the postmaster may fail to start because there's a stale socket
> file lying around, out of respect to a running colleague. With this idea
> it would be the running postmaster's job to "defend" his socket against
> newly starting colleagues. That doesn't seem fair.

True, it would reverse the most probable failure mode, but I'm not sure
that's a bad thing.

> The other problem is a socket file left behind by a crashed postmaster. I
> don't consider this such a big problem; a crashed postmaster is not the
> normal mode of operation. The friendly message we have right now seems
> alright to me. And it's a way to tell that the postmaster crashed at all.

No, actually this is a *big* problem.  That friendly message is no help
to a system boot script that can't read it (the same point you've made
repeatedly w.r.t. configure issues; surprised you don't see it here).

If I do a fast shutdown of my Unix system (the kind where shutdown does
a 'kill -9' on all user processes --- on HPUX systems this is invoked by
hitting the power switch or by the power supply overtemperature sensor)
then the postmaster doesn't get a chance to clean out its socket file.
After reboot, the postmaster fails to start up until I manually
intervene by removing the socket file.  That's not robust and not
acceptable.

The way I currently get around this (and I believe it's a pretty popular
thing to do) is that my postmaster-start script unconditionally deletes
the socket file before launching the postmaster.  That's actually far
riskier than what we are discussing, because there is *no* safety check
for an already-started postmaster.  A connection check would be a big
improvement.
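
To make the connection check concrete, here is a minimal sketch in
Python. It is not taken from the postmaster or from any existing start
script; the function names are hypothetical, and it assumes that a
refused connect() means no process is listening on the socket file.
Any other error (a timeout, a permissions failure) is deliberately left
to propagate rather than being treated as proof that the file is stale.

```python
import os
import socket
import stat


def socket_file_state(path, timeout=1.0):
    """Classify a Unix socket path as 'absent', 'live', or 'stale'.

    Hypothetical helper for illustration only.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "absent"
    if not stat.S_ISSOCK(mode):
        raise ValueError(f"{path} exists but is not a socket")

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        probe.connect(path)
    except ConnectionRefusedError:
        # The file exists but nothing is accepting connections on it.
        return "stale"
    finally:
        probe.close()
    return "live"


def remove_if_stale(path):
    """Unlink the socket file only when no listener answers on it."""
    state = socket_file_state(path)
    if state == "stale":
        os.unlink(path)
    return state
```

A start script could call remove_if_stale() and refuse to launch when
the result is "live". Note that there is still a window between the
probe and the unlink, so this narrows the risk rather than removing it.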

I consider failure-to-start during normal system bootup to be a far
graver risk than the possibility that a second postmaster will usurp
a first postmaster's Unix socket --- especially since the latter could
only happen if the first postmaster isn't answering connections, in
which case allowing it to keep the socket is of dubious value anyhow.
So reversing the presumption of innocence seems like a good idea to me.

> ... The solution to this is to make the path of the socket file
> configurable more easily so that the administrator has the choice of
> putting it in a safer place that he prepared appropriately.

We talked about that in the original discussion (you might want to
review the flock pghackers thread from late August '98).  The trouble is
that the socket file path is a critical part of the client-to-postmaster
protocol: change the path, and existing clients don't know where to
connect.  Oops.  So even though /tmp is obviously a pretty bogus place
to keep the socket, the compatibility headaches of moving it are so
great that no one really wants to bite the bullet.

We talked about compromises like keeping the real socket in some safer
directory, with a symlink from /tmp for old clients, and I think that's
what will happen eventually.  But please note that if the socket file
path is "easily configurable" then the same problem comes right back
to bite you again. It's *not* "easy" to change your mind about where
the socket files live; on any given platform that decision had better be
graven on stone tablets, because you want all your clients of whatever
vintage to be able to find your postmaster(s).  I'm inclined to think
that a configure option might be counterproductive --- nailing it down
in the per-OS template file seems much less likely to get screwed up.
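
As a rough illustration of the symlink compromise, the following Python
sketch points a legacy path at a socket kept elsewhere. Nothing here is
an existing PostgreSQL facility: the function name and both paths are
hypothetical, and it assumes old clients simply connect() to the legacy
path and have the symlink resolved for them by the kernel.

```python
import os


def link_legacy_socket(real_socket, legacy_path):
    """Make legacy_path a symlink to real_socket.

    Hypothetical helper. Returns True if a link was created, False if
    the correct link was already in place.
    """
    if os.path.islink(legacy_path):
        if os.readlink(legacy_path) == real_socket:
            return False  # already points at the right place
        os.unlink(legacy_path)  # replace a link to some other target
    elif os.path.lexists(legacy_path):
        # A real file or socket is in the way; do not clobber it.
        raise FileExistsError(f"{legacy_path} exists and is not a symlink")
    os.symlink(real_socket, legacy_path)
    return True
```

The refusal to overwrite a non-symlink is a deliberate choice here: a
real socket at the legacy path may belong to a running postmaster.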

The major problem with a hard-wired socket path that's not /tmp is
that you can't install the socket directory if you're not root, so the
ability to fire up a postmaster with no root privs whatever would no
longer exist.  We could get around that if it were possible to run with
only TCP connection support, making Unix-domain connections an option
instead of the base requirement.

> A complementary solution is of course to add an option to run without Unix
> socket, since we don't rely on the socket file for data directory locking
> anymore. In fact, does anybody mind if I add such an option? We can have
> tcpip_socket = yes|no
> unix_socket = yes|no

Yup, it would make a lot of sense to have an option for no Unix socket
connections (we already have that as an #ifdef for a couple of platforms
with no Unix socket support, but not as a postmaster start-time choice).

> (Security-conscious users may choose to turn off both. :-))

Uh, not at the moment, because we use the port interlock(s) as a proxy
for a shared-memory interlock.  Really there are three resources that
we must prevent concurrent postmasters from sharing:
* data directory;
* listen port number;
* shared-memory blocks (and semaphore sets).
 
We have a good solution in place now for locking the data directory, but
the port interlock still needs work.  Currently we use the port number
to assign shmem/sema keys, and there is no separate interlock to guard
against shmem conflicts.  I believe we had a discussion a few months ago
about rejiggering the shmem key assignment method so that shmem
conflicts would be detected and dealt with cleanly --- might be a good
idea to make that happen before we go too far with port interlock
changes.
        regards, tom lane

