Re: Proposal to add a QNX 6.5 port to PostgreSQL - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Proposal to add a QNX 6.5 port to PostgreSQL
Date
Msg-id 10466.1406675178@sss.pgh.pa.us
In response to Re: Proposal to add a QNX 6.5 port to PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal to add a QNX 6.5 port to PostgreSQL
Re: Proposal to add a QNX 6.5 port to PostgreSQL
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This isn't really acceptable for production usage; if it were, we'd have
>> done it already.  The POSIX APIs lack any way to tell how many processes
>> are attached to a shmem segment, which is *necessary* functionality for
>> us (it's a critical part of the interlock against starting multiple
>> postmasters in one data directory).

> I think it would be good to spend some energy figuring out what to do
> about this.

Well, we've been around on this multiple times before, but if we have
any new ideas, sure ...
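
For anyone who hasn't followed the earlier rounds, the facility at issue is
the shm_nattch field that SysV shmctl(IPC_STAT) reports.  A minimal sketch
follows, not the actual postmaster code; segment_attach_count() and
nattch_demo() are names made up for illustration.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Return the number of processes currently attached to a SysV segment,
 * or -1 on error.  The POSIX shm_open() family offers no equivalent.
 */
long
segment_attach_count(int shmid)
{
    struct shmid_ds buf;

    if (shmctl(shmid, IPC_STAT, &buf) < 0)
        return -1;
    return (long) buf.shm_nattch;
}

/*
 * Hypothetical demo: create a private segment and record the attach
 * count before and after attaching to it.  Returns 0 on success.
 */
int
nattch_demo(long *before, long *after)
{
    int   shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *addr;

    if (shmid < 0)
        return -1;
    *before = segment_attach_count(shmid);

    addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1)
    {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *after = segment_attach_count(shmid);

    /* Clean up: detach, then mark the segment for removal. */
    shmdt(addr);
    shmctl(shmid, IPC_RMID, NULL);
    return 0;
}
```

A nonzero count on a segment left over from a previous postmaster is what
tells us that processes from that postmaster may still be alive.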

> In our last discussion on this topic, we talked about using file locks
> as a substitute for nattch.  You concluded that fcntl was totally
> broken for this purpose because of the possibility of some other piece
> of code accidentally opening and closing the lock file.[2]  lockf
> appears to have the same problem, but flock might not, at least on
> some systems.
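
To make the fcntl hazard concrete: a process's fcntl locks on a file are
dropped when it closes *any* descriptor for that file, not just the one the
lock was taken through.  The following is a rough sketch of that behavior;
the function names and the temp-file path are invented for the example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take an exclusive fcntl lock on the whole file; 0 on success. */
static int
try_fcntl_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* zero length means "to end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Fork a child that tries to lock the file.  Returns 1 if the child got
 * the lock (nobody else holds it), 0 if it was refused, -1 on error.
 */
static int
child_can_lock(const char *path)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        int fd = open(path, O_RDWR);

        if (fd < 0)
            _exit(2);
        _exit(try_fcntl_lock(fd) == 0 ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return (WEXITSTATUS(status) == 2) ? -1 : WEXITSTATUS(status);
}

/* Reports whether our lock is held before and after a stray open/close. */
int
fcntl_lock_lost_demo(int *held_before, int *held_after)
{
    char path[] = "/tmp/fcntl_demo_XXXXXX";
    int  fd = mkstemp(path);
    int  other;

    if (fd < 0)
        return -1;
    if (try_fcntl_lock(fd) < 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    *held_before = (child_can_lock(path) == 0);

    /* Some unrelated code opens and closes the same file ... */
    other = open(path, O_RDONLY);
    if (other >= 0)
        close(other);           /* ... and our lock silently goes away */

    *held_after = (child_can_lock(path) == 0);
    close(fd);
    unlink(path);
    return 0;
}
```

Note that the original descriptor is still open when the lock disappears,
which is why a stray library call touching the lock file is enough to break
the interlock.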

My Linux man page for flock says
      flock() does not lock files over NFS.  Use fcntl(2) instead: that
      does work over NFS, given a sufficiently recent version of Linux
      and a server which supports locking.

which seems like a showstopper problem; we might try to tell people not to
put their databases on NFS, but they're not gonna listen.  It also says
      flock() and fcntl(2) locks have different semantics with respect to
      forked processes and dup(2).  On systems that implement flock() using
      fcntl(2), the semantics of flock() will be different from those
      described in this manual page.

which is pretty scary if it's accurate for any still-extant platforms;
we might think we're using flock and still get fcntl behavior.  It's
also of concern that (AFAICS) flock is not in POSIX, which means we
can't even expect that platforms will agree on how it *should* behave.

I also noted that flock does not support atomic downgrade of exclusive
lock to shared lock, which seems like a problem for the lock inheritance
scheme sketched in
http://www.postgresql.org/message-id/18162.1340761845@sss.pgh.pa.us
... but OTOH, it sounds like flock locks are not only inherited through
fork() but even preserved across exec(), which would mean that we don't
need that scheme for file lock inheritance, even with EXEC_BACKEND.
Still, it's not clear to me how we could put much faith in flock.
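
As an illustration of the fork() inheritance point, here is a small sketch
of what I'd expect on Linux with a native flock (other platforms may well
behave differently, which is the worry).  The function name and temp-file
path are made up for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Take an flock in the parent, fork, then close the parent's descriptor.
 * The child's inherited descriptor should keep the lock alive until the
 * child exits.  Returns 0 on success.
 */
int
flock_inherit_demo(int *blocked_while_child_alive, int *free_after_child_exit)
{
    char  path[] = "/tmp/flock_demo_XXXXXX";
    int   fd = mkstemp(path);
    int   pipefd[2];
    int   probe;
    pid_t pid;
    char  c;

    if (fd < 0)
        return -1;
    if (pipe(pipefd) < 0 || flock(fd, LOCK_EX) < 0 || (pid = fork()) < 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: hold the inherited descriptor until the pipe closes. */
        close(pipefd[1]);
        if (read(pipefd[0], &c, 1) < 0)
            _exit(1);
        _exit(0);
    }

    close(pipefd[0]);
    close(fd);                  /* parent drops its copy of the descriptor */

    /* A fresh open is a separate open file description, so it conflicts. */
    probe = open(path, O_RDWR);
    if (probe < 0)
    {
        close(pipefd[1]);
        waitpid(pid, NULL, 0);
        unlink(path);
        return -1;
    }
    *blocked_while_child_alive =
        (flock(probe, LOCK_EX | LOCK_NB) < 0 && errno == EWOULDBLOCK);

    close(pipefd[1]);           /* lets the child's read() return */
    waitpid(pid, NULL, 0);

    *free_after_child_exit = (flock(probe, LOCK_EX | LOCK_NB) == 0);
    close(probe);
    unlink(path);
    return 0;
}
```

On a platform that emulates flock with fcntl, the first probe would
presumably succeed instead, since fcntl locks are not inherited by the child.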

> Finally, how about named pipes? Linux says that trying to open a
> named pipe for write when there are no readers will return ENXIO, and
> attempting to write to an already-open pipe with no remaining readers
> will cause SIGPIPE.  So: create a permanent named pipe in the data
> directory that all PostgreSQL processes keep open.  When the
> postmaster starts, it opens the pipe for read, then for write, then
> closes it for read.  It then tries to write to the pipe.  If this
> fails to result in SIGPIPE, then somebody else has got the thing open;
> so the new postmaster should die at once.  But if it does get a SIGPIPE
> then there are as of that moment no other readers.

Hm.  That particular protocol is broken: two postmasters doing it at the
same time would both pass (because neither has it open for read at the
instant where they try to write).  But we could possibly frob the idea
until it works.  Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?
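
For reference, the reader-detection primitive that such a scheme would rest
on can be probed as below.  This is only a sketch of the ENXIO behavior (which
requires O_NONBLOCK on the write-side open), not a fix for the race described
above; the function name and the FIFO path are assumptions for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a FIFO and check whether a nonblocking open-for-write fails with
 * ENXIO when there is no reader, and succeeds once a reader exists.
 * Returns 0 on success.
 */
int
fifo_reader_probe_demo(int *enxio_without_reader, int *opens_with_reader)
{
    char dir[] = "/tmp/fifo_demo_XXXXXX";
    char path[64];
    int  rfd;
    int  wfd;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof(path), "%s/interlock", dir);
    if (mkfifo(path, 0600) < 0)
    {
        rmdir(dir);
        return -1;
    }

    /* No reader yet: a nonblocking write-side open should fail. */
    wfd = open(path, O_WRONLY | O_NONBLOCK);
    *enxio_without_reader = (wfd < 0 && errno == ENXIO);
    if (wfd >= 0)
        close(wfd);

    /* With a reader in place, the same open should succeed. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    wfd = open(path, O_WRONLY | O_NONBLOCK);
    *opens_with_reader = (rfd >= 0 && wfd >= 0);

    if (wfd >= 0)
        close(wfd);
    if (rfd >= 0)
        close(rfd);
    unlink(path);
    rmdir(dir);
    return 0;
}
```

Whether this holds on every platform we care about, and on NFS-mounted data
directories, is exactly the open question.
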
        regards, tom lane


