Re: Proposal to add a QNX 6.5 port to PostgreSQL - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Proposal to add a QNX 6.5 port to PostgreSQL |
Date | |
Msg-id | CA+TgmobA_Cs2PZJs1PjTEQG_+Ns1kQ1XwRmvBsgqp3iSWZR_+g@mail.gmail.com |
In response to | Re: Proposal to add a QNX 6.5 port to PostgreSQL (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Proposal to add a QNX 6.5 port to PostgreSQL |
List | pgsql-hackers |
On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)
>
> This isn't really acceptable for production usage; if it were, we'd have
> done it already. The POSIX APIs lack any way to tell how many processes
> are attached to a shmem segment, which is *necessary* functionality for
> us (it's a critical part of the interlock against starting multiple
> postmasters in one data directory).

I think it would be good to spend some energy figuring out what to do about this. The Linux developers, for reasons I have not been able to understand, appear to hate System V shared memory, and rumors have circulated here that they would like to get rid of it altogether. And quite apart from that, even using a few bytes of System V shared memory is apparently inconvenient for people who run many copies of PostgreSQL on the same machine or who run in environments where it's not available, such as FreeBSD jails for which it hasn't been specifically enabled.[1]

Now, in fairness, all of the alternative systems have their own share of problems. POSIX shared memory isn't available everywhere, and the anonymous mmap we're now using doesn't work in EXEC_BACKEND builds, can't be used for dynamic shared memory, and apparently performs poorly on BSD systems.[1]

In spite of that, I think that having an option to use POSIX shared memory would make a reasonable number of PostgreSQL users happier than they are today, and maybe even attract a few new ones.

In our last discussion on this topic, we talked about using file locks as a substitute for nattch.
You concluded that fcntl was totally broken for this purpose because of the possibility of some other piece of code accidentally opening and closing the lock file.[2] lockf appears to have the same problem, but flock might not, at least on some systems. The semantics as described in my copy of the Linux man pages are that a child created by fork() inherits a copy of the filehandle pointing to the same lock, and that the lock is released when either ANY process with a copy of that filehandle makes an explicit unlock request or ALL copies of the filehandle are closed. That seems like it'd be OK for our purposes, though the Linux guys seem to think the semantics might be different on other platforms, and note that it won't work over NFS.

Another thing that strikes me is that lsof works on just about every platform I've ever used, and it tells you who has got a certain file open. Of course it has to use different methods to do that on different platforms, but at least on Linux, /proc/self/fd/N is a symlink to the file you've got open, and shared memory segments are files in /dev/shm. So maybe at least on particular platforms where we care enough, we could install operating-system-specific code to provide an interlock using a mechanism of this type. Not sure if that will fly, but it's a thought.

Yet another idea is to somehow use POSIX semaphores, which are distinct from POSIX shared memory. semop() has a SEM_UNDO flag which causes whatever operation you perform to be reversed at process exit. So you could have each new postgres process increment the semaphore value in such a way that it would be decremented on exit, although I'm not sure how to avoid a race if the postmaster dies before a new child has a chance to increment the semaphore.

Finally, how about named pipes? Linux says that trying to open a named pipe for write when there are no readers will return ENXIO, and attempting to write to an already-open pipe with no remaining readers will cause SIGPIPE.
So: create a permanent named pipe in the data directory that all PostgreSQL processes keep open. When the postmaster starts, it opens the pipe for read, then for write, then closes it for read. It then tries to write to the pipe. If this fails to result in SIGPIPE, then somebody else has got the thing open, so the new postmaster should die at once. But if it does get a SIGPIPE, then there are as of that moment no other readers.

I'm not sure if any of this helps QNX or not, but maybe if we figure out which of these mechanisms (or others) might be acceptable we can cross-check that against what QNX supports.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] See comments on http://rhaas.blogspot.com/2012/06/absurd-shared-memory-limits.html
[2] http://www.postgresql.org/message-id/18958.1340764854@sss.pgh.pa.us
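
For illustration, the flock() idea discussed above might be sketched roughly as follows. This is only a sketch assuming the Linux semantics described in the message (the lock follows the open file description, so a forked child keeps it alive until every copy of the descriptor is closed). The function name acquire_datadir_lock and the notion of a dedicated lock file are hypothetical, not anything in the PostgreSQL tree, and as noted this would not be dependable over NFS.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take an exclusive, non-blocking flock() on
 * the file at 'path'.  Returns a descriptor that holds the lock, or -1 if
 * the lock is already held (errno == EWOULDBLOCK) or on any other error.
 *
 * The caller keeps the descriptor open for its whole lifetime; children
 * created with fork() inherit it, so the lock survives until the last
 * inheriting process has exited or closed its copy.
 */
int
acquire_datadir_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) != 0)
    {
        int saved_errno = errno;    /* close() may clobber errno */

        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}
```

A second caller, even in the same process, gets -1 for as long as any copy of the first descriptor remains open, which is the property that would stand in for checking nattch.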