Re: Proposal to add a QNX 6.5 port to PostgreSQL - Mailing list pgsql-hackers

From Baker, Keith [OCDUS Non-J&J]
Subject Re: Proposal to add a QNX 6.5 port to PostgreSQL
Date
Msg-id 25171C9D43848A4A9FFF65373179D8025AC100CF@ITSUSRAGMDGD05.jnj.com
In response to Re: Proposal to add a QNX 6.5 port to PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal to add a QNX 6.5 port to PostgreSQL
List pgsql-hackers
Robert and Tom,

I assume you guys are working on other priorities, so I did some locking experiments on QNX.

I know fcntl() locking has downsides, but I think it deserves a second look:
- it is POSIX, so it should be fairly consistent across platforms (at least more consistent than lockf() and flock())
- the "accidental" open/close lock release can be easily avoided (simply don't add new code which touches the new, unique lock file)
- I don't know if it will work on NFS, but that is not a priority for me (is that really a requirement for a QNX port?)
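
For readers unfamiliar with the "accidental" release mentioned above, here is a minimal sketch of it, assuming POSIX fcntl() semantics: closing any descriptor for a file drops every fcntl() lock the process holds on that file. The helper names (try_exclusive_lock, locked_as_seen_by_child, lock_survives_stray_close) are hypothetical and are not taken from the attached patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking exclusive fcntl() lock on the whole file. */
static int
try_exclusive_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Returns 1 if a forked child finds the file locked, 0 if not, -1 on error.
 * The probe runs in a child because a process never conflicts with its own
 * fcntl() locks.
 */
static int
locked_as_seen_by_child(const char *path)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        int fd = open(path, O_RDWR);

        if (fd < 0)
            _exit(2);
        _exit(try_exclusive_lock(fd) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) > 1)
        return -1;
    return WEXITSTATUS(status);
}

/*
 * Returns 1 if the lock is still held after an unrelated open()/close() of
 * the same file, 0 if it was silently released, -1 on error.
 */
static int
lock_survives_stray_close(const char *path)
{
    int lock_fd;
    int stray_fd;
    int result;

    assert(path != NULL);
    lock_fd = open(path, O_RDWR | O_CREAT, 0600);
    if (lock_fd < 0)
        return -1;
    if (try_exclusive_lock(lock_fd) != 0 || locked_as_seen_by_child(path) != 1)
    {
        close(lock_fd);
        return -1;
    }
    stray_fd = open(path, O_RDONLY);    /* "new code" touching the lock file */
    if (stray_fd < 0)
    {
        close(lock_fd);
        return -1;
    }
    close(stray_fd);                    /* drops the lock taken via lock_fd */
    result = locked_as_seen_by_child(path);
    close(lock_fd);
    return result;
}
```

On a platform with POSIX semantics, lock_survives_stray_close() is expected to report that the lock was lost, which is why nothing else in the server should open the lock file.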

Existing System V shared memory locking can be left in place for all existing platforms (so nothing lost), while fcntl()-style locking could be limited to platforms which lack System V shared memory (like QNX).

Experimental patch is attached, but logic is basically this:
a. postmaster obtains exclusive lock on data dir file "postmaster.fcntl" (or FATAL)
b. postmaster then downgrades to shared lock (or FATAL)
c. all other backend processes obtain shared lock on this file (or FATAL)
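
Steps a through c can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: the function names (lock_whole_file, postmaster_interlock, backend_interlock, second_postmaster_would_start) are hypothetical, a -1 return stands in for FATAL, and the downgrade in step b relies on fcntl() converting an existing lock when a different type is requested on the same region.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: set or convert a non-blocking fcntl() lock covering
 * the whole file.  type is F_WRLCK (exclusive) or F_RDLCK (shared).
 * Returns 0 on success, -1 on a conflicting lock or other error.
 */
static int
lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Steps a and b.  Returns a descriptor that must stay open for the life of
 * the process, or -1 (where the real code would report FATAL).
 */
static int
postmaster_interlock(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd, F_WRLCK) != 0 ||    /* a: exclusive lock */
        lock_whole_file(fd, F_RDLCK) != 0)      /* b: downgrade to shared */
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Step c: each backend takes its own shared lock (not inherited via fork). */
static int
backend_interlock(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd, F_RDLCK) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Simulate a second postmaster starting in a separate process.
 * Returns 1 if it would start, 0 if it is refused, -1 if fork/wait fails.
 */
static int
second_postmaster_would_start(const char *path)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(postmaster_interlock(path) >= 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Because any surviving holder of a shared lock blocks the exclusive lock in step a, an orphaned backend is enough to make second_postmaster_would_start() report a refusal.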

A quick test on QNX 6.5 appeared to behave well (orphan backends left behind after kill -9 of postmaster held their locks, thus database restart was prevented as desired).
Let me know if there are other test scenarios to consider.

Thanks!

-Keith Baker


> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Thursday, July 31, 2014 12:58 PM
> To: Tom Lane
> Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
> 
> On Wed, Jul 30, 2014 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > So it seems like we could possibly go this route, assuming we can
> > think of a variant of your proposal that's race-condition-free.  A
> > disadvantage compared to a true file lock is that it would not protect
> > against people trying to start postmasters from two different NFS
> > client machines --- but we don't have protection against that now.
> > (Maybe we could do this *and* do a regular file lock to offer some
> > protection against that case, even if it's not bulletproof?)
> 
> That's not a bad idea.  By the way, it also wouldn't be too hard to test at
> runtime whether or not flock() has first-close semantics.  Not that we'd want
> this exact design, but suppose you configure shmem_interlock=flock in
> postgresql.conf.  On startup, we test whether flock is reliable, determine
> that it is, and proceed accordingly.
> Now, you move your database onto an NFS volume and the semantics
> change (because, hey, breaking userspace assumptions is fun) and try to
> restart up your database, and it says FATAL: flock() is broken.
> Now you can either move the database back, or set shmem_interlock to
> some other value.
> 
> Now maybe, as you say, it's best to use multiple locking protocols and hope
> that at least one will catch whatever the dangerous situation is.
> I'm just trying to point out that we need not blindly assume the semantics we
> want are there (or that they are not); we can check.
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> Company

Attachment
