Thread: Re: [COMMITTERS] pgsql/doc/TODO.detail (alpha default distinct flock fsync function limit null pg_shadow primary)
Re: [COMMITTERS] pgsql/doc/TODO.detail (alpha default distinct flock fsync function limit null pg_shadow primary)
From: Peter Eisentraut
Tom Lane writes:

> but if one or both postmasters is started without -i then there's got
> to be some interlock on the Unix socket file.
>
> I don't much like depending on flock for that, since it isn't available
> everywhere. The only portable answer is to build a pid-containing
> interlock file for each socket file, as discussed in the TODO item.

But the flock code isn't used because the configure test for it is broken,
and has been broken ever since it was introduced AFAICT. It seems that we
have been relying on the mere existence of the socket file.

-- 
Peter Eisentraut      Sernanders väg 10:115
peter_e@gmx.net       75262 Uppsala
http://yi.org/peter-e/    Sweden
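For illustration only, the "pid-containing interlock file" idea quoted above might look roughly like the sketch below. It is written in Python for brevity and is not the PostgreSQL implementation; the function names and the lock-file location are hypothetical. The sketch creates the file with O_EXCL, stores the owner's pid in it, and treats the file as stale when no process with that pid exists.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid appears to exist."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_interlock(lock_path: str) -> bool:
    """Try to create lock_path holding our pid; return True on success."""
    for _ in range(2):  # at most one retry after clearing a stale file
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            try:
                with open(lock_path) as f:
                    owner = int(f.read().strip() or "0")
            except (FileNotFoundError, ValueError):
                owner = 0  # unreadable contents are treated as stale
            if pid_is_alive(owner):
                return False  # a live process still owns the interlock
            try:
                os.unlink(lock_path)  # stale file left by a dead owner
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return True
    return False
```

This sketch has known gaps: two starters can race between the staleness check and the unlink, a file whose pid has not been written yet looks stale, and a recycled pid makes a dead owner look alive. It also leaves open the question of where such files should live.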
Re: [COMMITTERS] pgsql/doc/TODO.detail (alpha default distinct flock fsync function limit null pg_shadow primary)
From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:

> Tom Lane writes:
>> but if one or both postmasters is started without -i then there's got
>> to be some interlock on the Unix socket file.
>>
>> I don't much like depending on flock for that, since it isn't available
>> everywhere. The only portable answer is to build a pid-containing
>> interlock file for each socket file, as discussed in the TODO item.

> But the flock code isn't used because the configure test for it is broken,
> and has been broken ever since it was introduced AFAICT. It seems that we
> have been relying on the mere existence of the socket file.

Oooh, no kidding? That explains why we're still hearing complaints
about the postmaster failing to start up when there's a socket file left
over from a previous run: the code that's supposed to delete an old
socket file is part of the #ifdef HAVE_FCNTL_SETLK path.

(Tries it out ... sure enough, it's broken ...)

The flock is not really needed to protect the port number; it's there to
prevent a second postmaster from deleting the socket file that belongs to
a still-active old postmaster. But if you have no delete logic at all,
you can't cope with a leftover socket file.

The flock code *did* work at one time; I recall testing it. Evidently
someone broke the configure test for it later on.

I think the shortest path to a solution is to fix the configure test,
unless you have the ambition to tackle setting up a set of lock files
for port numbers --- which'd require resolving such thorny questions
as where to keep the lock files. (/tmp is right out, IMHO.)

			regards, tom lane
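
As a rough sketch of the delete logic described above (remove a leftover socket file only when no live server still holds a lock), something like the following could work. It is Python rather than the backend's C, and it is not the actual postmaster code. The function name and the companion ".lock" file are assumptions made for this example, which uses flock via Python's fcntl module.

```python
import fcntl
import os


def claim_socket_path(sock_path: str):
    """Return an open lock file if we may use sock_path, else None.

    The caller must keep the returned object open for as long as it
    serves on sock_path; closing it releases the lock.
    """
    # Companion lock file next to the socket: an assumption of this sketch.
    lock_file = open(sock_path + ".lock", "a")
    try:
        # Non-blocking exclusive lock; fails at once if someone holds it.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()  # a live server still owns this socket path
        return None
    # We hold the lock, so any socket file found here is a leftover.
    try:
        os.unlink(sock_path)
    except FileNotFoundError:
        pass
    return lock_file
```

A second starter that gets None back should refuse to start and leave the socket file alone. One that gets the lock can delete the leftover and bind a fresh socket. Because the kernel drops the lock when the holder exits, a crashed server cannot leave a stale lock behind, although the approach still depends on flock being available on the platform.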