Thread: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Larry Rosenman

Date:

25 November 2000, 17:40:54

* Tom Lane <tgl@sss.pgh.pa.us> [001125 16:37]:
> "Joel Burton" <jburton@scw.org> writes:
> 
> This story does indicate that we need a less fragile interlock against
> starting two postmasters on one database.  I have to admit that it
> hadn't occurred to me that you could break the port-number interlock
> so easily as that :-(.  But obviously you can, so we need a different
> way of representing the interlock.  Hackers, any thoughts?
how about a .pid/.port/.???  file in the /data directory, and a lock on that? 


-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Tom Lane

Date:

25 November 2000, 18:10:56

Larry Rosenman <ler@lerctr.org> writes:
> * Tom Lane <tgl@sss.pgh.pa.us> [001125 16:37]:
>> This story does indicate that we need a less fragile interlock against
>> starting two postmasters on one database.  I have to admit that it
>> hadn't occurred to me that you could break the port-number interlock
>> so easily as that :-(.  But obviously you can, so we need a different
>> way of representing the interlock.  Hackers, any thoughts?

> how about a .pid/.port/.???  file in the /data directory, and a lock on that?

Nope, 'cause it wouldn't protect you against two postmasters in
different data directories trying to use the same port number.
The port-number lock has to use a system-wide mechanism.

You may want to go back and review the previous threads that have
discussed interlock issues.  We have really three independent resources
that we have to ensure only one postmaster is using at a time:

1. Port number (for Unix socket, IP address, etc)

2. Data directory (database files)

3. Shared memory.

Up to now shared memory has been protected more or less implicitly
by the port-number lock, since the shared memory IPC key is derived
from the port number.  However, the "virtual host" patch that we
recently accepted (way prematurely IMHO) breaks that correspondence.
I suspect that we really ought to try to have an independent interlock
on the shared memory block itself.  There was a thread around 4/30/00
concerning changing the way that shmem IPC keys are generated, and
maybe that would address this concern.

If we weren't relying on port number to protect shared memory, I think
the existing interlocks on port would be sufficient.  The kernel
enforces an interlock on listening to the same IP address, so that's
OK, and an advisory lock on the socket file is OK for preventing two
postmasters from listening to the same socket file.  (There's no real
reason to prevent postmasters from using similarly-numbered
socket files in different directories, other than the shmem key issue,
so a lock on the socket file is really just what we want for that
specific resource.)

There is a related issue on my todo list, though --- didn't we find out
awhile back that some older Linux kernels crash and burn if one attempts
to get an advisory lock on a socket file?  (See thread 7/6/00)  Were we
going to fix that, and if so how?  Or will we just tell people that they
have to update their kernel to run Postgres?  The current configure
script "works around" this by disabling the advisory lock on *all*
versions of Linux, which I regard as a completely unacceptable
solution...
        regards, tom lane

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Peter Eisentraut

Date:

25 November 2000, 19:22:43

Tom Lane writes:

> There is a related issue on my todo list, though --- didn't we find out
> awhile back that some older Linux kernels crash and burn if one attempts
> to get an advisory lock on a socket file?  (See thread 7/6/00)  Were we
> going to fix that, and if so how?  Or will we just tell people that they
> have to update their kernel to run Postgres?  The current configure
> script "works around" this by disabling the advisory lock on *all*
> versions of Linux, which I regard as a completely unacceptable
> solution...

Firstly, AFAIK there's no official production kernel that fixes this.  
When and if it gets fixed we can change that logic.

I have a simple test program that exhibits the problem (taken from the
kernel mailing list), but

a) You shouldn't run test programs in configure.

b) You really shouldn't run test programs in configure that set up networking connections.

c) You definitely shouldn't run test programs in configure that provoke kernel exceptions.

We could use flock() on Linux, though.
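[Editorial sketch, not part of the original message.] The flock() suggestion can be illustrated roughly as follows. This is a minimal sketch, assuming a Unix-like system and a regular lock file rather than the socket file itself; the path name and the helper try_lock are hypothetical, and nothing here shows whether flock() avoids the kernel problem discussed above.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking flock() on path.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


# Hypothetical lock path standing in for a per-port lock file.
lock_path = os.path.join(tempfile.mkdtemp(), "s.PGSQL.port.lock")

first = try_lock(lock_path)    # the first "postmaster" gets the lock
second = try_lock(lock_path)   # a second attempt is refused

os.close(first)                # closing the descriptor releases the lock
third = try_lock(lock_path)    # now the lock can be taken again
os.close(third)
```

The lock disappears automatically when the holding process exits, which is the property that makes an advisory lock more robust than the mere existence of a file.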

Maybe we could name the socket file .s.PGSQL.port.pid and make
.s.PGSQL.port a symlink.  Then you can find out whether the postmaster
that created the file is still running.  (You could even put the actual
socket file into the data directory, although that would require
re-thinking the file permissions on the latter.)

Actually, this turns out to be similar to what you wrote in
http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html

But we really should be fixing the IPC interlock with IPC_EXCL, but the
code changes look to be non-trivial.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Tom Lane

Date:

25 November 2000, 19:42:05

Peter Eisentraut <peter_e@gmx.net> writes:
> Maybe we could name the socket file .s.PGSQL.port.pid and make
> .s.PGSQL.port a symlink.  Then you can find out whether the postmaster
> that created the file is still running.

Or just create a lockfile /tmp/.s.PGSQL.port#.lock, ie, same name as
socket file with ".lock" added (containing postmaster's PID).  Then we
could share code with the data-directory-lockfile case.
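[Editorial sketch, not part of the original message.] The lockfile-with-PID idea might look something like this. It is only an illustration under stated assumptions: the helper names pid_is_alive and acquire_lockfile are hypothetical, the port number in the path is made up, and the sketch ignores races between checking and removing a stale file, which a real implementation would have to handle.

```python
import os
import tempfile


def pid_is_alive(pid):
    """Return True if a process with this PID appears to exist."""
    try:
        os.kill(pid, 0)          # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True              # exists, but owned by another user
    return True


def acquire_lockfile(path):
    """Create path exclusively and write our PID into it.

    Returns True if we now own the lock, False if a live process holds it.
    A stale file (its PID no longer exists) is removed and retried once.
    """
    for _ in range(2):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            with open(path) as f:
                text = f.read().strip()
            if text.isdigit() and pid_is_alive(int(text)):
                return False
            os.unlink(path)      # stale lockfile: its owner is gone
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return True
    return False


# Hypothetical lockfile name: socket file name with ".lock" added.
lock = os.path.join(tempfile.mkdtemp(), ".s.PGSQL.5432.lock")
got_first = acquire_lockfile(lock)     # we create the file
got_second = acquire_lockfile(lock)    # refused: the recorded PID is alive
```

Unlike a kernel advisory lock, this scheme survives a crash of the owner, so the liveness check on the recorded PID is what keeps a leftover file from blocking the next start.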

> Actually, this turns out to be similar to what you wrote in
> http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html

Well, we've talked before about moving the socket files to someplace
safer than /tmp.  The problem is to find another place that's not
platform-dependent --- else you've got a major configuration headache.

> But we really should be fixing the IPC interlock with IPC_EXCL, but the
> code changes look to be non-trivial.

AFAIR the previous thread, it wasn't that bad, it was just a matter of
someone taking the time to do it.  Maybe I'll have a go at it...
        regards, tom lane

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Marko Kreen

Date:

27 November 2000, 08:11:48

On Sat, Nov 25, 2000 at 07:41:52PM -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Actually, this turns out to be similar to what you wrote in
> > http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html
> 
> Well, we've talked before about moving the socket files to someplace
> safer than /tmp.  The problem is to find another place that's not
> platform-dependent --- else you've got a major configuration headache.

Could this be described in e.g. /etc/postgresql/pg_client.conf?
a la the dbname idea?

I can't remember the exact terminology, but there is a
configuration file for clients, set at compile time, where the
connection params for clients are set.

---------

[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz

--------
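[Editorial sketch, not part of the original message.] A file in the format proposed above happens to be readable by an ordinary INI parser. The following is a minimal sketch, assuming Python's standard configparser; the function name connection_params is hypothetical, and no such file format exists in libpq as far as this thread shows.

```python
import configparser

# The sample file from the message, embedded as a string.
SAMPLE = """\
[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz
"""


def connection_params(text, section):
    """Return the connection parameters of one named section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[section])


foo = connection_params(SAMPLE, "db_foo")   # TCP connection parameters
baz = connection_params(SAMPLE, "db_baz")   # Unix-socket parameters
```

All values come back as strings, so a real client would still have to validate them, for example converting the port to an integer.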

Also it should be possible to give another configuration file
with env vars or command-line parameters.

Well, just an idea.

-- 
marko

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Tom Lane

Date:

27 November 2000, 11:05:49

Marko Kreen <marko@l-t.ee> writes:
>> Well, we've talked before about moving the socket files to someplace
>> safer than /tmp.  The problem is to find another place that's not
>> platform-dependent --- else you've got a major configuration headache.

> Could this be described in e.g. /etc/postgresql/pg_client.conf?

The major objection to that is that if we rely on such a config file,
then you *cannot* install postgres without root permission (to make
the config file).  Currently it's possible to fire up a test postmaster
without any special privileges whatever, and that's a nice feature.

A related objection is that such a file will itself become a source of
contention among multiple postmasters.  Suppose I'm setting up a test
installation of a new version, while still running the prior release
as my main database.  OK, I fire up the test postmaster on a different
port, and now I want to launch some of my usual clients for testing.
Oops, they connect to the old postmaster because that's what it says
to do in /etc/postgresql/pg_client.conf.  I can't get them to connect
to the new postmaster unless I change /etc/postgresql/pg_client.conf,
which I *don't* want to do at this stage --- it'll break non-test
instances of these same clients.

I see some value in the pg_client.conf idea as a *per user* address
book, to shortcut full specification of all the databases that user
might want to connect to.  As a system-wide configuration file, I think
it's a terrible idea.
        regards, tom lane

Re: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

From

Marko Kreen

Date:

27 November 2000, 12:14:45

On Mon, Nov 27, 2000 at 11:05:40AM -0500, Tom Lane wrote:
> Marko Kreen <marko@l-t.ee> writes:
> >> Well, we've talked before about moving the socket files to someplace
> >> safer than /tmp.  The problem is to find another place that's not
> >> platform-dependent --- else you've got a major configuration headache.
> 
> > Could this be described in e.g. /etc/postgresql/pg_client.conf?
> 
> The major objection to that is that if we rely on such a config file,
> then you *cannot* install postgres without root permission (to make
> the config file).  Currently it's possible to fire up a test postmaster
> without any special privileges whatever, and that's a nice feature.

I do not see this as much of a problem, though.

[ I use the words XCONFIG and XNAME because I have no good idea
what they should be called. ]

server startup precedence:

1) postmaster --xconfig ./foo.cfg
2) PG_XCONFIG=./foo.cfg
3) /etc/postgresql/pg_xconfig (compile time spec)

there is also a thing 'xname' which is the section of config
file to use:

1) --xname foodb
2) PG_XNAME=foodb
3) default_xname specified in config.

so, client (libpq (psql)) startup:

1) psql --xconfig ./xxx
2) PG_XCONFIG=./xxx
3) ~/.pg_xconfig
4) /etc/postgresql/pg_xconfig

and xname as in server.
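
[Editorial sketch, not part of the original message.] The client-side precedence listed above could be resolved roughly like this. The names XCONFIG and XNAME are the placeholders from the message; the functions resolve_xconfig and resolve_xname are hypothetical, and the rule that ~/.pg_xconfig is used only when it exists is an assumption the message does not state.

```python
import os


def resolve_xconfig(cli_value, env, home,
                    system_path="/etc/postgresql/pg_xconfig",
                    exists=os.path.exists):
    """Pick the client config file using the precedence listed above."""
    if cli_value:                        # 1) --xconfig on the command line
        return cli_value
    if env.get("PG_XCONFIG"):            # 2) PG_XCONFIG environment variable
        return env["PG_XCONFIG"]
    per_user = os.path.join(home, ".pg_xconfig")
    if exists(per_user):                 # 3) per-user file (assumed: if present)
        return per_user
    return system_path                   # 4) compile-time system default


def resolve_xname(cli_value, env, default_xname):
    """Pick the config section: --xname, then PG_XNAME, then the default."""
    return cli_value or env.get("PG_XNAME") or default_xname


# Example: no command-line flag, but the environment variable is set.
chosen = resolve_xconfig(None, {"PG_XCONFIG": "./xxx"}, "/home/someuser")
```

Passing the environment and the existence check in as parameters keeps the lookup order easy to test without touching the real filesystem.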


It may be better if the server config is in a separate file because we
may want to give more options to the server (ipc keys, data dirs,
etc).  But I guess it's simpler when they read the same file and the
client simply ignores server directives.  And the server ignores
remote servers.

Also it should be possible to put all directives on the command
line too, for both client and server.

> 
> A related objection is that such a file will itself become a source of
> contention among multiple postmasters.  Suppose I'm setting up a test
> installation of a new version, while still running the prior release
> as my main database.  OK, I fire up the test postmaster on a different
> port, and now I want to launch some of my usual clients for testing.
> Oops, they connect to the old postmaster because that's what it says
> to do in /etc/postgresql/pg_client.conf.  I can't get them to connect
> to the new postmaster unless I change /etc/postgresql/pg_client.conf,
> which I *don't* want to do at this stage --- it'll break non-test
> instances of these same clients.

postmaster --xconfig ./test.cfg --xname testdb &
psql --xconfig ./test.cfg --xname testdb

> 
> I see some value in the pg_client.conf idea as a *per user* address
> book, to shortcut full specification of all the databases that user
> might want to connect to.  As a system-wide configuration file, I think
> it's a terrible idea.

So what do you think of the above idea?

-- 
marko