Thread: Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files
* Tom Lane <tgl@sss.pgh.pa.us> [001125 16:37]:
> "Joel Burton" <jburton@scw.org> writes:
>
> This story does indicate that we need a less fragile interlock against
> starting two postmasters on one database.  I have to admit that it
> hadn't occurred to me that you could break the port-number interlock
> so easily as that :-(.  But obviously you can, so we need a different
> way of representing the interlock.  Hackers, any thoughts?

how about a .pid/.port/.??? file in the /data directory, and a lock on that?

-- 
Larry Rosenman                      http://www.lerctr.org/~ler
Phone: +1 972-414-9812              E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Larry Rosenman <ler@lerctr.org> writes:
> * Tom Lane <tgl@sss.pgh.pa.us> [001125 16:37]:
>> This story does indicate that we need a less fragile interlock against
>> starting two postmasters on one database.  I have to admit that it
>> hadn't occurred to me that you could break the port-number interlock
>> so easily as that :-(.  But obviously you can, so we need a different
>> way of representing the interlock.  Hackers, any thoughts?

> how about a .pid/.port/.??? file in the /data directory, and a lock on that?

Nope, 'cause it wouldn't protect you against two postmasters in
different data directories trying to use the same port number.
The port-number lock has to use a system-wide mechanism.

You may want to go back and review the previous threads that have
discussed interlock issues.  We have really three independent
resources that we have to ensure only one postmaster is using
at a time:
	1. Port number (for Unix socket, IP address, etc)
	2. Data directory (database files)
	3. Shared memory.
Up to now shared memory has been protected more or less implicitly
by the port-number lock, since the shared memory IPC key is derived
from the port number.  However, the "virtual host" patch that we
recently accepted (way prematurely IMHO) breaks that correspondence.

I suspect that we really ought to try to have an independent interlock
on the shared memory block itself.  There was a thread around 4/30/00
concerning changing the way that shmem IPC keys are generated, and maybe
that would address this concern.

If we weren't relying on port number to protect shared memory, I think
the existing interlocks on port would be sufficient.  The kernel
enforces an interlock on listening to the same IP address, so that's OK,
and an advisory lock on the socket file is OK for preventing two
postmasters from listening to the same socket file.
(There's no real reason to prevent postmasters from using
similarly-numbered socket files in different directories, other than
the shmem key issue, so a lock on the socket file is really just what
we want for that specific resource.)

There is a related issue on my todo list, though --- didn't we find out
awhile back that some older Linux kernels crash and burn if one attempts
to get an advisory lock on a socket file?  (See thread 7/6/00)  Were we
going to fix that, and if so how?  Or will we just tell people that they
have to update their kernel to run Postgres?  The current configure
script "works around" this by disabling the advisory lock on *all*
versions of Linux, which I regard as a completely unacceptable
solution...

			regards, tom lane

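The kernel-enforced interlock on a listening TCP address that Tom mentions can be observed directly. The following is only a minimal Python sketch, not anything from the PostgreSQL source: the loopback address, the OS-assigned port, and the helper name try_listen are assumptions made for the illustration.

```python
import errno
import socket


def try_listen(host, port):
    """Bind and listen on (host, port).

    Returns the listening socket, or None if the kernel reports that the
    address is already in use by another listener.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise
    return sock


# First listener: port 0 asks the kernel to pick any free port.
first = try_listen("127.0.0.1", 0)
port = first.getsockname()[1]

# A second listener asking for the same address and port is refused.
second = try_listen("127.0.0.1", port)
print("second listener refused:", second is None)

first.close()
```

Note that this interlock covers only the TCP address. It says nothing about the Unix socket file, the data directory, or shared memory, which is why the thread treats those as separate resources.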
Tom Lane writes:

> There is a related issue on my todo list, though --- didn't we find out
> awhile back that some older Linux kernels crash and burn if one attempts
> to get an advisory lock on a socket file?  (See thread 7/6/00)  Were we
> going to fix that, and if so how?  Or will we just tell people that they
> have to update their kernel to run Postgres?  The current configure
> script "works around" this by disabling the advisory lock on *all*
> versions of Linux, which I regard as a completely unacceptable
> solution...

Firstly, AFAIK there's no official production kernel that fixes this.
When and if it gets fixed we can change that logic.

I have a simple test program that exhibits the problem (taken from the
kernel mailing list), but

a) You shouldn't run test programs in configure.

b) You really shouldn't run test programs in configure that set up
   networking connections.

c) You definitely shouldn't run test programs in configure that provoke
   kernel exceptions.

We could use flock() on Linux, though.

Maybe we could name the socket file .s.PGSQL.port.pid and make
.s.PGSQL.port a symlink.  Then you can find out whether the postmaster
that created the file is still running.  (You could even put the actual
socket file into the data directory, although that would require
re-thinking the file permissions on the latter.)

Actually, this turns out to be similar to what you wrote in
http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html

We really should be fixing the IPC interlock with IPC_EXCL, but the
code changes look to be non-trivial.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/

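For readers unfamiliar with flock(), here is a minimal Python sketch of an exclusive, non-blocking advisory lock on a Unix-like system. It locks an ordinary file rather than the socket file itself, and the file name postmaster.flock and the helper acquire_flock are hypothetical, chosen only for this illustration.

```python
import fcntl
import os
import tempfile


def acquire_flock(path):
    """Open path and take an exclusive, non-blocking flock() on it.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


with tempfile.TemporaryDirectory() as tmpdir:
    lock_path = os.path.join(tmpdir, "postmaster.flock")  # hypothetical name

    holder = acquire_flock(lock_path)      # first claimant gets the lock
    contender = acquire_flock(lock_path)   # second claimant is turned away
    print("first got lock:", holder is not None)
    print("second got lock:", contender is not None)

    os.close(holder)                       # closing the descriptor unlocks
    reacquired = acquire_flock(lock_path)
    print("lock free again:", reacquired is not None)
    os.close(reacquired)
```

The useful property for this discussion is that the lock disappears when the holding process exits, however it exits, so no stale lock survives a crash.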
Peter Eisentraut <peter_e@gmx.net> writes:
> Maybe we could name the socket file .s.PGSQL.port.pid and make
> .s.PGSQL.port a symlink.  Then you can find out whether the postmaster
> that created the file is still running.

Or just create a lockfile /tmp/.s.PGSQL.port#.lock, ie, same name as
socket file with ".lock" added (containing postmaster's PID).  Then we
could share code with the data-directory-lockfile case.

> Actually, this turns out to be similar to what you wrote in
> http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html

Well, we've talked before about moving the socket files to someplace
safer than /tmp.  The problem is to find another place that's not
platform-dependent --- else you've got a major configuration headache.

> But we really should be fixing the IPC interlock with IPC_EXCL, but the
> code changes look to be non-trivial.

AFAIR the previous thread, it wasn't that bad, it was just a matter of
someone taking the time to do it.  Maybe I'll have a go at it...

			regards, tom lane

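The PID-lockfile idea can be sketched as follows. This is a minimal Python illustration for Unix-like systems, not the PostgreSQL implementation: the helper names, the retry count, and the port number 5432 in the file name are assumptions, and the stale-file removal still has a race between two simultaneous starters that a real implementation would have to address.

```python
import os
import subprocess
import sys
import tempfile


def pid_is_alive(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lockfile(path, max_tries=3):
    """Create path exclusively and record our PID in it.

    Returns True on success, False if a live process owns the lockfile.
    """
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip())
            except (FileNotFoundError, ValueError):
                continue  # file vanished or is half-written; try again
            if pid_is_alive(owner):
                return False
            try:
                os.unlink(path)  # stale lockfile left by a dead process
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return True
    return False


with tempfile.TemporaryDirectory() as tmpdir:
    lock_path = os.path.join(tmpdir, ".s.PGSQL.5432.lock")  # example port

    first = acquire_lockfile(lock_path)    # creates the file
    second = acquire_lockfile(lock_path)   # refused: the owner (us) is alive

    # Simulate a crashed owner: record the PID of a child that has exited.
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    with open(lock_path, "w") as f:
        f.write(f"{child.pid}\n")
    third = acquire_lockfile(lock_path)    # stale file is replaced

    print(first, second, third)
```

Because the file holds a PID rather than a kernel lock, the same routine would serve both the socket-file case and the data-directory case, which is the code sharing Tom has in mind.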
On Sat, Nov 25, 2000 at 07:41:52PM -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Actually, this turns out to be similar to what you wrote in
> > http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html
>
> Well, we've talked before about moving the socket files to someplace
> safer than /tmp.  The problem is to find another place that's not
> platform-dependent --- else you've got a major configuration headache.

Could this be described in e.g. /etc/postgresql/pg_client.conf?
a la the dbname idea?  I can't remember the exact terminology, but
there is a configuration file for clients, set at compile time,
where the connection params for clients are set.

---------

[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz

--------

Also it should be possible to give another configuration
file with env vars or command-line parameters.

Well, just an idea.

-- 
marko

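The proposed file is ordinary INI syntax, so reading one named entry takes very little code. A minimal Python sketch using the standard configparser module follows; the helper name connection_params is hypothetical, and nothing here implies that libpq reads such a file.

```python
import configparser

# The sample entries from the message above.
SAMPLE = """\
[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz
"""


def connection_params(config_text, name):
    """Return the connection parameters of one named section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(name):
        raise KeyError(f"no such entry: {name}")
    return dict(parser[name])


foo = connection_params(SAMPLE, "db_foo")
baz = connection_params(SAMPLE, "db_baz")
print(foo)
print(baz)
```

All values come back as strings, so a client would still have to convert port to an integer and decide what to do with keys it does not recognise.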
Marko Kreen <marko@l-t.ee> writes:
>> Well, we've talked before about moving the socket files to someplace
>> safer than /tmp.  The problem is to find another place that's not
>> platform-dependent --- else you've got a major configuration headache.

> Could this be described in e.g. /etc/postgresql/pg_client.conf?

The major objection to that is that if we rely on such a config file,
then you *cannot* install postgres without root permission (to make
the config file).  Currently it's possible to fire up a test postmaster
without any special privileges whatever, and that's a nice feature.

A related objection is that such a file will itself become a source of
contention among multiple postmasters.  Suppose I'm setting up a test
installation of a new version, while still running the prior release
as my main database.  OK, I fire up the test postmaster on a different
port, and now I want to launch some of my usual clients for testing.
Oops, they connect to the old postmaster because that's what it says
to do in /etc/postgresql/pg_client.conf.  I can't get them to connect
to the new postmaster unless I change /etc/postgresql/pg_client.conf,
which I *don't* want to do at this stage --- it'll break non-test
instances of these same clients.

I see some value in the pg_client.conf idea as a *per user* address
book, to shortcut full specification of all the databases that user
might want to connect to.  As a system-wide configuration file, I think
it's a terrible idea.

			regards, tom lane

On Mon, Nov 27, 2000 at 11:05:40AM -0500, Tom Lane wrote:
> Marko Kreen <marko@l-t.ee> writes:
> >> Well, we've talked before about moving the socket files to someplace
> >> safer than /tmp.  The problem is to find another place that's not
> >> platform-dependent --- else you've got a major configuration headache.
>
> > Could this be described in e.g. /etc/postgresql/pg_client.conf?
>
> The major objection to that is that if we rely on such a config file,
> then you *cannot* install postgres without root permission (to make
> the config file).  Currently it's possible to fire up a test postmaster
> without any special privileges whatever, and that's a nice feature.

I do not see this as much of a problem, though.

[ I use the words XCONFIG and XNAME because I have no good
idea what they should be called. ]

Server startup precedence:

1) postmaster --xconfig ./foo.cfg
2) PG_XCONFIG=./foo.cfg
3) /etc/postgresql/pg_xconfig (compile time spec)

There is also a thing 'xname' which is the section of the config
file to use:

1) --xname foodb
2) PG_XNAME=foodb
3) default_xname specified in config.

So, client (libpq (psql)) startup:

1) psql --xconfig ./xxx
2) PG_XCONFIG=./xxx
3) ~/.pg_xconfig
4) /etc/postgresql/pg_xconfig

and xname as in server.

It may be better if the server config is in a separate file, because
we may want to give more options to the server (ipc keys, data dirs,
etc).  But I guess it's simpler when they read the same file and the
client simply ignores server directives.  And the server ignores
remote servers.

Also it should be possible to put all directives on the command
line too, for both client and server.

> A related objection is that such a file will itself become a source of
> contention among multiple postmasters.  Suppose I'm setting up a test
> installation of a new version, while still running the prior release
> as my main database.  OK, I fire up the test postmaster on a different
> port, and now I want to launch some of my usual clients for testing.
> Oops, they connect to the old postmaster because that's what it says
> to do in /etc/postgresql/pg_client.conf.  I can't get them to connect
> to the new postmaster unless I change /etc/postgresql/pg_client.conf,
> which I *don't* want to do at this stage --- it'll break non-test
> instances of these same clients.

	postmaster --xconfig ./test.cfg --xname testdb &
	psql --xconfig ./test.cfg --xname testdb

> I see some value in the pg_client.conf idea as a *per user* address
> book, to shortcut full specification of all the databases that user
> might want to connect to.  As a system-wide configuration file, I think
> it's a terrible idea.

So what do you think of the above idea?

-- 
marko
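
The lookup order Marko proposes for the client can be written down as a small resolver. The following Python sketch is only an illustration of that precedence: --xconfig, --xname, PG_XCONFIG, PG_XNAME and the file paths are the placeholder names from the message, while the function names and the injectable exists and env parameters are assumptions added to keep the example self-contained.

```python
import os


def resolve_xconfig(cli_value=None, env=None, home=None,
                    system_path="/etc/postgresql/pg_xconfig",
                    exists=os.path.exists):
    """Pick the client config file.

    Order: --xconfig on the command line, then PG_XCONFIG, then
    ~/.pg_xconfig if it exists, then the compiled-in system path.
    """
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    if env.get("PG_XCONFIG"):
        return env["PG_XCONFIG"]
    if home is None:
        home = os.path.expanduser("~")
    per_user = os.path.join(home, ".pg_xconfig")
    if exists(per_user):
        return per_user
    return system_path


def resolve_xname(cli_value=None, env=None, default_xname=None):
    """Pick the config section: --xname, then PG_XNAME, then default_xname."""
    env = os.environ if env is None else env
    return cli_value or env.get("PG_XNAME") or default_xname


# The test-installation case: explicit options override everything else.
test_cfg = resolve_xconfig("./test.cfg", env={"PG_XCONFIG": "./xxx"})
test_name = resolve_xname("testdb", env={"PG_XNAME": "foodb"})

# No command-line option: the environment variable is used.
env_cfg = resolve_xconfig(env={"PG_XCONFIG": "./xxx"})

# Neither given and no per-user file: fall back to the system file.
sys_cfg = resolve_xconfig(env={}, home="/home/u", exists=lambda p: False)

print(test_cfg, test_name, env_cfg, sys_cfg)
```

Under this scheme the system-wide file is only the last resort, which is how the proposal tries to answer the objection that a test postmaster's clients would be forced onto the production settings.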