Re: Yet another failure mode in pg_upgrade - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Yet another failure mode in pg_upgrade |
Date | |
Msg-id | 20120901154558.GA2969@momjian.us |
In response to | Re: Yet another failure mode in pg_upgrade (Magnus Hagander <magnus@hagander.net>) |
Responses |
Re: Yet another failure mode in pg_upgrade
Re: Yet another failure mode in pg_upgrade Re: Yet another failure mode in pg_upgrade |
List | pgsql-hackers |
On Mon, Aug 13, 2012 at 12:46:43PM +0200, Magnus Hagander wrote:
> On Mon, Aug 13, 2012 at 4:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I've been experimenting with moving the Unix socket directory to
> > /var/run/postgresql for the Fedora distribution (don't ask :-().
> > It's mostly working, but I found out yet another way that pg_upgrade
> > can crash and burn: it doesn't consider the possibility that the
> > old or new postmaster is compiled with a different default
> > unix_socket_directory than what is compiled into the libpq it's using
> > or that pg_dump is using.
> >
> > This is another hazard that we could forget about if we had some way for
> > pg_upgrade to run standalone backends instead of starting a postmaster.
>
> Yeah, that would be nice.
>
> > But in the meantime, I suggest it'd be a good idea for pg_upgrade to
> > explicitly set unix_socket_directory (or unix_socket_directories in
> > HEAD) when starting the postmasters, and also explicitly set PGHOST
> > to ensure that the client-side code plays along.
>
> That sounds like a good idea for other reasons as well - manual
> connections attempting to get in during an upgrade will just fail with
> a "no connection" error, which makes sense...
>
> So, +1.

OK, I looked this over, and I have a patch, attached.

Because we are already playing with socket directories, this patch
creates the socket files in the current directory for upgrades and
non-live checks, but not live checks. This eliminates the "someone
accidentally connects" problem, at least on Unix, plus we are using port
50432 already. This also turns off TCP connections for unix domain
socket systems.

For "live check" operation, you are checking a running server, so
assuming the socket is in the current directory is not going to work.
What the code does is to read the 5th line from the running server's
postmaster.pid file, which has the socket directory in PG >= 9.1.
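As a rough illustration of the live-check lookup described above, here is a minimal Python sketch (not the patch itself, which is C code inside pg_upgrade). The function names and the sample file contents are hypothetical; the only detail taken from this message is that the socket directory is on the 5th line of postmaster.pid for PG >= 9.1.

```python
from pathlib import Path
from typing import Optional

# 1-based line number holding the socket directory (PG >= 9.1, per the message).
SOCKET_DIR_LINE = 5


def read_socket_dir(pid_file_text: str) -> Optional[str]:
    """Return the socket directory from postmaster.pid contents, or None.

    None is returned when the file is too short (e.g. a pre-9.1 server)
    or when the 5th line is empty.
    """
    lines = pid_file_text.splitlines()
    if len(lines) < SOCKET_DIR_LINE:
        return None
    socket_dir = lines[SOCKET_DIR_LINE - 1].strip()
    return socket_dir or None


def socket_dir_for_live_check(data_dir: str) -> Optional[str]:
    """Hypothetical helper: look up the socket directory of a running server."""
    pid_path = Path(data_dir) / "postmaster.pid"
    try:
        return read_socket_dir(pid_path.read_text())
    except OSError:
        # No readable postmaster.pid: nothing we can learn from it.
        return None


if __name__ == "__main__":
    # Illustrative contents only: pid, data directory, start time, port,
    # socket directory, listen address.
    sample = (
        "12345\n/var/lib/pgsql/data\n1346514358\n5432\n"
        "/var/run/postgresql\nlocalhost\n"
    )
    print(read_socket_dir(sample))
```

A caller that gets None back would have to fall back to the compiled-in default, which is the pre-9.1 situation.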
For pre-9.1, pg_upgrade uses the compiled-in defaults for socket
directory. If the defaults are different between the two servers, the
new binaries, e.g. pg_dump, will not work. The fix is for the user to
set pg_upgrade -O to match the old socket directory, and set PGHOST
before running pg_upgrade. I could not find a good way to generate a
proper error message because we are blind to the socket directory in
pre-9.1. Frankly, this is a problem if the old pre-9.1 server is
running in a user-configured socket directory too, so a documentation
addition seems right here.

So, in summary, this patch moves the socket directory to the current
directory for all but live check operation, and handles different
socket directories for old cluster >= 9.1. I have added a documentation
mention of how to make this work for pre-9.1 old servers.

Thus completes another "surgery on a moving train" that is pg_upgrade
development.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
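For readers following the pre-9.1 workaround mentioned in the message, a minimal Python sketch of what the invocation might look like. The helper name is hypothetical, the choice of `-c unix_socket_directory=...` as the value passed via -O is an assumption about how one would "match the old socket directory" (the setting is named unix_socket_directories in HEAD), and the other arguments pg_upgrade requires are deliberately left out.

```python
from typing import Dict, List, Tuple


def pre91_upgrade_invocation(
    old_socket_dir: str, base_env: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Build an environment and partial argument list for pg_upgrade.

    Assumption: the new server is pointed at the old socket directory
    through -O, and PGHOST makes the client side (libpq, pg_dump) look
    in the same place.
    """
    env = dict(base_env)            # copy, so the caller's mapping is untouched
    env["PGHOST"] = old_socket_dir  # client-side socket location
    new_server_opts = f"-c unix_socket_directory={old_socket_dir}"
    # Partial command only: the remaining required arguments are omitted.
    args = ["pg_upgrade", "-O", new_server_opts]
    return env, args


if __name__ == "__main__":
    env, args = pre91_upgrade_invocation("/var/run/postgresql", {"PATH": "/usr/bin"})
    print(env["PGHOST"], args)
```

The resulting pair could be handed to something like `subprocess.run(args + other_args, env=env)` once the remaining arguments are filled in.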