Re: Yet another failure mode in pg_upgrade - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Yet another failure mode in pg_upgrade |
Date | |
Msg-id | CA+TgmoY1XFhJ9WoENiY=-NNvSy4PjDPYmqt4-2NCy8gWUcmyZA@mail.gmail.com |
In response to | Re: Yet another failure mode in pg_upgrade (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: Yet another failure mode in pg_upgrade |
List | pgsql-hackers |
On Sat, Sep 1, 2012 at 11:45 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Aug 13, 2012 at 12:46:43PM +0200, Magnus Hagander wrote:
>> On Mon, Aug 13, 2012 at 4:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > I've been experimenting with moving the Unix socket directory to
>> > /var/run/postgresql for the Fedora distribution (don't ask :-().
>> > It's mostly working, but I found out yet another way that pg_upgrade
>> > can crash and burn: it doesn't consider the possibility that the
>> > old or new postmaster is compiled with a different default
>> > unix_socket_directory than what is compiled into the libpq it's using
>> > or that pg_dump is using.
>> >
>> > This is another hazard that we could forget about if we had some way for
>> > pg_upgrade to run standalone backends instead of starting a postmaster.
>>
>> Yeah, that would be nice.
>>
>>
>> > But in the meantime, I suggest it'd be a good idea for pg_upgrade to
>> > explicitly set unix_socket_directory (or unix_socket_directories in
>> > HEAD) when starting the postmasters, and also explicitly set PGHOST
>> > to ensure that the client-side code plays along.
>>
>> That sounds like a good idea for other reasons as well - manual
>> connections attempting to get in during an upgrade will just fail with
>> a "no connection" error, which makes sense...
>>
>> So, +1.
>
> OK, I looked this over, and I have a patch, attached.
>
> Because we are already playing with socket directories, this patch creates
> the socket files in the current directory for upgrades and non-live
> checks, but not live checks. This eliminates the "someone accidentally
> connects" problem, at least on Unix, plus we are using port 50432
> already. This also turns off TCP connections for unix domain socket
> systems.
>
> For "live check" operation, you are checking a running server, so
> assuming the socket is in the current directory is not going to work.
> What the code does is to read the 5th line from the running server's
> postmaster.pid file, which has the socket directory in PG >= 9.1. For
> pre-9.1, pg_upgrade uses the compiled-in defaults for socket directory.
> If the defaults are different between the two servers, the new binaries,
> e.g. pg_dump, will not work. The fix is for the user to set pg_upgrade
> -O to match the old socket directory, and set PGHOST before running
> pg_upgrade. I could not find a good way to generate a proper error
> message because we are blind to the socket directory in pre-9.1.
> Frankly, this is a problem if the old pre-9.1 server is running in a
> user-configured socket directory too, so a documentation addition seems
> right here.
>
> So, in summary, this patch moves the socket directory to the current
> directory all but live check operation, and handles different socket
> directories for old cluster >= 9.1. I have added a documentation
> mention of how to make this work for pre-9.1 old servers.

I don't think this is reducing the number of failure modes; it's just
changing it from one set of obscure cases to a slightly different set
of obscure cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
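
For readers following the thread, the "live check" lookup described in the quoted message can be illustrated roughly as below. This is a minimal Python sketch, not pg_upgrade's actual C code: the function names `read_socket_dir` and `client_env` are hypothetical, and the only facts taken from the message are that line 5 of postmaster.pid holds the socket directory in PG >= 9.1 and that PGHOST should point client code at that directory.

```python
import os
from pathlib import Path
from typing import Optional

# 1-based line number of the socket directory in postmaster.pid,
# as stated in the quoted message for PG >= 9.1.
SOCKET_DIR_LINE = 5


def read_socket_dir(data_dir: str) -> Optional[str]:
    """Return the socket directory recorded in postmaster.pid, or None.

    None means the file is missing (no running server) or too short
    to contain a fifth line (assumed here to indicate a pre-9.1 server).
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        lines = pid_file.read_text().splitlines()
    except FileNotFoundError:
        return None
    if len(lines) < SOCKET_DIR_LINE:
        return None
    value = lines[SOCKET_DIR_LINE - 1].strip()
    return value or None


def client_env(socket_dir: str) -> dict:
    """Copy the current environment with PGHOST set to the socket directory.

    libpq treats a PGHOST value beginning with a slash as a
    Unix-socket directory rather than a host name.
    """
    env = dict(os.environ)
    env["PGHOST"] = socket_dir
    return env
```

When `read_socket_dir` returns None, the caller is in the pre-9.1 situation the message describes: the socket directory cannot be discovered, so the user has to supply it by hand.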