Re: Request for vote to move forward with recovery.conf overhaul - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Request for vote to move forward with recovery.conf overhaul
Date
Msg-id CAB7nPqQVTh+O=jvJzSu95hxngi1iQ1QK6zLoKjdm+MaqGfsDcw@mail.gmail.com
Whole thread Raw
In response to Re: Request for vote to move forward with recovery.conf overhaul  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Request for vote to move forward with recovery.conf overhaul  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Request for vote to move forward with recovery.conf overhaul  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers


On Wed, Mar 6, 2013 at 2:08 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
Thanks for taking time in typing a complete summary of the situation. That really helps.

On Mon, Mar 4, 2013 at 11:25 AM, Greg Smith <greg@2ndquadrant.com> wrote:
On 1/23/13 6:36 AM, Michael Paquier wrote:
The only problem I see is if the same parameter is defined in recovery.conf and postgresql.conf, is the priority given to recovery.conf?

This sort of thing is what dragged down the past work on this.  I don't really agree with the idea this thread is based on, that it was backward compatibility that kept this from being finished last year.  I put a good bit of time into a proposal that a few people seemed happy with; all its ideas seem to have washed away.  That balanced a couple of goals:

-Allow a "pg_ctl standby" and "pg_ctl recover" command that work similarly to "promote".  This should slim down the work needed for the first replication setup people do.
-Make it obvious when people try to use recovery.conf that it's not supported anymore.
-Provide a migration path for tool authors strictly in the form of some documentation and error message hints.  That was it as far as concessions to backward compatibility.

The wrap-up I did started at http://www.postgresql.org/message-id/4EE91248.8010505@2ndQuadrant.com and only had a few responses/controversy from there.  Robert wrote a good summary:

1. Get rid of recovery.conf - error out if it is seen
2. For each parameter that was previously a recovery.conf parameter, make it a GUC
3. For the parameter that was "does recovery.conf exist?", replace it with "does standby.enabled exist?".
Agreed on that.
 
I thought this stopped from there because no one went back to clean up Fujii's submission, which Simon and Michael have now put more time into.  There is not much distance between it and the last update Michael sent.
Just to be picky. I recommend using the version dated of 23rd of January as a work base if we definitely get rid of recovery.conf, the patch file is called 20130123_recovery_unite.patch.
The second patch I sent on the 27th tried to support both recovery.conf and postgresql.conf, but this finishes with a lot of duplicated code as two sets of functions are necessary to deparse options for both postgresql.conf and recovery.conf...
 
Here's the detailed notes from my original proposal, with updates to incorporate the main feedback I got then; note that much of this is documentation rather than code:

-Creating a standby.enabled file puts the system into recovery mode. That feature needs to save some state, and making those decisions based on existence of a file is already a thing we do.  Rather than emulating the rename to recovery.done that happens now, the server can just delete it, to keep from incorrectly returning to a state it's exited.  A UI along the lines of the promote one, allowing "pg_ctl standby", should fall out of here.

This file can be relocated to the config directory, similarly to how the include directory looks for things.  There was a concern that this would require write permissions that don't exist on systems that relocate configs, like Debian/Ubuntu.  That doesn't look to be a real issue though.  Here's a random Debian server showing the postgres user can write to all of those:

$ ls -ld /etc/postgresql
drwxr-xr-x 4 root root 4096 Jun 29  2012 /etc/postgresql
$ ls -ld /etc/postgresql/9.1
drwxr-xr-x 3 postgres postgres 4096 Jul  1  2012 /etc/postgresql/9.1
$ ls -ld /etc/postgresql/9.1/main
drwxr-xr-x 2 postgres postgres 4096 Mar  3 11:00 /etc/postgresql/9.1/main

-A similar recovery.enabled file turns on PITR recovery.

-It should be possible to copy a postgresql.conf file from master to standby and just use it.  For example:
--"standby_mode = on":  Ignored unless you've started the server with standby.enabled, won't bother the master if you include it.
--"primary_conninfo":  This will look funny on the master showing it connecting to itself, but it will get ignored there too.

-If startup finds a recovery.conf file where it used to live at,
abort--someone is expecting the old behavior.  Hint to RTFM or include a short migration guide right on the spot.  That can have a nice section about how you might use the various postgresql.conf include* features if they want to continue managing those files separately.  Example: rename what you used to make recovery.conf as replication.conf and use include_if_exists if you want to be able to rename it to recovery.done like before.  Or drop it into a config/ directory (similarly to the proposal for SET PERSISTENT) where the rename to recovery.done will make it then skipped.  (Only files ending with .conf are processed by include_dir)

-Tools such as pgpool that want to write a simple configuration file,
only touching the things that used to go into recovery.conf, can tell
people to do the same trick.  End their postgresql.conf with a call to
\include_if_exists replication.conf as part of setup.  While I don't
like pushing problems toward tool vendors, as one I think validating if
this has been done doesn't require the sort of fully GUC compatible
parser people (rightly) want to avoid.  A simple scan of the
postgresql.conf looking for the recommended text at the front of a line
could confirm whether that bit is there.  And adding a single
"include_if_exists" line to the end of the postgresql.conf is not a
terrible edit job to consider pushing toward tools.  None of this is any more complicated than the little search and replace job that initdb does right now.

-If you want to do something special yourself to clean up when recovery finishes, perhaps to better emulate the old "those settings go away" implementation, there's already recovery_end_command available for that.  Let's say you wanted to force the old name and did "include_if_exists config/recovery.conf".  Now you could do:

recovery_end_command = 'rm -f /tmp/pgsql.trigger.5432 && mv conf.d/recovery.conf conf.d/recovery.done'
Looks good to me too.
Based on the patch I already sent before, there are a couple of things missing:
- There are no pg_ctl standby/recover commands implemented yet (no that difficult to add)
- a certain amount of work is needed for documentation

Btw, I have a concern about that: is it really the time to finish that for this release knowing that the 9.3 feature freeze is getting close? I still don't know when it will be but it is... close.
This patch is still in the current commit fest. Any objections in marking it as returned with feedback and put it in the next commit fest? We could work on that again at the next release as it looks that there is not much time to work on it for 9.3.
--
Michael

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Duplicate JSON Object Keys
Next
From: Alvaro Herrera
Date:
Subject: Re: Request for vote to move forward with recovery.conf overhaul