On 08.08.2013 20:15, Josh Berkus wrote:
> Bruce, all:
>
>> We seem to be all over the map with the fast promotion code --- some
>> people don't trust it, some people want an option to enable the old
>> method, and some people want the old method removed.
>
> Having read over this thread, the only reason given for retaining any
> ability to use "old" promotion code is because people are worried about
> "fast" promotion being buggy. This seems wrong.
>
> Either we have confidence in fast promotion, or we don't. If we don't
> have confidence, then either (a) more testing is needed, or (b) it
> shouldn't be the default. Again, here, we are coming up against our
> lack of any kind of broad replication failure testing.

Well, I don't see much harm in keeping the old behavior as an
undocumented escape hatch, as it is now. The way I'd phrase the current
situation is this: 9.3 now always does "fast promotion". However, for
debugging and testing purposes, you can still trigger the old behavior
by manually creating a file in $PGDATA. That should never be necessary
in the field, though.

There's one thing that irks me with the current situation, however: if
you use the 9.2 version of pg_ctl against a 9.3 server, it will
inadvertently trigger slow promotion, because it creates the "promote"
file. Since fast mode is the default, and not only the default but the
only documented mode, it's confusing if you can accidentally trigger the
old behavior like that.

And it's even worse if you use 9.3 pg_ctl against a 9.2 server: it will
create a file called "fast_promote" and return success, but it won't
actually do anything.

I think the "promote" file should trigger fast promotion, and the
file that triggers the slow mode should be called "fallback_promote" or
"safe_promote" or something like that. There wasn't any good reason to
change the filename primarily used. It might even break people's scripts
for no good reason, if people are creating the $PGDATA/promote file
themselves without using pg_ctl.
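
To make the version-mismatch cases concrete, here is a toy model in
Python. It is not PostgreSQL code: the tables and the outcome() helper
are made up for illustration, "fallback_promote" is just one of the
candidate names above, and it assumes that under the proposal 9.3
pg_ctl would go back to creating "promote".

```python
# Toy model of the trigger-file behavior described in this mail.
# Illustrative only; none of these names exist in PostgreSQL.

# File that each pg_ctl version creates in $PGDATA on "pg_ctl promote".
PG_CTL_CURRENT = {"9.2": "promote", "9.3": "fast_promote"}
# Assumption: under the proposal, 9.3 pg_ctl creates "promote" again.
PG_CTL_PROPOSED = {"9.2": "promote", "9.3": "promote"}

# Which trigger files each server version looks for, and what they do.
# 9.2 only has the old (slow) promotion.
SERVER_CURRENT = {
    "9.2": {"promote": "slow"},
    "9.3": {"fast_promote": "fast", "promote": "slow"},
}
# Proposed naming: "promote" means fast, a new name requests slow mode.
SERVER_PROPOSED = {
    "9.2": {"promote": "slow"},
    "9.3": {"promote": "fast", "fallback_promote": "slow"},
}


def outcome(pg_ctl, server, creates, understands):
    """Return the promotion mode, or "ignored" if the server
    never looks for the file that this pg_ctl creates."""
    trigger_file = creates[pg_ctl]
    return understands[server].get(trigger_file, "ignored")


if __name__ == "__main__":
    schemes = (
        ("current", PG_CTL_CURRENT, SERVER_CURRENT),
        ("proposed", PG_CTL_PROPOSED, SERVER_PROPOSED),
    )
    for label, creates, understands in schemes:
        for pg_ctl in ("9.2", "9.3"):
            for server in ("9.2", "9.3"):
                mode = outcome(pg_ctl, server, creates, understands)
                print(f"{label}: pg_ctl {pg_ctl} -> server {server}: {mode}")
```

With the current names, the two mixed-version cases come out as "slow"
and "ignored". With the proposed names, every combination gets the
default promotion mode of the server it is talking to.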

(I raised this back in April, but Simon argued strongly for the current
situation. I never understood why.
http://www.postgresql.org/message-id/517798AE.30203@vmware.com)

- Heikki