Re: Should we remove "not fast" promotion at all? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Should we remove "not fast" promotion at all?
Msg-id 20130808173418.GA15275@alap2.anarazel.de
In response to Re: Should we remove "not fast" promotion at all?  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 2013-08-08 10:15:14 -0700, Josh Berkus wrote:
> Bruce, all:
> 
> > We seem to be all over the map with the fast promotion code --- some
> > people don't trust it, some people want an option to enable the old
> > method, and some people want the old method removed.
> 
> Having read over this thread, the only reason given for retaining any
> ability to use "old" promotion code is because people are worried about
> "fast" promotion being buggy.  This seems wrong.

Well, it's touching one of the more complex parts of pg.

> Either we have confidence in fast promotion, or we don't.  If we don't
> have confidence, then either (a) more testing is needed, or (b) it
> shouldn't be the default.  Again, here, we are coming up against our
> lack of any kind of broad replication failure testing.

While I think we are definitely missing out there, I don't think any
regression suite would help much here. I am wary of unknown problems, not
ones we already have tests for. The subtle ones aren't easy to test, even
with a regression suite.

> Of course, even if we have confidence, bugs are always possible, and
> leaving the old promotion code in there would make it somewhat easier to
> ship a 9.3.2 update which reverts the behavior.  But maybe we should
> focus on shipping a version which is relatively bug-free instead?

The problem is that, especially involving HS, there are lots of subtle
corner cases. And those are pretty hard to foresee and thus hard to
test. Being able to tell somebody to touch some file and kill a certain
process instead of triggering via pg_ctl is certainly better than having
them apply complex patches which then only exhibit the old behaviour.
It's not about letting people regularly use it or such. It's about being
able to verify problems.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services