Re: [CORE] Restore-reliability mode - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: [CORE] Restore-reliability mode
Msg-id 556F3798.6010707@agliodbs.com
In response to Re: [CORE] postpone next week's release  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [CORE] Restore-reliability mode  (Andres Freund <andres@anarazel.de>)
Re: [CORE] Restore-reliability mode  (Stephen Frost <sfrost@snowman.net>)
Re: [CORE] Restore-reliability mode  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 06/03/2015 06:50 AM, Noah Misch wrote:
> Subject changed from "Re: [CORE] postpone next week's release".
> 
> On Sat, May 30, 2015 at 10:48:45PM -0400, Bruce Momjian wrote:
>> If we have to totally stop feature development until we are all happy
>> with the code we have, so be it.  If people feel they have to get into
>> cleanup mode or they will never get to add a feature to Postgres again,
>> so be it.  If people say, heh, I am not going to do anything and just
>> come back when cleanup is done (by someone else), then we will end up
>> with a smaller but more dedicated development team, and I am fine with
>> that too.  I am suggesting that until everyone is happy with the code we
>> have, we should not move forward.
> 
> I like the essence of this proposal.  Two suggestions.  We can't achieve or
> even robustly measure "everyone is happy with the code," so let's pick
> concrete exit criteria.  Given criteria framed like "Files A,B,C and patches
> X,Y,Z have a sign-off from a committer other than their original committer."
> anyone can monitor progress and find specific ways to contribute.  Second, I
> would define the subject matter as "bug fixes, testing and review", not
> "restructuring, testing and review."  Different code structures are clearest
> to different hackers.  Restructuring, on average, adds bugs even more quickly
> than feature development adds them.

So, historically, this is what the period between feature freeze and
beta1 was for; the "consolidation" phase was supposed to deal with this.
The problem over the last few years, by my observation, has been that
consolidation has been left to just a few people (usually just Bruce &
Tom or Tom & Robert) and our code base is now much too large for that.

The way other projects deal with this is having continuous testing as
stuff comes in, and *more* testing than just our regression tests (e.g.
acceptance tests, integration tests, performance tests, etc.). So our
other issue has been that our code complexity has been growing faster
than our test suite.  Part of that is that this community has never
placed much value in automated testing or testers, so people who are
interested in it find other projects to contribute to.

I would argue that if we delay 9.5 in order to do a 100% manual review
of code, without adding any new automated tests or other non-manual
tools for improving stability, then it's a waste of time; we might as
well just release the beta, and our users will find more issues than we
will.  I am concerned that if we declare a cleanup period, especially in
the middle of the summer, all that will happen is that the project will
go to sleep for an extra three months.

I will also point out that there is a major adoption cost to delaying
9.5.   Right now users are excited about UPSERT, big data, and extra
JSON features. If they have to wait another 7 months, they'll be a lot
less excited, and we'll lose more potential users to the new databases
and the MySQL forks.  It could also delay the BDR project (Simon/Craig
can speak to this) which would suck.

Reliability of having a release every year is important as well as
database reliability ... and for a lot of the new webdev generation,
PostgreSQL is already the most reliable piece of software infrastructure
they use.  So if we're going to have a cleanup delay, then let's please
make it an *intensive* cleanup delay, with specific goals, milestones,
and a schedule.  Otherwise, don't bother.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


