Re: Two weeks to feature freeze - Mailing list pgsql-hackers

From: Dann Corbit
Subject: Re: Two weeks to feature freeze
Date:
Msg-id: D90A5A6C612A39408103E6ECDD77B829408B33@voyager.corporate.connx.com
In response to: Two weeks to feature freeze (Bruce Momjian <pgman@candle.pha.pa.us>)
List: pgsql-hackers

> -----Original Message-----
> From: Jason Earl [mailto:jason.earl@simplot.com]
> Sent: Friday, June 20, 2003 4:43 PM
> To: Dann Corbit
> Cc: Jason Earl; PostgreSQL-development
> Subject: Re: [HACKERS] Two weeks to feature freeze
>
>
> "Dann Corbit" <DCorbit@connx.com> writes:
>
> >> -----Original Message-----
> >> From: Jason Earl [mailto:jason.earl@simplot.com]
> >> Sent: Friday, June 20, 2003 3:32 PM
> >> To: Dann Corbit
> >> Cc: Jason Earl; The Hermit Hacker; PostgreSQL-development
> >> Subject: Re: [HACKERS] Two weeks to feature freeze
> >>
> >>
> >> "Dann Corbit" <DCorbit@connx.com> writes:
> >> >>
> >> >> Why couldn't you just release the win32 version of 7.4 when it
> >> >> was finished?  If it takes an extra month then that just gives you
> >> >> guys the chance to circulate *two* press releases.
> >> >> The Native Win32 port is likely to make a big enough splash
> >> >> all by itself.
> >> >
> >> > A formal release needs a big testing effort.  Two separate
> >> > releases will double the work of validation.
> >>
> >> There are several problems with that statement.  The first is
> >> that PostgreSQL's "testing effort" happens right here on this
> >> mailing list.
> >
> > That's not exactly reassuring.  There is no regression test that
> > gets formal acceptance?!
>
> Yes, there are regression tests, and new tests get invented
> all of the time whenever the real world finds new bugs.
> Regression tests are excellent for making sure that you don't
> make the same mistake twice, but they aren't a substitute for
> handing the code over to actual end users.

After testing & QA, there is a beta period.  You don't hand beta code
over to actual end users.  In the corporate world, that would be a clear
case of both negligence and incompetence.

> >> The various PostgreSQL hackers code stuff up,
> >> and we try and break it. There's very little /effort/
> >> involved.  People that want the new features go out on a limb
> >> and start using them.  If they don't work, then they bring it
> >> up on the list.  If they do work then very little gets said.
> >>
> >> As it now stands Tom Lane is on the record as stating that
> >> the new Win32 version isn't going to be ready for production
> >> anyhow.  If anything the Win32 version *should* get released
> >> separately simply because we don't want people mistaking the
> >> Win32 version as being up to the PostgreSQL team's high
> >> standards.  Those people that want the Win32 version to
> >> become production ready are going to have to risk their
> >> precious data.  Otherwise, the Win32 version will likely
> >> remain a second class citizen forever.
> >>
> >> The fact of the matter is that the Win32 specific bits are
> >> the parts that are likely to break in the new port.  If
> >> anything the Windows version will *benefit* from an earlier
> >> *nix release because the *nix users will chase down the bugs
> >> in the new PostgreSQL features.  Once the *nix version is up
> >> to 7.4.2 (or so) then a Windows release of 7.4.2 should allow
> >> the PostgreSQL hackers to simply chase down Windows-specific
> >> problems.
> >
> > Then using the same logic, the new Windows version should wait
> > indefinitely, since the *nix version will always be shaking out
> > bugs.
>
> That's not true at all.  Despite the excellent work by the
> PostgreSQL team, and despite the beta testing that will be
> done by volunteers, if history repeats itself, there will be
> problems with version 7.4.0, even on platforms that have been
> well supported by PostgreSQL forever. I am not saying that we
> should hold off indefinitely on the Win32 port, I am simply
> saying that it probably wouldn't hurt to shake out the normal
> .0 release bugs before throwing the unique Win32 bugs into the mix.
>
> My guess is that reported Win32 bugs are going to get blamed on the
> Win32 specific bits at first no matter what happens.  Unless
> the bug can be demonstrated on a *nix version it will be
> assumed that the problem is a shortcoming of the Win32
> specific code.  That's just common sense.

You are right that a new feature will add new bugs.  I am saying that
the Win32 port is a new feature that will need a shakedown, but that
shakedown should occur in the testing and beta phases, like any other
feature.

> >> Adding a new platform--especially a platform as diverse from
> >> the rest of PostgreSQL's supported platforms as Windows--is
> >> what adds the work. Testing the new platform is relatively
> >> easy.  All you need to do is to start using the Win32 version
> >> with real live data.
> >
> > That is not testing.  Using the world as your beta team seems to be
> > a business model used by a few software giants that is largely
> > frowned upon.  I would think that there is an opportunity to do
> > things differently.  [Read 'properly'].
>
> Hmm... I must have missed the huge corporation paying for in
> house testing of PostgreSQL.  In the Free Software world the
> "beta team" is all of those people that need the new features
> so badly that they are willing to risk their own data and
> hardware testing it.

I don't see how this model can possibly succeed, then.  You can't just
hope that your end users will:
1.  Exhaustively test
2.  Accurately report the findings

> You might not like the way that this
> sounds, but in practice it works astoundingly well.  Chances
> are you can't name 25 pieces of commercial software that run
> on as wide an array of hardware platforms and OSes as
> PostgreSQL, and PostgreSQL has earned a well-deserved
> reputation for being a solid piece of software.  Clearly the
> PostgreSQL team is doing
> *something* right.

I don't argue that PostgreSQL is a good piece of software.  I happen to
like it very much and have been a staunch advocate for its use with our
commercial products as well as in house.  What I am saying is that it
may be possible to improve the process.

If the corporate world knew that the only testing applied to PostgreSQL
was ad hoc, I doubt that it would be accepted anywhere at all.  The fact
that PostgreSQL does succeed shows that the installed base of users must
be highly intelligent and highly motivated.

> > We (at CONNX Solutions Inc.) have a formal release procedure that
> > includes many tens of thousands of automated tests using dozens of
> > different platforms.  There are literally dozens of machines (I
> > would guess 70 or so total) running around the clock for 7 days
> > before we
> > even know if we have a release candidate.  The QA team is distinct
> > from the development team, and if they say "FLOP!" the release goes
> > nowhere.  No formal release until QA passes it.
>
> And yet when you release the software your customers
> invariably find bugs, don't they?

Our beta customers do help us to find bugs.  Bugs reported by customers
for released products are extremely rare.  The total issue count is 2495
in our bug tracking database (active since the late 1980s).  There are
82 issues found by customers in that list (about 3% of the total), and 7
with an issue ID over 2000 (recent issues).  Our code base is several
hundred thousand lines of code, and we have many thousands of customers
world-wide.  When I first started here, testing was less rigorous, and
largely done by the programmers instead of separate testing teams.
Since formal testing procedures were established, technical support
incidents have gone way down and quality has gone way up.

> Don't get me wrong.  I am all for testing, regression tests,
> and such, but the fact of the matter is that there simply is
> no way that a centralized authority could afford to really
> test PostgreSQL on even a fraction of the supported platforms
> and configurations.  The way it stands now the PostgreSQL
> teams gets the best testbed you could hope for (the world)
> for the price of hosting a few web and FTP servers (thanks Marc).
>
> PostgreSQL betas almost certainly get tested on an order of
> magnitude more systems than the 70 that you boast about.

Maybe it does.  Maybe it doesn't.  You have no way of knowing, since no
formal reporting procedure exists.

> PostgreSQL gets tested on everything from Sony Playstations
> to AMD Opterons to IBM mainframes.  Heck, there are probably
> more than 70 machines running CVS versions of PostgreSQL
> right this minute (Marc, any download numbers to back this
> up?).

If you count all the end users' workstations, our products have
millions of seats.  We run on UNIX (Solaris, Linux, AIX, Tru64, HP-UX,
etc.) and OpenVMS and MVS and Win32 and OS/400 and others.  As you can
well imagine, we *must* have an enormous testing effort.

> More importantly, PostgreSQL gets tested on a wide
> variety of real world tasks, and not some lab application or
> some test scripts.

Spoken like a programmer.  Yes, real world data *always* turns up things
that neither the testers nor the programmers imagined.  But a huge and
comprehensive conformance testing effort will turn up 99% of the
problems.

> Like I have mentioned several times
> before. PostgreSQL gets tested by folks that put their actual
> data into the beta versions and try it out.  Even with this
> sort of testing, however, bugs still make it into the release
> version.

Bug cost as a function of discovery stage is exponential.
1.  Discovered in design phase: nearly free to fix (designer sees bug,
designer fixes bug)
2.  Discovered in unit test phase: very cheap to fix (programmer sees
bug, programmer fixes bug)
3.  Discovered in integration test phase: inexpensive to fix (other
engineers become involved)
4.  Discovered in beta test phase: expensive to fix (customers,
tech-support, sales, programmers, engineers involved)
5.  Discovered in release: catastrophic cost to fix (as above, but now
every single customer must be upgraded, tens of thousands of dollars
lost, minimum -- possibly millions)
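
To put rough numbers on that curve, here is a minimal sketch in Python,
assuming a purely illustrative tenfold cost increase per stage and a
nominal $10 design-stage fix (both figures are assumptions, not
measurements):

    # Illustration of exponential bug-fix cost growth.  The $10 base cost
    # and the 10x per-stage multiplier are assumptions, not measured data.
    STAGES = ["design", "unit test", "integration test", "beta test", "release"]
    BASE_COST = 10.0    # assumed cost to fix a bug caught in the design phase
    MULTIPLIER = 10.0   # assumed cost growth per later discovery stage

    for i, stage in enumerate(STAGES):
        print(f"{stage:<18s} ${BASE_COST * MULTIPLIER ** i:>12,.2f}")

With those assumptions, the release-stage figure lands at $100,000 per
bug, which is the right order of magnitude for the "tens of thousands of
dollars, minimum" claim above.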

> Even with a large group of beta testers we simply
> can't test all of the possible ways that the software might
> get used on every available platform.

100% code coverage is impossible.
Program proving is impossible.
0% defect code delivery is impossible.

But you should try to approach the ideal as closely as can be attained.

> > If there is no procedure for PostgreSQL of this nature, then there
> > really needs to be.  I am sure that MySQL must have something in
> > place like that.  Their "Crash-Me" test suite shows (at least) that they
> > have put a large effort into testing.
>
> Yow!  Have you read the crash-me script?  It's possible that
> they have improved dramatically in the year or so since I
> last took a look at them, but it used to be that MySQL's
> crash-me scripts were the worst amalgamation of marketeering
> and poor relational theory ever conceived by the human mind.

The tests are good tests.  They cover a wide range of features and
functions and discover if you can cause permanent damage to a database
by simply performing end-user queries.  The scripts are a bit hokey, but
it isn't all that difficult to get them to run.
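
For concreteness, a crash-me-style conformance probe might look
something like the following minimal Python sketch.  The connection
object stands in for any DB-API 2.0 driver, and the probe names and
queries are illustrative, not taken from the actual crash-me suite:

    # Try a feature, record pass/fail, and verify the server survived the
    # attempt.  "conn" is any Python DB-API 2.0 connection (hypothetical).
    def probe(conn, name, sql):
        cur = conn.cursor()
        try:
            cur.execute(sql)
            supported = True
        except Exception:
            conn.rollback()  # recover the session so later probes still run
            supported = False
        # A probe must never leave the server damaged: check liveness.
        cur.execute("SELECT 1")
        assert cur.fetchone()[0] == 1, "server state corrupted by probe"
        print(f"{name}: {'yes' if supported else 'no'}")
        return supported

    # Example probes in the spirit of the crash-me feature matrix:
    #   probe(conn, "subselect", "SELECT 1 WHERE 1 IN (SELECT 1)")
    #   probe(conn, "union",     "SELECT 1 UNION SELECT 2")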

> Basically the crash-me scripts were nothing more than an
> attempt to put MySQL's competition in the worst light
> possible.

I disagree.  In fact, in their matrix, PostgreSQL looks remarkably good.
Indeed, I would choose it over MySQL if the only examination made were
of the information contained in the matrix (but nobody would really
drive a decision based on that data alone).

> Basically any time a competitor differed from
> MySQL an error would be generated (despite the fact that it
> was very likely that it was MySQL that was wrong).

This is unfair and untrue.  (I have no connection whatsoever with the
MySQL group, BTW.)

> MySQL even tried to pawn this single-process monstrosity off
> as a "benchmark."  What a laugh.  It was a perfectly valid
> benchmark if your database was designed to be used by one
> user at a time and one of your biggest criteria was the time
> it took to create a valid connection from a perl script.

You can call it a conformance benchmark.  It is not touted as a
performance benchmark, and nobody would fall for it if it were, since it
contains no timing information.
> PostgreSQL's regression tests (IMHO) are much better than
> MySQL's crash-me scripts.

They are less thorough in coverage, but not too bad.

Here is what I suggest:

PostgreSQL has an excellent programming team.  Why not try to recruit a
similar testing team?  I think it would strongly differentiate the tool
set from similar free stuff.

Perhaps all that is needed is some sort of automated, formal reporting
procedure.  For example, a large test suite might be created that
exercises a thorough feature list.  When the tests complete, a data file
is emailed to some central repository, parsed, and stored in a database.

If end users can simply start some shell script and take off for the
weekend, then it would be possible to collect a large volume of data.
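
As a minimal sketch of what that harness might look like (Python here
rather than shell; the source-tree path and both email addresses are
assumptions for illustration, and no such central repository exists
today):

    # Run the standard regression suite and mail the raw results to a
    # central collection point.  Everything site-specific here is assumed.
    import platform
    import smtplib
    import subprocess
    from email.message import EmailMessage

    result = subprocess.run(
        ["make", "check"],                  # PostgreSQL's regression target
        cwd="/usr/src/postgresql",          # assumed source-tree location
        capture_output=True,
        text=True,
    )

    msg = EmailMessage()
    msg["Subject"] = f"regression report: {platform.platform()}"
    msg["From"] = "tester@example.com"      # hypothetical sender
    msg["To"] = "reports@example.org"       # hypothetical central repository
    msg.set_content(result.stdout + result.stderr)

    with smtplib.SMTP("localhost") as smtp:  # assumes a local MTA is running
        smtp.send_message(msg)

An end user could schedule that script before leaving for the weekend,
and the central database would accumulate platform, version, and
pass/fail data with no further effort on their part.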

