Re: New regression test time - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: New regression test time
Date
Msg-id CAMkU=1zeRxAUNa8G5ayP5w1eG=8tqto6KpuVJomUBbcrYjdgMA@mail.gmail.com
In response to Re: New regression test time  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Sat, Jun 29, 2013 at 3:43 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 06/29/2013 05:59 PM, Josh Berkus wrote:

Maybe there is a good case for these last two in a different set of tests.
If we had a different set of tests, that would be a valid argument.  But
we don't, so it's not.  And nobody has offered to write a feature to
split our tests either.

I have to say, I'm really surprised at the level of resistance people on
this list are showing to the idea of increasing test coverage. I thought
that Postgres was all about reliability?   For a project as mature as we
are, our test coverage is abysmal, and I think I'm starting to see why.



Dividing the tests into different sections is as simple as creating one schedule file per section.

I'm not at all resistant to it. In fact, if someone wants to set up separate sections and add new tests to the different sections, I'll be more than happy to provide buildfarm support for it. Obvious candidates could include:

 * code coverage
 * bugs
 * tests too big to run in everyday developer use


I don't really see a difference in the first two.  If we were sure the uncovered code had no bugs, we wouldn't need to cover it.  At least if you consider unintended behavior changes to be bugs.  I think it would make more sense to split them up by what computers it makes sense to run them on.

Tests that take too much RAM to be run by everyone.
Tests that take too many CPUs (in order to be meaningful) to be run by everyone most of the time.
Tests that take too much disk space...
Tests that take too much wall-clock time....
And maybe tests that take too much wall-clock time specifically under CLOBBER_CACHE_ALWAYS...

Some of these sets would probably be empty currently, because candidates that belong in them were never committed at all: they were not wanted in the default set, and there was no other place to add them.
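As a rough sketch of what "one schedule file per section" could look like for resource-based sections, the Python below builds the text of one schedule per resource class. The test names, the class tags, and the `*_schedule` file names are all made up for illustration; the only thing taken from the existing schedules is the `test: name ...` line format. It puts one test per `test:` line, on the assumption that resource-heavy tests should not run concurrently with each other.

```python
from collections import defaultdict

# Hypothetical tests, each tagged with the resource that keeps it out of
# the default run. None of these names exist in the real test suite.
TESTS = {
    "huge_sort": "ram",
    "many_backends": "cpu",
    "big_table_load": "disk",
    "long_vacuum": "time",
    "long_hash_join": "ram",
}


def build_schedules(tests, width=1):
    """Return {file name: file text}, one schedule per resource class.

    Tests listed on the same "test:" line run concurrently, so width=1
    keeps the heavy tests strictly serial.
    """
    by_class = defaultdict(list)
    for name, cls in sorted(tests.items()):
        by_class[cls].append(name)

    schedules = {}
    for cls, names in by_class.items():
        lines = [f"# tests that need too much {cls} for the default run"]
        for i in range(0, len(names), width):
            lines.append("test: " + " ".join(names[i:i + width]))
        # Hypothetical naming scheme: ram_schedule, cpu_schedule, ...
        schedules[f"{cls}_schedule"] = "\n".join(lines) + "\n"
    return schedules


schedules = build_schedules(TESTS)
for filename, text in sorted(schedules.items()):
    print(f"--- {filename}")
    print(text, end="")
```

Each machine would then run only the schedules it can afford, in addition to the default one.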

If we are very worried about how long the tests take, we should probably also spend some time trying to make the existing ones faster.  Parallelization does not cut the test time very much (~20% with 8 CPUs), because the tests are poorly packed.  In a parallel group all the tests finish fast except one, and the whole group is then dominated by that one test.  (The main goal of parallelization is probably not to make the tests faster, but to make them more realistic from a concurrency perspective; but if there is little actual parallelism, it doesn't achieve that very well, either.)  I don't know how much freedom there is to re-order the tests without breaking dependencies, though.  I think prepared_xacts and stats could be usefully run together, as both take a long time sleeping but impose little real load that would interfere with each other.  Perhaps prepared_xacts could be re-written to get what it needs without the long statement_timeouts.  Testing the timeout itself doesn't seem to be the goal.
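To illustrate the packing point with made-up numbers: if groups run one after another and the tests inside a group run concurrently, each group costs as much as its slowest test. The Python sketch below compares a schedule with one slow test per group against one where the slow tests share a group. The durations and the short test names are invented, and the repacking ignores dependencies between tests entirely, which is exactly the part I am unsure about.

```python
def schedule_time(groups):
    """Wall-clock time of a schedule: groups run sequentially, tests in a
    group run concurrently, so a group costs as much as its slowest test."""
    return sum(max(group.values()) for group in groups)


def repack(groups, width):
    """Sort all tests slowest-first and cut them into groups of `width`.

    This ignores test dependencies and assumes the slow tests do not
    interfere with each other (e.g. they mostly sleep).
    """
    tests = sorted(
        ((name, secs) for group in groups for name, secs in group.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return [dict(tests[i:i + width]) for i in range(0, len(tests), width)]


# Invented durations in seconds; only the two long test names are real.
current = [
    {"prepared_xacts": 10.0, "a": 1.0, "b": 1.0, "c": 1.0},
    {"stats": 9.0, "d": 1.0, "e": 1.0, "f": 1.0},
]

serial = sum(secs for group in current for secs in group.values())
before = schedule_time(current)
packed = repack(current, width=4)
after = schedule_time(packed)

print(f"serial: {serial:.0f}s, current groups: {before:.0f}s, repacked: {after:.0f}s")
```

With these numbers the current grouping saves only 6 of 25 seconds over running serially, while putting the two sleepers in the same group brings the total down to 11 seconds.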

Cheers,

Jeff
