Re: [COMMITTERS] pgsql: Bloom index contrib module - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [COMMITTERS] pgsql: Bloom index contrib module
Date
Msg-id 28003.1460252943@sss.pgh.pa.us
In response to Re: [COMMITTERS] pgsql: Bloom index contrib module  (Noah Misch <noah@leadboat.com>)
Responses Re: [COMMITTERS] pgsql: Bloom index contrib module  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
> On Sat, Apr 09, 2016 at 11:50:08AM -0400, Tom Lane wrote:
>> Would it be possible to dial down the amount of runtime consumed by
>> the regression tests for this module?

> I find this added test duration reasonable.  If someone identifies a way to
> realize similar coverage with lower duration, I'd value that contribution.  -1
> for meeting some runtime target at the expense of coverage.  Older modules
> have rather little test coverage, so they're poor as benchmarks.

That argument is fine in principle, but the thing about applications such
as SQL databases is that shoving more rows through them doesn't in itself
increase test coverage; it may just iterate the loops more times.
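To make the point about loops concrete, here is a minimal sketch in Python rather than in the server's C code. Everything in it is an assumption for illustration: process_rows is a hypothetical stand-in for per-row processing, and lines_executed is a hypothetical helper that records which lines run, standing in very loosely for what gcov reports. It is not the bloom test or the gcov procedure itself.

```python
import sys


def process_rows(rows):
    # Hypothetical per-row work with two code paths (even and odd values).
    total = 0
    for value in rows:
        if value % 2 == 0:
            total += value
        else:
            total -= value
    return total


def lines_executed(func, *args):
    # Hypothetical helper: collect the line offsets inside func that
    # actually run for the given arguments.
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Trace only frames that are executing func itself.
        if frame.f_code is code:
            if event == "line":
                seen.add(frame.f_lineno - code.co_firstlineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


one_of_each = lines_executed(process_rows, [1, 2])
many_rows = lines_executed(process_rows, range(100000))
# Compare which lines ran for 2 rows versus 100000 rows.
print(one_of_each == many_rows)
```

In this toy case the set of executed lines depends on which code paths the input reaches, not on how many rows take each path, so two well-chosen rows should exercise the same lines as 100000.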

The contrib/bloom regression test currently creates a 100000-row table.
I wondered how many rows were really required to achieve the same level
of code coverage.  I experimented with gcov, and what I find is that
as of HEAD that test provides these levels of coverage:
                   Line Coverage          Functions

contrib/bloom/:    75.4 %  374 / 496      93.1 %  27 / 29
generic_xlog.c:    68.5 %   98 / 143      77.8 %   7 / 9

(I looked specifically at generic_xlog.c because the main point of this
contrib module, so far as the core code is concerned, is to exercise
that file.)

I was depressed, though not entirely surprised, to find that you get
exactly that same line-count coverage if the table size is cut back
to ONE row.  Only when you remove the INSERT command entirely is there
any change in these gcov statistics.  Therefore, the last 99999 rows
are wasting my time, and yours, and that of every other developer who
will ever run this test suite in future.  I do not like having the
community's time wasted.

I'm all for improving our test coverage, but just shoving lots of rows
through a not-very-well-designed-in-the-first-place regression test
isn't a reliable way to do that.
        regards, tom lane


