Thread: numeric_big in make check?

numeric_big in make check?

From
Daniel Gustafsson
Date:
numeric_big has been left out of parallel_schedule, requiring EXTRA_TESTS to
run it, since going in back in 1999 (AFAICT it was even the reason EXTRA_TESTS
was invented).  The original commit states that it's huge, and it probably was.
Today it runs faster than many tests we have in parallel_schedule, even on slow
hardware like my ~5 year old laptop.  Repeated runs in CI at various parallel
groups place it at 50% the runtime of triggers.sql and 25% of brin.sql.

To make sure it's executed and not silently breaks, is it time to add this to
the regular make check?

--
Daniel Gustafsson




Re: numeric_big in make check?

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> numeric_big has been left out of parallel_schedule, requiring EXTRA_TESTS to
> run it, since going in back in 1999 (AFAICT it was even the reason EXTRA_TESTS
> was invented).  The original commit states that it's huge, and it probably was.
> Today it runs faster than many tests we have in parallel_schedule, even on slow
> hardware like my ~5 year old laptop.  Repeated runs in CI at various parallel
> groups place it at 50% the runtime of triggers.sql and 25% of brin.sql.

> To make sure it's executed and not silently breaks, is it time to add this to
> the regular make check?

Or we could just flush it.  It's never detected a bug, and I think
you'd find that it adds zero code coverage (or if not, we could
fix that in a far more surgical and less expensive manner).

            regards, tom lane



Re: numeric_big in make check?

From
Daniel Gustafsson
Date:
> On 19 Feb 2024, at 12:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:

>> To make sure it's executed and not silently breaks, is it time to add this to
>> the regular make check?
>
> Or we could just flush it.  It's never detected a bug, and I think
> you'd find that it adds zero code coverage (or if not, we could
> fix that in a far more surgical and less expensive manner).

I don't have a problem with that, there isn't much value in keeping it
(especially when not connected to make check so that it actually runs).  That
also means we can remove two make targets which hadn't been ported to meson to
get us a hair closer to parity.

--
Daniel Gustafsson


Attachment

Re: numeric_big in make check?

From
Dean Rasheed
Date:
> > On 19 Feb 2024, at 12:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Or we could just flush it.  It's never detected a bug, and I think
> > you'd find that it adds zero code coverage (or if not, we could
> > fix that in a far more surgical and less expensive manner).
>

Off the top of my head, I can't say to what extent that's true, but it
wouldn't surprise me if at least some of the tests added in the last 4
commits to touch that file aren't covered by tests elsewhere. Indeed
that certainly looks like the case for 18a02ad2a5. I'm sure those
tests could be pared down though.

Regards,
Dean



Re: numeric_big in make check?

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 19 Feb 2024, at 12:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Or we could just flush it.  It's never detected a bug, and I think
>> you'd find that it adds zero code coverage (or if not, we could
>> fix that in a far more surgical and less expensive manner).

> Off the top of my head, I can't say to what extent that's true, but it
> wouldn't surprise me if at least some of the tests added in the last 4
> commits to touch that file aren't covered by tests elsewhere. Indeed
> that certainly looks like the case for 18a02ad2a5. I'm sure those
> tests could be pared down though.

I thought I'd try to acquire some actual facts here, so I compared
the code coverage shown by "make check" as of HEAD, versus "make
check" after adding numeric_big to parallel_schedule.  I saw the
following lines of numeric.c as being covered in the second run
and not the first:

numeric():
1285             || !NUMERIC_IS_SHORT(num)))
1293             new->choice.n_long.n_sign_dscale = NUMERIC_SIGN(new) |
1294                 ((uint16) dscale & NUMERIC_DSCALE_MASK);
div_var_fast():
9185             idivisor = idivisor * NBASE + var2->digits[1];
9186             idivisor_weight--;
sqrt_var():
10073         res_ndigits = res_weight + 1 - (-rscale - 1) / DEC_DIGITS;

Pretty poor return on investment for the runtime consumed.  I don't
object to adding something to numeric.sql that's targeted at adding
coverage for these (or anyplace else that's not covered), but let's
not just throw cycles at the problem.

Oddly, there were a few lines in numeric_poly_combine and
int8_avg_combine that were hit in the first run and not the second.
Apparently our tests of parallel aggregation aren't as reproducible as
one could wish.

            regards, tom lane



Re: numeric_big in make check?

From
Dean Rasheed
Date:
On Mon, 19 Feb 2024 at 15:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I thought I'd try to acquire some actual facts here, so I compared
> the code coverage shown by "make check" as of HEAD, versus "make
> check" after adding numeric_big to parallel_schedule.  I saw the
> following lines of numeric.c as being covered in the second run
> and not the first:
>

I don't think that tells the whole story. Code coverage only tells you
whether a particular line of code has been hit, not whether it has
been properly tested with all the values that might lead to different
cases. For example:

> sqrt_var():
> 10073         res_ndigits = res_weight + 1 - (-rscale - 1) / DEC_DIGITS;
>

To get code coverage of this line, all you need to do is ensure that
sqrt_var() is called with rscale < -1 (which can only happen from the
range-reduction code in ln_var()). You could do that by computing
ln(1e50), which results in calling sqrt_var() with rscale = -2,
causing that line of code in sqrt_var() to be hit. That would satisfy
code coverage, but I would argue that you've only really tested that
line of code properly if you also hit it with rscale = -3, and
probably a few more values, to check that the round-down division is
working as intended.

Similarly, looking at commit 18a02ad2a5, the crucial code change was
the following in power_var():

-    val = Max(val, -NUMERIC_MAX_RESULT_SCALE);
-    val = Min(val, NUMERIC_MAX_RESULT_SCALE);
    val *= 0.434294481903252;    /* approximate decimal result weight */

Any test that calls numeric_power() is going to hit those lines of
code, but to see a failure, it was necessary to hit them with the
absolute value of "val" greater than NUMERIC_MAX_RESULT_SCALE, which
is why that commit added 2 new test cases to numeric_big, calling
power_var() with "val" outside that range. Code coverage is never
going to tell you whether or not that is being tested, since the code
change was to delete lines. Even if that weren't the case, any line of
code involving Min() or Max() has 2 branches, and code coverage won't
tell you if you've hit both of them.

> Pretty poor return on investment for the runtime consumed.  I don't
> object to adding something to numeric.sql that's targeted at adding
> coverage for these (or anyplace else that's not covered), but let's
> not just throw cycles at the problem.
>

I agree that blindly performing a bunch of large computations (just
throwing cycles at the problem) is not a good approach to testing. And
maybe that's a fair characterisation of parts of numeric_big. However,
it also contains some fairly well-targeted tests intended to test
specific edge cases that only occur with particular ranges of inputs,
which don't necessarily show up as increased code coverage.

So I think this requires more careful analysis. Simply deleting
numeric_big and adding tests to numeric.sql until the same level of
code coverage is achieved will not give the same level of testing.

It's also worth noting that the cost of running numeric_big has come
down very significantly over the last few years, as can be seen by
running the current numeric_big script against old backends. There's a
lot of random variation in the timings, but the trend is very clear:

9.5    1.641s
9.6    0.856s
10    0.845s
11    0.750s
12    0.760s
13    0.672s
14    0.430s
15    0.347s
16    0.336s

Arguably, this is a useful benchmark to spot performance regressions
and test proposed performance-improving patches.

If I run "EXTRA_TESTS=numeric_big make check | grep 'ok ' | sort
-nrk5", numeric_big is not in the top 10 most expensive tests (it's
usually down at around 15'th place).

Looking at the script itself, the addition, subtraction,
multiplication and division tests at the top are probably pointless,
since I would expect those operations to be tested adequately (and
probably more thoroughly) by the transcendental test cases. In fact, I
think it would probably be OK to delete everything above line 650, and
just keep the bottom half of the script -- the pow(), exp(), ln() and
log() tests, which cover various edge cases, as well as exercising
basic arithmetic operations internally. We might want to check that
I/O of large numerics is still being tested properly though.

If we did that, numeric_big would be even further down the list of
expensive tests, and I'd say it should be run by default.

Regards,
Dean



Re: numeric_big in make check?

From
Daniel Gustafsson
Date:
> On 20 Feb 2024, at 14:23, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

> If we did that, numeric_big would be even further down the list of
> expensive tests, and I'd say it should be run by default.

My motivation for raising this was to get a test which is executed as part of
parallel_schedule to make failures aren't missed.  If we get there by slimming
down numeric_big to keep the unique coverage then that sounds like a good plan
to me.

--
Daniel Gustafsson




Re: numeric_big in make check?

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Looking at the script itself, the addition, subtraction,
> multiplication and division tests at the top are probably pointless,
> since I would expect those operations to be tested adequately (and
> probably more thoroughly) by the transcendental test cases. In fact, I
> think it would probably be OK to delete everything above line 650, and
> just keep the bottom half of the script -- the pow(), exp(), ln() and
> log() tests, which cover various edge cases, as well as exercising
> basic arithmetic operations internally.

I could go with that, but let's just transpose those into numeric.

            regards, tom lane



Re: numeric_big in make check?

From
Dean Rasheed
Date:
On Tue, 20 Feb 2024 at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> > Looking at the script itself, the addition, subtraction,
> > multiplication and division tests at the top are probably pointless,
> > since I would expect those operations to be tested adequately (and
> > probably more thoroughly) by the transcendental test cases. In fact, I
> > think it would probably be OK to delete everything above line 650, and
> > just keep the bottom half of the script -- the pow(), exp(), ln() and
> > log() tests, which cover various edge cases, as well as exercising
> > basic arithmetic operations internally.
>
> I could go with that, but let's just transpose those into numeric.
>

Works for me.

Regards,
Dean