Hello Tom,
Thanks for having a look at this attempt.
>> Attached is an attempt at improving the situation:
>
> I took a quick look at this and can't really convince myself that it
> improves our situation. The fact that it prints nothing if the runtime
> is outside of a tightly constrained range seems likely to hide the sort
> of bug you might hope such a test would detect. A simple example is
> that, if there's an off-by-one bug causing the test to run for 3 seconds
> not 2, this test script won't complain about it. Worse, if pgbench dumps
> core instantly on startup when given -p, this test script won't complain
> about it. And the test is of no value at all on very slow buildfarm
> critters such as valgrind or clobber-cache-always animals.
Hmmm.
Note that an instant core does not fall under "too slow" and is caught
anyway because pgbench return status is checked, and I expect it would not
be zero when core dumped.
Alas, testing time related features with zero assumption about time is
quite difficult:-)
The compromise I'm proposing is to skip time-related stuff if too slow.
The value I see is that it provides coverage for all time-related
features. I can also add a check that the run is *at least* 2 seconds when
two seconds are asked for, because otherwise something was certainly
amiss.
However it cannot do the impossible, i.e. check that something happens
every second if we cannot assume that some cycles are spared for the
process within a second. There is no miracle to expect.
>> (2) the test now only expects "progress: \d" from the output, so it is enough
>> that one progress is shown, whenever it is shown.
>
> Hm. Could we get somewhere by making the test look for that, and
> adjusting the loop logic inside pgbench so that (maybe only with the
> tested switch values) it's guaranteed to print at least one progress
> output regardless of timing, because it won't check for exit until after
> it's printed a log message?
I'll look into ensuring structuraly that at least one progress is shown.
>> For the random-ness related test (--sample-rate), we could add a feature to
>> pgbench which allows to control the random seed, so that the number of samples
>> could be known in advance for testing purposes.
>
> Don't see how locking down the seed gets us anywhere. The behavior of
> random() can't be assumed identical across different machines, even
> with the same seed.
Indeed, I surmised in between that expecting some portability from
random() is naïve. Controlling the seed could still be useful for testing
though, so I have submitted a patch for that.
--
Fabien.