Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Aug-07, Tom Lane wrote:
>> The problem in "timeouts" is that it has to use drearily long timeouts
>> to be sure that the behavior will be stable even on really slow machines
>> (think CLOBBER_CACHE_ALWAYS or valgrind --- it can take seconds for them
>> to reach a waiting state that other machines reach quickly). If we run
>> such tests in parallel with anything else, that risks re-introducing the
>> instability. I'm not very sure what we can do about that. But you might
>> be right that unless we can solve that, there's not going to be much to be
>> gained from parallelizing the rest.
> It runs 8 different permutations serially. If we run the same
> permutations in parallel, it would finish much quicker, and we wouldn't
> run it in parallel with anything that would take up CPU time, since
> they're all just sleeping.
Wrong ... they're *not* just sleeping, in the problem cases. They're
eating cycles due to CLOBBER_CACHE_ALWAYS or valgrind. They're on their
way to sleeping; but they have to get there before the timeout elapses,
or the test shows unexpected results.

Admittedly, as long as you've got more CPUs than tests, it should still
be OK. But if you don't, boom.
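
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is not taken from the test suite: the function names, the 3 CPU-seconds needed to reach the waiting state, and the 10-second timeout are all made-up illustrative values, and it assumes the scheduler shares the CPUs evenly among the running tests.

```python
# Toy model of CPU contention among parallel timeout tests.
# All names and numbers here are hypothetical, chosen only for illustration.

def time_to_wait_state(cpu_seconds, n_tests, n_cpus):
    """Wall-clock seconds for a test to reach its waiting state.

    Assumes ideal, even CPU sharing: with more runnable tests than CPUs,
    every test is slowed down by the factor n_tests / n_cpus.
    """
    slowdown = max(1.0, n_tests / n_cpus)
    return cpu_seconds * slowdown


def is_stable(cpu_seconds, timeout, n_tests, n_cpus):
    """True if the test starts waiting before the timeout fires."""
    return time_to_wait_state(cpu_seconds, n_tests, n_cpus) < timeout


if __name__ == "__main__":
    # 8 permutations, each burning 3 CPU-seconds (say, under valgrind)
    # before it blocks, against a 10-second timeout.
    for n_cpus in (8, 4, 2):
        ok = is_stable(cpu_seconds=3.0, timeout=10.0, n_tests=8, n_cpus=n_cpus)
        print(n_cpus, "CPUs:", "OK" if ok else "boom")
```

Under those assumptions the tests stay stable with 8 or 4 CPUs, but with 2 CPUs each test needs 12 seconds of wall-clock time to reach its wait, which is past the timeout.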
regards, tom lane