Re: Timeout control within tests - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Timeout control within tests
Date
Msg-id 20220218071911.GB3506226@rfd.leadboat.com
In response to Re: Timeout control within tests  (Andres Freund <andres@anarazel.de>)
Responses Re: Timeout control within tests  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Feb 17, 2022 at 09:48:25PM -0800, Andres Freund wrote:
> On 2022-02-17 21:28:42 -0800, Noah Misch wrote:
> > I propose to have environment variable PG_TEST_TIMEOUT_DEFAULT control the
> > timeout used in the places that currently hard-code 180s.
> 
> Meson's test runner has the concept of a "timeout multiplier" for ways of
> running tests. Meson's stuff is about entire tests (i.e. one tap test), so
> doesn't apply here, but I wonder if we shouldn't do something similar?

Hmmm.  It is good if the user can express an intent that continues to make
sense if we change the default timeout.  For the buildfarm use case, a
multiplier is moderately better on that axis (PG_TEST_TIMEOUT_MULTIPLIER=100
beats PG_TEST_TIMEOUT_DEFAULT=18000).  For the hacker use case, an absolute
value is substantially better on that axis (PG_TEST_TIMEOUT_DEFAULT=3 beats
PG_TEST_TIMEOUT_MULTIPLIER=.016666).
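
To make the arithmetic concrete, here is a minimal sketch of how a test harness might resolve the two styles of setting against the current 180s default. The function name effective_timeout and the precedence rule (absolute value wins over multiplier) are assumptions for illustration only; PG_TEST_TIMEOUT_MULTIPLIER is a name used in this message for comparison, not an existing variable.

```python
BASE_TIMEOUT_S = 180  # the value the thread says is currently hard-coded


def effective_timeout(env, base=BASE_TIMEOUT_S):
    """Return the timeout in seconds implied by env (a dict-like mapping).

    Assumed precedence: an absolute value overrides a multiplier.
    """
    absolute = env.get("PG_TEST_TIMEOUT_DEFAULT")
    if absolute:
        return float(absolute)
    multiplier = env.get("PG_TEST_TIMEOUT_MULTIPLIER")  # hypothetical variable
    if multiplier:
        return base * float(multiplier)
    return float(base)


# Buildfarm intent: "be very patient" survives a change of the default.
buildfarm = effective_timeout({"PG_TEST_TIMEOUT_MULTIPLIER": "100"})
# Hacker intent: "fail after three seconds" is easiest as an absolute value.
hacker = effective_timeout({"PG_TEST_TIMEOUT_DEFAULT": "3"})
```

With base 180, the multiplier 100 and the absolute value 18000 give the same result, while getting 3 seconds from a multiplier requires the awkward .016666.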

> That
> way we could adjust different timeouts with one setting, instead of many
> different fobs to adjust?

I expect multiplier vs. absolute value doesn't change the expected number of
settings.  If this change proceeds, we'd have three: PG_TEST_TIMEOUT_DEFAULT,
PGCTLTIMEOUT, and PGISOLATIONTIMEOUT.  PGCTLTIMEOUT is separate for conceptual
reasons, and PGISOLATIONTIMEOUT is separate for historical reasons.  There's
little use case for setting them to unequal values.  If Meson can pass down
the overall timeout in effect for the test file, we could compute all three
variables from the passed-down value.  Orthogonal to Meson, as I mentioned, we
could eliminate PGISOLATIONTIMEOUT.
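
As a rough sketch of "compute all three variables from the passed-down value": the helper below takes one overall timeout in seconds and returns the three environment settings, all equal. How Meson would actually pass the value down is not specified here, so the input parameter and the function name derive_timeout_env are hypothetical, and setting each variable to the full value is an assumption.

```python
TIMEOUT_VARS = ("PG_TEST_TIMEOUT_DEFAULT", "PGCTLTIMEOUT", "PGISOLATIONTIMEOUT")


def derive_timeout_env(overall_timeout_s):
    """Map one overall timeout (seconds) to the three per-tool variables.

    Assumption: all three get the same whole-second value, since there is
    little use case for setting them to unequal values.
    """
    value = str(int(overall_timeout_s))
    return {name: value for name in TIMEOUT_VARS}


# Example: merge into a copy of the environment before launching a test.
child_env = {"PATH": "/usr/bin"}
child_env.update(derive_timeout_env(1800))
```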

timeouts.spec used to have substantial timeouts that had to elapse for the
test to pass.  (Commit 741d7f1 ended that era.)  A multiplier would have been
a good fit for that use case.  If a similar test came back, we'd likely want
two multipliers, a low one for elapsing timeouts and a high one for
non-elapsing timeouts.  A multiplier of 10-100 is reasonable for non-elapsing
timeouts, with the exact value being irrelevant on the buildfarm.  Setting an
elapsing timeout higher than necessary causes measurable waste.
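
A small sketch of the two-multiplier idea, with made-up numbers: an elapsing timeout is one the test must wait out in order to pass, so scaling it costs wall-clock time on every run, whereas a non-elapsing timeout only matters when something hangs. The function name and the sample base values are hypothetical.

```python
def scaled_timeouts(elapsing_s, non_elapsing_s, low_mult, high_mult):
    """Scale the two kinds of timeout with separate multipliers.

    Returns (elapsing, non_elapsing) in seconds.
    """
    return elapsing_s * low_mult, non_elapsing_s * high_mult


# Hypothetical test: a 2s timeout that must elapse, plus a 180s safety limit.
separate = scaled_timeouts(2, 180, low_mult=2, high_mult=100)
single = scaled_timeouts(2, 180, low_mult=100, high_mult=100)

# Extra seconds the passing test waits when one high multiplier covers both.
wasted_s = single[0] - separate[0]
```

With a single multiplier of 100, the passing test waits 200s instead of 4s, which is the measurable waste described above.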

One could argue for offering both a multiplier variable and an absolute-value
variable.  If there's just one variable, I think the absolute-value variable
is more compelling, due to the aforementioned hacker use case.  What do you
think?


