On Mon, Mar 09, 2020 at 04:47:27PM +0900, Michael Paquier wrote:
> On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:
> > The arbitrarily-set timeouts that exist in some of the isolation tests
> > are horrid kluges that have caused us lots of headaches in the past
> > and no doubt will again in the future. Aside from occasionally failing
> > when a machine is particularly overloaded, they cause the tests to
> > take far longer than necessary on decently-fast machines. So ideally
> > we'd get rid of those entirely in favor of some more-dynamic approach.
> > Admittedly, I have no proposal for what that would be. But adding yet
> > more ways to set a (guaranteed-to-be-wrong) timeout seems like the
> > wrong direction to be going in. What's the actual need that you're
> > trying to deal with?
>
> As a matter of fact, the buildfarm member petalura just reported a
> failure with the isolation test "timeouts", the machine being
> extremely slow:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05
>
> test timeouts ... FAILED 60330 ms
> [...]
> -step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...>
> -step update: <... completed>
> +step update: DELETE FROM accounts WHERE accountid = 'checking';
> ERROR: canceling statement due to statement timeout

Indeed. I guess we could add some kind of environment variable facility to
isolationtester so that owners of slow machines could set a much larger timeout
without making the test slower for everyone else, but that seems like overkill
for just one test. And given the other thread about deploying the REL_11
buildfarm client, it wouldn't be an immediate fix either.
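
For what it's worth, the environment variable idea could look roughly like the
sketch below. This is only an illustration under stated assumptions, not
existing isolationtester code: the function names and the
ISOLATION_TIMEOUT_SCALE variable are made up, and how the scaled value would
reach the timeouts set in a spec file is left open.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of isolationtester: scale a base timeout
 * (in milliseconds) by an integer factor given as a string, such as the
 * value of an environment variable.  A missing, malformed, or non-positive
 * factor leaves the timeout unchanged, so fast machines see no difference.
 */
long
scale_timeout_ms(long base_ms, const char *factor_str)
{
	char	   *end;
	long		factor;

	assert(base_ms >= 0);

	if (factor_str == NULL || *factor_str == '\0')
		return base_ms;

	errno = 0;
	factor = strtol(factor_str, &end, 10);
	if (errno != 0 || *end != '\0' || factor < 1)
		return base_ms;

	/* Clamp instead of overflowing on absurdly large factors. */
	if (base_ms > LONG_MAX / factor)
		return LONG_MAX;

	return base_ms * factor;
}

/* ISOLATION_TIMEOUT_SCALE is an assumed variable name, for illustration. */
long
effective_timeout_ms(long base_ms)
{
	return scale_timeout_ms(base_ms, getenv("ISOLATION_TIMEOUT_SCALE"));
}
```

With something like this, the owner of a slow machine could export
ISOLATION_TIMEOUT_SCALE=4 to turn a 5000 ms timeout into 20000 ms, while
everyone else keeps the default.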