Re: Add an optional timeout clause to isolationtester step. - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Add an optional timeout clause to isolationtester step.
Date
Msg-id 20200307061615.c47sju2bytuiaazv@nol
Whole thread Raw
In response to Re: Add an optional timeout clause to isolationtester step.  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Add an optional timeout clause to isolationtester step.
List pgsql-hackers
On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote:
> On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:
> > Here's a patch to add an optional "timeout val" clause to isolationtester's
> > step definition.  When used, isolationtester will actively wait on the query
> > rather than continuing with the permutation next step, and will issue a cancel
> > once the defined timeout is reached.  I also added as a POC the previous
> > regression tests for invalid TOAST indexes, updated to use this new
> > infrastructure (which won't pass as long as the original bug for invalid TOAST
> > indexes isn't fixed).
>
> One problem with this approach is that it does address the stability
> of the test on very slow machines, and there are some of them in the
> buildfarm.  Taking your patch, I can make the test fail by applying
> the following sleep because the query would be cancelled before some
> of the indexes are marked as invalid:
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int
> options)
>     CommitTransactionCommand();
>         StartTransactionCommand();
>
> +   pg_usleep(100000L * 10); /* 10s */
> +
>     /*
>      * Phase 2 of REINDEX CONCURRENTLY
>
> Another problem is that on faster machines the test is slow because of
> the timeout used.  What are your thoughts about having instead a
> cancel meta-command instead?

Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose
to fix this problem by having a timeout long enough to statisfy the slower
buildfarm members, even when running on fast machines, so I assumed that the
same approach could be used here.

I agree that the 1s timeout I used is maybe too low, but that's easy enough to
change.  Another point is that it's possible to have a close behavior without
this patch by using a statement_timeout (the active wait does change things
though), but the spec files would be more verbose.



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Fastpath while arranging the changes in LSN order in logical decoding
Next
From: Peter Eisentraut
Date:
Subject: Remove utils/acl.h from catalog/objectaddress.h