Re: why can the isolation tester handle only one waiting process? - Mailing list pgsql-hackers

From: Alvaro Herrera
Subject: Re: why can the isolation tester handle only one waiting process?
Msg-id: 20150815051716.GT5232@alvherre.pgsql
In response to: Re: why can the isolation tester handle only one waiting process?  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: why can the isolation tester handle only one waiting process?  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Robert Haas wrote:
> On Fri, Aug 14, 2015 at 2:57 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Hmm, clearly you couldn't attach the info to the step itself, because a
> > step that blocks in one permutation doesn't necessarily block in every
> > permutation.  You could attach it to each step that needed it in the
> > permutation, but then it wouldn't work to leave permutation
> > specification out for such a test.  Maybe that's an acceptable
> > restriction if you cause the test to fail with a specific error instead
> > of stalling forever (which is what happens currently IIRC).
> 
> After some study, I think the best thing to do here is change the way
> we handle the case where we reach a step that requires the use of a
> connection that is currently blocked on a lock.  Right now, we handle
> that by declaring the permutation invalid; I'd like to change that so
> that we consider that a cue to wait for that connection to unblock itself.
> This will require a number of tests that currently blindly run through
> all permutations to specify a list of permutations, or they will hang.

Well, hanging forever may not be all that great.  Buildfarm animals with
test processes stuck probably won't be happy.  Maybe put a cap on the
time we're willing to wait; something like a minute should suffice for
all reasonable tests.  At the same time I wonder if iterating as quickly
as possible is really such a hot idea; why don't we sleep even 100ms if
nothing is to be done immediately?  That would reduce log traffic if you
have log_statement=all, for one thing ...
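
To make the cap-plus-sleep idea concrete, here is a minimal sketch in
Python.  It is not isolationtester's actual C code; the function name
wait_for, its parameters, and the is_done callback are all hypothetical,
and the 60-second cap and 100ms sleep are just the figures floated above.

```python
import time


def wait_for(is_done, timeout_s=60.0, poll_interval_s=0.1,
             clock=time.monotonic, sleep=time.sleep):
    """Poll is_done() until it returns true or timeout_s elapses.

    Returns True if the condition was met, False if the cap was hit.
    All names here are hypothetical; this only illustrates the idea of
    bounding the wait and sleeping between checks instead of spinning.
    """
    deadline = clock() + timeout_s
    while True:
        if is_done():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Give up with a clear failure instead of hanging forever.
            return False
        # Sleep between polls, but never past the deadline.
        sleep(min(poll_interval_s, remaining))
```

A caller would report a test failure when wait_for returns False, so a
stuck permutation fails after the cap rather than stalling the run.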

I guess (from a patch author perspective) we can just use
isolationtester -n to produce appropriate permutation lines when
developing a spec file, and then prune the ones causing trouble.
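
As a rough illustration of what such a generated list contains, the
sketch below enumerates every interleaving of the steps that keeps each
session's own steps in order, and prints one permutation line per
interleaving.  This is a stand-in written in Python, not the tool's code:
the function names and the step names s1a, s1b, s2a are made up, and I am
assuming (not quoting) both the enumeration rule and the line format.

```python
def interleavings(sessions):
    """Yield every ordering of all steps that preserves per-session order.

    sessions is a list of lists of step names, one inner list per session.
    """
    if all(not steps for steps in sessions):
        yield []
        return
    for i, steps in enumerate(sessions):
        if steps:
            # Take the next step of session i, then interleave the rest.
            rest = sessions[:i] + [steps[1:]] + sessions[i + 1:]
            for tail in interleavings(rest):
                yield [steps[0]] + tail


def permutation_lines(sessions):
    """Format each interleaving as a spec-file-style permutation line."""
    return ["permutation " + " ".join('"%s"' % step for step in order)
            for order in interleavings(sessions)]


if __name__ == "__main__":
    # Hypothetical two-session spec: session 1 has two steps, session 2 one.
    for line in permutation_lines([["s1a", "s1b"], ["s2a"]]):
        print(line)
```

From a list like that, one would delete the lines that hang and keep the
rest in the spec file.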

FWIW I tried this with the spec I posted at 
http://www.postgresql.org/message-id/20141212205254.GC1768@alvh.no-ip.org
and it seems to work fine (modulo a bug in the spec itself).  I didn't
try reverting the patch that fixed the bug.

> But I'm not sure that's such a bad thing, because running through all
> permutations in those cases provides no additional test coverage.
> Each invalid permutation runs the sequence of steps only up until the
> point where it chooses an invalid next step.  Therefore, each invalid
> permutation is testing an initial prefix of the steps tested by some
> valid permutation.  If the "invalid" permutation ceased to be invalid,
> because the command at which we give up returned immediately rather
> than waiting, that would also change the test output of the other,
> valid test of which it is the initial prefix.  And therefore, at least
> as it seems to me, testing the invalid permutations is just a waste of
> CPU time, and we'd be better off not doing it.

Well, the number of tests that actually exercise this is not large.
More time is spent in the timeout test, ISTM (the CPU is sleeping during
that, but it's still wasted wall-clock time).

> Actually, I'm really rather wondering if the list of valid
> permutations should also be pruned for some of these tests.  Some of
> these output files are thousands of lines long, and I'm not sure that
> somebody has really gone through that whole file and made sure that
> the output of each permutation is expected.  And I'm sure some of them
> are functionally identical.

No objections there, but alter-table-1 and alter-table-2 seem to be the
only tests that have thousands of lines of expected output and also
have invalid permutations in the expected output.  The only others with
1k+ lines are two-ids, receipt-reports and prepared-transactions, which
don't have invalid permutations.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


