Thread: SSI 2PC coverage

SSI 2PC coverage

From

"Kevin Grittner"

Date:

05 July 2011, 14:06:58

[resending after gzip of test patch]

In reviewing the recent fix to 2PC coverage in SSI, I found some
cases which didn't seem to be covered.  Dan bit the bullet and came
up with an additional isolation test to rigorously cover all the
permutations, to find *all* 2PC statement orderings which weren't
working right.  Because it was so big, he pared out tests which were
redundant, in that they exercised the same code paths and pointed at
the same issues.  A patch to add this test is attached.  Run against
HEAD it shows the errors.  It's kinda big, but I think it's worth
having.

Attached is also a patch to fix those, so that all permutations
work.

I hope that this one be committed for beta3.

-Kevin

Attachment

Re: SSI 2PC coverage

From

Heikki Linnakangas

Date:

05 July 2011, 15:17:26

On 05.07.2011 20:06, Kevin Grittner wrote:
> [resending after gzip of test patch]
>
> In reviewing the recent fix to 2PC coverage in SSI, I found some
> cases which didn't seem to be covered.  Dan bit the bullet and came
> up with an additional isolation test to rigorously cover all the
> permutations, to find *all* 2PC statement orderings which weren't
> working right.  Because it was so big, he pared out tests which were
> redundant, in that they exercised the same code paths and pointed at
> the same issues.  A patch to add this test is attached.  Run against
> HEAD it shows the errors.  It's kinda big, but I think it's worth
> having.
>
> Attached is also a patch to fix those, so that all permutations
> work.

I think that needs some explanation, why only those SxactIsCommitted() 
tests need to be replaced with SxactIsPrepared()? What about the first 
SxactIsCommitted() test in OnConflict_CheckForSerializationFailure(), 
for instance?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: SSI 2PC coverage

From

"Kevin Grittner"

Date:

05 July 2011, 15:20:22

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> Attached is also a patch to fix those, so that all permutations
>> work.
> 
> I think that needs some explanation, why only those
> SxactIsCommitted() tests need to be replaced with
> SxactIsPrepared()? What about the first SxactIsCommitted() test in
> OnConflict_CheckForSerializationFailure(), for instance?
Well, that's covered in the other patch.  This one has the minimum
required to get all the permutations of 2PC working correctly.  It
was looking at just such questions as you pose here that led us to
the other patch.  Neither macro has quite the right semantics
without the lower level work in the "atomic commit" patch.
-Kevin

Re: SSI 2PC coverage

From

Dan Ports

Date:

05 July 2011, 17:31:39

On Tue, Jul 05, 2011 at 09:14:30PM +0300, Heikki Linnakangas wrote:
> I think that needs some explanation, why only those SxactIsCommitted() 
> tests need to be replaced with SxactIsPrepared()?

Here is the specific problem this patch addresses:

If there's a dangerous structure T0 ---> T1 ---> T2, and T2 commits
first, we need to abort something. If T2 commits before both conflicts
appear, then it should be caught by
OnConflict_CheckForSerializationFailure. If both conflicts appear
before T2 commits, it should be caught by
PreCommit_CheckForSerializationFailure. But that is actually run before
T2 *prepares*.

So the problem occurs if T2 is prepared but not committed when the
second conflict is detected. OnConflict_CFSF deems that OK, because T2
hasn't committed yet. PreCommit_CFSF doesn't see a problem either,
because the conflict didn't exist when it ran (before the transaction
prepared)

This patch fixes that by having OnConflict_CFSF declare a serialization
failure in this circumstance if T2 is already prepared, not just if
it's committed.

Although it fixes the situation described above, I wasn't initially
confident that there weren't other problematic cases. I wrote the
attached test case to convince myself of that. Because it tests all
possible sequences of conflict/prepare/commit that should lead to
serialization failure, the fact that they do all fail (with this patch)
indicates that these are the only checks that need to be changed.

> What about the first 
> SxactIsCommitted() test in OnConflict_CheckForSerializationFailure(), 
> for instance?

That one only comes up if the SERIALIZABLEXACT for one of the
transactions involved has been freed, and the RWConflict that points to
it has been replaced by a flag. That only happens if "writer" has
previously called ReleasePredicateLocks.

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/

Re: SSI 2PC coverage

From

Heikki Linnakangas

Date:

07 July 2011, 12:12:51

On 05.07.2011 20:06, Kevin Grittner wrote:
> [resending after gzip of test patch]
>
> In reviewing the recent fix to 2PC coverage in SSI, I found some
> cases which didn't seem to be covered.  Dan bit the bullet and came
> up with an additional isolation test to rigorously cover all the
> permutations, to find *all* 2PC statement orderings which weren't
> working right.  Because it was so big, he pared out tests which were
> redundant, in that they exercised the same code paths and pointed at
> the same issues.  A patch to add this test is attached.  Run against
> HEAD it shows the errors.  It's kinda big, but I think it's worth
> having.

I agree it'd be very nice to have this test, but 2.3 MB of expected 
output is a bit excessive. Let's try to cut that down into something 
more palatable.

There's two expected output files for this, one for max_prepared_xacts=0 
and another for the "normal" case. The max_prepared_xacts=0 case isn't 
very interesting, since all the PREPARE TRANSACTION commands fail. I 
think we should adjust the test harness to not run these tests at all if 
max_prepared_xacts=0. It would be better to skip the test and print out 
a notice pointing out that it was not run, it'll just give a false sense 
of security to run the test and report success, when it didn't test 
anything useful.

That alone cuts the size of the expected output to about 1 MB. That's 
much better, although it's still a lot of weight just for expected 
output. However, it compresses extremely well, to about 16 KB, so this 
isn't an issue for the size of distribution tarballs and such, only for 
git checkouts and on-disk size of extracted tarballs. I think that would 
be acceptable, although we could easily cut it a bit further if we want 
to. For example leaving out the word "step" from all the lines of 
executed test steps would cut it by about 80 KB.

> Attached is also a patch to fix those, so that all permutations
> work.

Thanks, committed the bug fix with some additional comments.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: SSI 2PC coverage

From

"Kevin Grittner"

Date:

07 July 2011, 12:44:17

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 05.07.2011 20:06, Kevin Grittner wrote:
>> In reviewing the recent fix to 2PC coverage in SSI, I found some
>> cases which didn't seem to be covered.  Dan bit the bullet and
>> came up with an additional isolation test to rigorously cover all
>> the permutations, to find *all* 2PC statement orderings which
>> weren't working right.  Because it was so big, he pared out tests
>> which were redundant, in that they exercised the same code paths
>> and pointed at the same issues.  A patch to add this test is
>> attached.  Run against HEAD it shows the errors.  It's kinda big,
>> but I think it's worth having.
> 
> I agree it'd be very nice to have this test, but 2.3 MB of
> expected output is a bit excessive. Let's try to cut that down
> into something more palatable.
OK
> There's two expected output files for this, one for
> max_prepared_xacts=0 and another for the "normal" case. The
> max_prepared_xacts=0 case isn't very interesting, since all the
> PREPARE TRANSACTION commands fail. I think we should adjust the
> test harness to not run these tests at all if
> max_prepared_xacts=0. It would be better to skip the test and
> print out a notice pointing out that it was not run, it'll just
> give a false sense of security to run the test and report success,
> when it didn't test anything useful.
> 
> That alone cuts the size of the expected output to about 1 MB.
OK.  I'll work on this tonight unless Dan jumps in to claim it.
> That's much better, although it's still a lot of weight just for
> expected output. However, it compresses extremely well, to about
> 16 KB, so this isn't an issue for the size of distribution
> tarballs and such, only for git checkouts and on-disk size of
> extracted tarballs. I think that would be acceptable, although we
> could easily cut it a bit further if we want to. For example
> leaving out the word "step" from all the lines of executed test
> steps would cut it by about 80 KB.
That seems simple enough.  Any other ideas?
>> Attached is also a patch to fix those, so that all permutations
>> work.
> 
> Thanks, committed the bug fix with some additional comments.
Thanks!
-Kevin

Re: SSI 2PC coverage

From

Heikki Linnakangas

Date:

18 August 2011, 10:57:54

On 07.07.2011 18:43, Kevin Grittner wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  wrote:
>> On 05.07.2011 20:06, Kevin Grittner wrote:
>
>> There's two expected output files for this, one for
>> max_prepared_xacts=0 and another for the "normal" case. The
>> max_prepared_xacts=0 case isn't very interesting, since all the
>> PREPARE TRANSACTION commands fail. I think we should adjust the
>> test harness to not run these tests at all if
>> max_prepared_xacts=0. It would be better to skip the test and
>> print out a notice pointing out that it was not run, it'll just
>> give a false sense of security to run the test and report success,
>> when it didn't test anything useful.
>>
>> That alone cuts the size of the expected output to about 1 MB.
>
> OK.  I'll work on this tonight unless Dan jumps in to claim it.

Committed. I removed the second expected output file, and marked the 
prepared-transactions tests in the schedule as "ignore" instead. That 
way if max_prepared_transactions=0, you get a notice that the test case 
failed, but pg_regress still returns 0 as exit status.

>> That's much better, although it's still a lot of weight just for
>> expected output. However, it compresses extremely well, to about
>> 16 KB, so this isn't an issue for the size of distribution
>> tarballs and such, only for git checkouts and on-disk size of
>> extracted tarballs. I think that would be acceptable, although we
>> could easily cut it a bit further if we want to. For example
>> leaving out the word "step" from all the lines of executed test
>> steps would cut it by about 80 KB.
>
> That seems simple enough.  Any other ideas?

I think it's good enough as it is. I did change the lexer slightly, to 
trim whitespace from the beginning and end of SQL blocks. This cuts the 
size of expected output a bit, and makes it look nicer anyway.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: SSI 2PC coverage

From

Tom Lane

Date:

18 August 2011, 11:07:59

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 07.07.2011 18:43, Kevin Grittner wrote:
>> OK.  I'll work on this tonight unless Dan jumps in to claim it.

> Committed. I removed the second expected output file, and marked the 
> prepared-transactions tests in the schedule as "ignore" instead. That 
> way if max_prepared_transactions=0, you get a notice that the test case 
> failed, but pg_regress still returns 0 as exit status.

Why did you only commit on 9.1 branch?
        regards, tom lane

Re: SSI 2PC coverage

From

Heikki Linnakangas

Date:

18 August 2011, 11:13:02

On 18.08.2011 17:07, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> On 07.07.2011 18:43, Kevin Grittner wrote:
>>> OK.  I'll work on this tonight unless Dan jumps in to claim it.
>
>> Committed. I removed the second expected output file, and marked the
>> prepared-transactions tests in the schedule as "ignore" instead. That
>> way if max_prepared_transactions=0, you get a notice that the test case
>> failed, but pg_regress still returns 0 as exit status.
>
> Why did you only commit on 9.1 branch?

I did "git push origin master REL9_1_STABLE", and didn't notice that it 
failed on master. Rebased and pushed, thanks!

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: SSI 2PC coverage

From

"Kevin Grittner"

Date:

23 August 2011, 15:13:45

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Committed. I removed the second expected output file, and marked
> the prepared-transactions tests in the schedule as "ignore"
> instead. That way if max_prepared_transactions=0, you get a notice
> that the test case failed, but pg_regress still returns 0 as exit
> status.

Thanks!  Sorry I didn't get to it.  Things got really busy here.

> I did change the lexer slightly, to trim whitespace from the
> beginning and end of SQL blocks. This cuts the size of expected
> output a bit, and makes it look nicer anyway.

OK.  You missed the alternative outputs for a couple files.  These
are needed to get successful tests with
default_transaction_isolation = 'serializable' or 'repeatable read'.
Patch attached.

-Kevin

Attachment

isolation-test-ws-1.patch

Re: SSI 2PC coverage

From

Alvaro Herrera

Date:

24 August 2011, 18:36:14

Excerpts from Kevin Grittner's message of mar ago 23 15:07:33 -0300 2011:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> > I did change the lexer slightly, to trim whitespace from the
> > beginning and end of SQL blocks. This cuts the size of expected
> > output a bit, and makes it look nicer anyway.
>  
> OK.  You missed the alternative outputs for a couple files.  These
> are needed to get successful tests with
> default_transaction_isolation = 'serializable' or 'repeatable read'.
> Patch attached.

Thanks, applied.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: SSI 2PC coverage

From

Alvaro Herrera

Date:

24 August 2011, 18:48:51

Excerpts from Heikki Linnakangas's message of jue ago 18 09:57:34 -0400 2011:

> Committed. I removed the second expected output file, and marked the 
> prepared-transactions tests in the schedule as "ignore" instead. That 
> way if max_prepared_transactions=0, you get a notice that the test case 
> failed, but pg_regress still returns 0 as exit status.

After having to play with this, I didn't like it very much, because
regression.diffs gets spammed with the (rather massive and completely
useless) diff in that test.  For the xml tests, rather than ignoring it
fail on an installation without libxml, we use an alternative output.

Unless there are objections, I will commit the alternative file proposed
by Dan.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: SSI 2PC coverage

From

Tom Lane

Date:

24 August 2011, 22:12:29

Alvaro Herrera <alvherre@commandprompt.com> writes:
> After having to play with this, I didn't like it very much, because
> regression.diffs gets spammed with the (rather massive and completely
> useless) diff in that test.  For the xml tests, rather than ignoring it
> fail on an installation without libxml, we use an alternative output.

> Unless there are objections, I will commit the alternative file proposed
> by Dan.

+1 ... "ignore" is a pretty ugly hack here.

Eventually we need some way of detecting that specific tests should be
skipped because they're irrelevant to the current system configuration.
contrib/sepgsql is already doing something of the sort, but it's rather
crude ...
        regards, tom lane

Re: SSI 2PC coverage

From

Robert Haas

Date:

24 August 2011, 22:30:48

On Wed, Aug 24, 2011 at 9:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> After having to play with this, I didn't like it very much, because
>> regression.diffs gets spammed with the (rather massive and completely
>> useless) diff in that test.  For the xml tests, rather than ignoring it
>> fail on an installation without libxml, we use an alternative output.
>
>> Unless there are objections, I will commit the alternative file proposed
>> by Dan.
>
> +1 ... "ignore" is a pretty ugly hack here.
>
> Eventually we need some way of detecting that specific tests should be
> skipped because they're irrelevant to the current system configuration.
> contrib/sepgsql is already doing something of the sort, but it's rather
> crude ...

I'm fairly unhappy with the fact that we don't have a better way of
deciding whether we should even *build* sepgsql.  The --with-selinux
flag basically doesn't do anything except enable the compile of that
contrib module, and that's kind of a lame use of a configure flag.

Not that I have a better idea...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SSI 2PC coverage

From

Alvaro Herrera

Date:

25 August 2011, 17:50:04

Excerpts from Tom Lane's message of mié ago 24 22:11:58 -0300 2011:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > After having to play with this, I didn't like it very much, because
> > regression.diffs gets spammed with the (rather massive and completely
> > useless) diff in that test.  For the xml tests, rather than ignoring it
> > fail on an installation without libxml, we use an alternative output.
> 
> > Unless there are objections, I will commit the alternative file proposed
> > by Dan.
> 
> +1 ... "ignore" is a pretty ugly hack here.

Done.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support