Thread: SSI 2PC coverage
[resending after gzip of test patch] In reviewing the recent fix to 2PC coverage in SSI, I found some cases which didn't seem to be covered. Dan bit the bullet and came up with an additional isolation test to rigorously cover all the permutations, to find *all* 2PC statement orderings which weren't working right. Because it was so big, he pared out tests which were redundant, in that they exercised the same code paths and pointed at the same issues. A patch to add this test is attached. Run against HEAD it shows the errors. It's kinda big, but I think it's worth having. Attached is also a patch to fix those, so that all permutations work. I hope that this one be committed for beta3. -Kevin
Attachment
On 05.07.2011 20:06, Kevin Grittner wrote: > [resending after gzip of test patch] > > In reviewing the recent fix to 2PC coverage in SSI, I found some > cases which didn't seem to be covered. Dan bit the bullet and came > up with an additional isolation test to rigorously cover all the > permutations, to find *all* 2PC statement orderings which weren't > working right. Because it was so big, he pared out tests which were > redundant, in that they exercised the same code paths and pointed at > the same issues. A patch to add this test is attached. Run against > HEAD it shows the errors. It's kinda big, but I think it's worth > having. > > Attached is also a patch to fix those, so that all permutations > work. I think that needs some explanation, why only those SxactIsCommitted() tests need to be replaced with SxactIsPrepared()? What about the first SxactIsCommitted() test in OnConflict_CheckForSerializationFailure(), for instance? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: >> Attached is also a patch to fix those, so that all permutations >> work. > > I think that needs some explanation, why only those > SxactIsCommitted() tests need to be replaced with > SxactIsPrepared()? What about the first SxactIsCommitted() test in > OnConflict_CheckForSerializationFailure(), for instance? Well, that's covered in the other patch. This one has the minimum required to get all the permutations of 2PC working correctly. It was looking at just such questions as you pose here that led us to the other patch. Neither macro has quite the right semantics without the lower level work in the "atomic commit" patch. -Kevin
On Tue, Jul 05, 2011 at 09:14:30PM +0300, Heikki Linnakangas wrote: > I think that needs some explanation, why only those SxactIsCommitted() > tests need to be replaced with SxactIsPrepared()? Here is the specific problem this patch addresses: If there's a dangerous structure T0 ---> T1 ---> T2, and T2 commits first, we need to abort something. If T2 commits before both conflicts appear, then it should be caught by OnConflict_CheckForSerializationFailure. If both conflicts appear before T2 commits, it should be caught by PreCommit_CheckForSerializationFailure. But that is actually run before T2 *prepares*. So the problem occurs if T2 is prepared but not committed when the second conflict is detected. OnConflict_CFSF deems that OK, because T2 hasn't committed yet. PreCommit_CFSF doesn't see a problem either, because the conflict didn't exist when it ran (before the transaction prepared) This patch fixes that by having OnConflict_CFSF declare a serialization failure in this circumstance if T2 is already prepared, not just if it's committed. Although it fixes the situation described above, I wasn't initially confident that there weren't other problematic cases. I wrote the attached test case to convince myself of that. Because it tests all possible sequences of conflict/prepare/commit that should lead to serialization failure, the fact that they do all fail (with this patch) indicates that these are the only checks that need to be changed. > What about the first > SxactIsCommitted() test in OnConflict_CheckForSerializationFailure(), > for instance? That one only comes up if the SERIALIZABLEXACT for one of the transactions involved has been freed, and the RWConflict that points to it has been replaced by a flag. That only happens if "writer" has previously called ReleasePredicateLocks. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On 05.07.2011 20:06, Kevin Grittner wrote: > [resending after gzip of test patch] > > In reviewing the recent fix to 2PC coverage in SSI, I found some > cases which didn't seem to be covered. Dan bit the bullet and came > up with an additional isolation test to rigorously cover all the > permutations, to find *all* 2PC statement orderings which weren't > working right. Because it was so big, he pared out tests which were > redundant, in that they exercised the same code paths and pointed at > the same issues. A patch to add this test is attached. Run against > HEAD it shows the errors. It's kinda big, but I think it's worth > having. I agree it'd be very nice to have this test, but 2.3 MB of expected output is a bit excessive. Let's try to cut that down into something more palatable. There's two expected output files for this, one for max_prepared_xacts=0 and another for the "normal" case. The max_prepared_xacts=0 case isn't very interesting, since all the PREPARE TRANSACTION commands fail. I think we should adjust the test harness to not run these tests at all if max_prepared_xacts=0. It would be better to skip the test and print out a notice pointing out that it was not run, it'll just give a false sense of security to run the test and report success, when it didn't test anything useful. That alone cuts the size of the expected output to about 1 MB. That's much better, although it's still a lot of weight just for expected output. However, it compresses extremely well, to about 16 KB, so this isn't an issue for the size of distribution tarballs and such, only for git checkouts and on-disk size of extracted tarballs. I think that would be acceptable, although we could easily cut it a bit further if we want to. For example leaving out the word "step" from all the lines of executed test steps would cut it by about 80 KB. > Attached is also a patch to fix those, so that all permutations > work. Thanks, committed the bug fix with some additional comments. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 05.07.2011 20:06, Kevin Grittner wrote: >> In reviewing the recent fix to 2PC coverage in SSI, I found some >> cases which didn't seem to be covered. Dan bit the bullet and >> came up with an additional isolation test to rigorously cover all >> the permutations, to find *all* 2PC statement orderings which >> weren't working right. Because it was so big, he pared out tests >> which were redundant, in that they exercised the same code paths >> and pointed at the same issues. A patch to add this test is >> attached. Run against HEAD it shows the errors. It's kinda big, >> but I think it's worth having. > > I agree it'd be very nice to have this test, but 2.3 MB of > expected output is a bit excessive. Let's try to cut that down > into something more palatable. OK > There's two expected output files for this, one for > max_prepared_xacts=0 and another for the "normal" case. The > max_prepared_xacts=0 case isn't very interesting, since all the > PREPARE TRANSACTION commands fail. I think we should adjust the > test harness to not run these tests at all if > max_prepared_xacts=0. It would be better to skip the test and > print out a notice pointing out that it was not run, it'll just > give a false sense of security to run the test and report success, > when it didn't test anything useful. > > That alone cuts the size of the expected output to about 1 MB. OK. I'll work on this tonight unless Dan jumps in to claim it. > That's much better, although it's still a lot of weight just for > expected output. However, it compresses extremely well, to about > 16 KB, so this isn't an issue for the size of distribution > tarballs and such, only for git checkouts and on-disk size of > extracted tarballs. I think that would be acceptable, although we > could easily cut it a bit further if we want to. For example > leaving out the word "step" from all the lines of executed test > steps would cut it by about 80 KB. That seems simple enough. Any other ideas? >> Attached is also a patch to fix those, so that all permutations >> work. > > Thanks, committed the bug fix with some additional comments. Thanks! -Kevin
On 07.07.2011 18:43, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: >> On 05.07.2011 20:06, Kevin Grittner wrote: > >> There's two expected output files for this, one for >> max_prepared_xacts=0 and another for the "normal" case. The >> max_prepared_xacts=0 case isn't very interesting, since all the >> PREPARE TRANSACTION commands fail. I think we should adjust the >> test harness to not run these tests at all if >> max_prepared_xacts=0. It would be better to skip the test and >> print out a notice pointing out that it was not run, it'll just >> give a false sense of security to run the test and report success, >> when it didn't test anything useful. >> >> That alone cuts the size of the expected output to about 1 MB. > > OK. I'll work on this tonight unless Dan jumps in to claim it. Committed. I removed the second expected output file, and marked the prepared-transactions tests in the schedule as "ignore" instead. That way if max_prepared_transactions=0, you get a notice that the test case failed, but pg_regress still returns 0 as exit status. >> That's much better, although it's still a lot of weight just for >> expected output. However, it compresses extremely well, to about >> 16 KB, so this isn't an issue for the size of distribution >> tarballs and such, only for git checkouts and on-disk size of >> extracted tarballs. I think that would be acceptable, although we >> could easily cut it a bit further if we want to. For example >> leaving out the word "step" from all the lines of executed test >> steps would cut it by about 80 KB. > > That seems simple enough. Any other ideas? I think it's good enough as it is. I did change the lexer slightly, to trim whitespace from the beginning and end of SQL blocks. This cuts the size of expected output a bit, and makes it look nicer anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 07.07.2011 18:43, Kevin Grittner wrote: >> OK. I'll work on this tonight unless Dan jumps in to claim it. > Committed. I removed the second expected output file, and marked the > prepared-transactions tests in the schedule as "ignore" instead. That > way if max_prepared_transactions=0, you get a notice that the test case > failed, but pg_regress still returns 0 as exit status. Why did you only commit on 9.1 branch? regards, tom lane
On 18.08.2011 17:07, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> On 07.07.2011 18:43, Kevin Grittner wrote: >>> OK. I'll work on this tonight unless Dan jumps in to claim it. > >> Committed. I removed the second expected output file, and marked the >> prepared-transactions tests in the schedule as "ignore" instead. That >> way if max_prepared_transactions=0, you get a notice that the test case >> failed, but pg_regress still returns 0 as exit status. > > Why did you only commit on 9.1 branch? I did "git push origin master REL9_1_STABLE", and didn't notice that it failed on master. Rebased and pushed, thanks! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Committed. I removed the second expected output file, and marked > the prepared-transactions tests in the schedule as "ignore" > instead. That way if max_prepared_transactions=0, you get a notice > that the test case failed, but pg_regress still returns 0 as exit > status. Thanks! Sorry I didn't get to it. Things got really busy here. > I did change the lexer slightly, to trim whitespace from the > beginning and end of SQL blocks. This cuts the size of expected > output a bit, and makes it look nicer anyway. OK. You missed the alternative outputs for a couple files. These are needed to get successful tests with default_transaction_isolation = 'serializable' or 'repeatable read'. Patch attached. -Kevin
Attachment
Excerpts from Kevin Grittner's message of mar ago 23 15:07:33 -0300 2011: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > > I did change the lexer slightly, to trim whitespace from the > > beginning and end of SQL blocks. This cuts the size of expected > > output a bit, and makes it look nicer anyway. > > OK. You missed the alternative outputs for a couple files. These > are needed to get successful tests with > default_transaction_isolation = 'serializable' or 'repeatable read'. > Patch attached. Thanks, applied. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Heikki Linnakangas's message of jue ago 18 09:57:34 -0400 2011: > Committed. I removed the second expected output file, and marked the > prepared-transactions tests in the schedule as "ignore" instead. That > way if max_prepared_transactions=0, you get a notice that the test case > failed, but pg_regress still returns 0 as exit status. After having to play with this, I didn't like it very much, because regression.diffs gets spammed with the (rather massive and completely useless) diff in that test. For the xml tests, rather than ignoring it fail on an installation without libxml, we use an alternative output. Unless there are objections, I will commit the alternative file proposed by Dan. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > After having to play with this, I didn't like it very much, because > regression.diffs gets spammed with the (rather massive and completely > useless) diff in that test. For the xml tests, rather than ignoring it > fail on an installation without libxml, we use an alternative output. > Unless there are objections, I will commit the alternative file proposed > by Dan. +1 ... "ignore" is a pretty ugly hack here. Eventually we need some way of detecting that specific tests should be skipped because they're irrelevant to the current system configuration. contrib/sepgsql is already doing something of the sort, but it's rather crude ... regards, tom lane
On Wed, Aug 24, 2011 at 9:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> After having to play with this, I didn't like it very much, because >> regression.diffs gets spammed with the (rather massive and completely >> useless) diff in that test. For the xml tests, rather than ignoring it >> fail on an installation without libxml, we use an alternative output. > >> Unless there are objections, I will commit the alternative file proposed >> by Dan. > > +1 ... "ignore" is a pretty ugly hack here. > > Eventually we need some way of detecting that specific tests should be > skipped because they're irrelevant to the current system configuration. > contrib/sepgsql is already doing something of the sort, but it's rather > crude ... I'm fairly unhappy with the fact that we don't have a better way of deciding whether we should even *build* sepgsql. The --with-selinux flag basically doesn't do anything except enable the compile of that contrib module, and that's kind of a lame use of a configure flag. Not that I have a better idea... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Tom Lane's message of mié ago 24 22:11:58 -0300 2011: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > After having to play with this, I didn't like it very much, because > > regression.diffs gets spammed with the (rather massive and completely > > useless) diff in that test. For the xml tests, rather than ignoring it > > fail on an installation without libxml, we use an alternative output. > > > Unless there are objections, I will commit the alternative file proposed > > by Dan. > > +1 ... "ignore" is a pretty ugly hack here. Done. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support