Thread: Visibility regression test

Visibility regression test

From
Manfred Koizar
Date:
A new regression test trying to detect runaway INSERTs/UPDATEs.

Please add to CVS:
    src/test/regress/expected/visibility.out
    src/test/regress/sql/visibility.sql

Servus
 Manfred
diff -ruN ../base/src/test/regress/expected/visibility.out src/test/regress/expected/visibility.out
--- ../base/src/test/regress/expected/visibility.out    1970-01-01 01:00:00.000000000 +0100
+++ src/test/regress/expected/visibility.out    2002-08-29 14:05:17.000000000 +0200
@@ -0,0 +1,23 @@
+--
+-- VISIBILITY
+--
+-- Try to detect the so-called halloween problem, where a command
+-- sees its own modifications.  If this happens, the UPDATE/INSERT
+-- gets into an endless loop, which is eventually aborted due to
+-- statement_timeout.
+-- This test might fail on a *very* slow machine.
+CREATE TABLE vistst (i INT);
+SET statement_timeout = 10000;  -- 10 seconds
+BEGIN;
+INSERT INTO vistst VALUES (0);
+UPDATE vistst SET i = i + 1;
+INSERT INTO vistst SELECT i + 1 FROM vistst;
+COMMIT;
+SELECT * FROM vistst;
+ i
+---
+ 1
+ 2
+(2 rows)
+
+DROP TABLE vistst;
diff -ruN ../base/src/test/regress/parallel_schedule src/test/regress/parallel_schedule
--- ../base/src/test/regress/parallel_schedule    2002-07-20 17:27:23.000000000 +0200
+++ src/test/regress/parallel_schedule    2002-08-29 13:49:03.000000000 +0200
@@ -38,7 +38,7 @@
 # ----------
 # The third group of parallel test
 # ----------
-test: constraints triggers create_misc create_aggregate create_operator create_index inherit vacuum
+test: constraints triggers create_misc create_aggregate create_operator create_index inherit visibility vacuum

 # Depends on the above
 test: create_view
diff -ruN ../base/src/test/regress/serial_schedule src/test/regress/serial_schedule
--- ../base/src/test/regress/serial_schedule    2002-07-20 17:27:23.000000000 +0200
+++ src/test/regress/serial_schedule    2002-08-29 13:46:46.000000000 +0200
@@ -50,6 +50,7 @@
 test: create_operator
 test: create_index
 test: inherit
+test: visibility
 test: vacuum
 test: create_view
 test: sanity_check
diff -ruN ../base/src/test/regress/sql/visibility.sql src/test/regress/sql/visibility.sql
--- ../base/src/test/regress/sql/visibility.sql    1970-01-01 01:00:00.000000000 +0100
+++ src/test/regress/sql/visibility.sql    2002-08-29 13:45:25.000000000 +0200
@@ -0,0 +1,21 @@
+--
+-- VISIBILITY
+--
+-- Try to detect the so-called halloween problem, where a command
+-- sees its own modifications.  If this happens, the UPDATE/INSERT
+-- gets into an endless loop, which is eventually aborted due to
+-- statement_timeout.
+-- This test might fail on a *very* slow machine.
+
+CREATE TABLE vistst (i INT);
+SET statement_timeout = 10000;  -- 10 seconds
+
+BEGIN;
+INSERT INTO vistst VALUES (0);
+UPDATE vistst SET i = i + 1;
+INSERT INTO vistst SELECT i + 1 FROM vistst;
+COMMIT;
+
+SELECT * FROM vistst;
+
+DROP TABLE vistst;

Re: Visibility regression test

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> A new regression test trying to detect runaway INSERTs/UPDATEs.

Why?

> Please add to CVS:
>     src/test/regress/expected/visibility.out
>     src/test/regress/sql/visibility.sql

Seems like a waste of test cycles to me.

            regards, tom lane

Re: Visibility regression test

From
Manfred Koizar
Date:
On Thu, 29 Aug 2002 10:30:59 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>Manfred Koizar <mkoi-pg@aon.at> writes:
>> A new regression test trying to detect runaway INSERTs/UPDATEs.
>
>Why?

Because we do not want to run a database that gets hung in an endless
loop on INSERT or UPDATE.  Better we find such bugs during regression
testing.

>Seems like a waste of test cycles to me.

I'd agree, if the test was not able to detect the error condition it
was written for.  But I did a test with a buggy postmaster and the bug
was detected.

Servus
 Manfred

Re: Visibility regression test

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> On Thu, 29 Aug 2002 10:30:59 -0400, Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> Manfred Koizar <mkoi-pg@aon.at> writes:
> A new regression test trying to detect runaway INSERTs/UPDATEs.
>>
>> Why?

> Because we do not want to run a database that gets hung in an endless
> loop on INSERT or UPDATE.  Better we find such bugs during regression
> testing.

If there is such a problem it will surely be found by the other
regression tests.  I don't see a need to insert a test that has an
acknowledged system dependency in order to detect this.

            regards, tom lane

Re: Visibility regression test

From
John Gray
Date:
On Thu, 2002-08-29 at 16:37, Tom Lane wrote:
> Manfred Koizar <mkoi-pg@aon.at> writes:
> > On Thu, 29 Aug 2002 10:30:59 -0400, Tom Lane <tgl@sss.pgh.pa.us>
> > wrote:
> >> Manfred Koizar <mkoi-pg@aon.at> writes:
> > A new regression test trying to detect runaway INSERTs/UPDATEs.
> >>
> >> Why?
>
> > Because we do not want to run a database that gets hung in an endless
> > loop on INSERT or UPDATE.  Better we find such bugs during regression
> > testing.
>
> If there is such a problem it will surely be found by the other
> regression tests.  I don't see a need to insert a test that has an
> acknowledged system dependency in order to detect this.
>

I agree with this, but I think an earlier suggestion of Manfred's,
(namely tests that explicitly check concurrency issues) might be useful
to verify the integrity of MVCC.

How about the following as a possible approach:

We produce an application which opens two (or more?) database
connections and feeds appropriate SQL to them. ISTM that this need not
be a very complicated application. It takes one input file whose lines
begin with (say) '-' for a comment, '1' for connection 1, '2' for
connection 2 etc. followed by the SQL statement to send. (This is all
very sketchy, of course -there might be better ways to format it). The
output from each backend is sent to a separate file for comparison
against the expected results.

Does this sound feasible or useful? It would offer a means to test tuple
visibility, concurrent updates and deadlock detection in a controlled
way without too much difficulty.

Regards

John
--
John Gray
Azuli IT
www.azuli.co.uk



Re: Visibility regression test

From
Tom Lane
Date:
John Gray <jgray@azuli.co.uk> writes:
> I agree with this, but I think an earlier suggestion of Manfred's,
> (namely tests that explicitly check concurrency issues) might be useful
> to verify the integrity of MVCC.

Indeed, we could use some sort of parallel-testing harness to allow
fine-grained verification of concurrent behavior.  This has been
discussed several times, but no one's stepped up to the plate.

Your sketch misses an important point: we want to know not only what
each backend does, but when it does it.  (For example, we'd want the
test harness to be able to check that LOCK actually prevents another
backend from making progress.)  A brute-force way to do that would be
to delay for some amount of time between issuing commands, so that we
can be sure the backends have reached a quiescent state.  Then, logging
all the commands and responses serially into a single file would provide
some idea of causal order.  It could still be tricky though, eg if an
unlock releases two other backends then their results could arrive in
either order.

            regards, tom lane

Re: Visibility regression test

From
Joe Conway
Date:
Tom Lane wrote:
> Your sketch misses an important point: we want to know not only what
> each backend does, but when it does it.  (For example, we'd want the
> test harness to be able to check that LOCK actually prevents another
> backend from making progress.)  A brute-force way to do that would be
> to delay for some amount of time between issuing commands, so that we
> can be sure the backends have reached a quiescent state.  Then, logging
> all the commands and responses serially into a single file would provide
> some idea of causal order.  It could still be tricky though, eg if an
> unlock releases two other backends then their results could arrive in
> either order.

You could actually serialize all of the commands from one backend,
against multiple backends, using dblink.

Joe


Re: Visibility regression test

From
Rod Taylor
Date:
> You could actually serialize all of the commands from one backend,
> against multiple backends, using dblink.

That doesn't help, it changes the connection method but the problem is
still there.

Backend A locks row
Backends B and C are waiting on row
Backend A releases row

The problem is we cannot determine the order that B and C will wake up
in, which makes doing a diff against a standard case difficult.


We don't actually want to serialize the commands as that changes the
test.


Re: Visibility regression test

From
Manfred Koizar
Date:
On Thu, 29 Aug 2002 11:37:39 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>>> Manfred Koizar <mkoi-pg@aon.at> writes:
>> A new regression test trying to detect runaway INSERTs/UPDATEs.
>
>If there is such a problem it will surely be found by the other
>regression tests.

That's what I hoped when I sent my heap tuple header patches.
Actually this test catches a bug, which was in CVS from 2002-07-02
until 2002-07-30 and was not discovered during this time.  You have to
know, that I am a lazy person :-)  I wouldn't have written this test,
if the bug was found by one of the other tests.

>  I don't see a need to insert a test that has an
>acknowledged system dependency in order to detect this.

You mean, that the test might fail on a system that takes more than
ten seconds to INSERT or UPDATE a single row?  I don't think this is a
real problem.  Should we change the timeout to 30 seconds?  60?  3600?

Servus
 Manfred

Re: Visibility regression test

From
Tom Lane
Date:
Rod Taylor <rbt@zort.ca> writes:
> Backend A locks row
> Backends B and C are waiting on row
> Backend A releases row

> The problem is we cannot determine the order that B and C will wake up
> in, which makes doing a diff against a standard case difficult.

Exactly.

> We don't actually want to serialize the commands as that changes the
> test.

Good point.  Maybe what we need is not so much emphasis on getting an
exactly predetermined output, as a way of understanding the allowed
variations in output order and making the tool able to complain just
when unexpected variation occurs.  In other words, something smarter
than diff to check the results with.

            regards, tom lane

Re: Visibility regression test

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> You mean, that the test might fail on a system that takes more than
> ten seconds to INSERT or UPDATE a single row?  I don't think this is a
> real problem.

I don't like depending on a timeout *at all* in a regression test;
the exact value of the timeout is not particularly relevant to my
concern about it.

It surprises me quite a bit that there aren't any existing spots in
the regression tests that would expose a Halloween problem ... I guess
my other concern is that we shouldn't need a whole new test for this.

            regards, tom lane

Re: Visibility regression test

From
Manfred Koizar
Date:
On Thu, 29 Aug 2002 13:27:36 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>I don't like depending on a timeout *at all* in a regression test;
>the exact value of the timeout is not particularly relevant to my
>concern about it.

I agree.  But a timeout was the only thing that came to my mind for
aborting an endless loop.  Better suggestions are welcome.  Waiting
for the disk to get full will not be accepted :-)

>It surprises me quite a bit that there aren't any existing spots in
>the regression tests that would expose a Halloween problem ...

Me too.  BTW, why is this called the "Halloween problem"?

> I guess
>my other concern is that we shouldn't need a whole new test for this.

Again I agree.  First I wanted to insert these few lines into an
existing test, but didn't find one, where it seemed to fit.  The most
suitable one seemed to be vacuum.

Servus
 Manfred

Re: Visibility regression test

From
John Gray
Date:
On Thu, 2002-08-29 at 18:22, Tom Lane wrote:
> Rod Taylor <rbt@zort.ca> writes:
> > Backend A locks row
> > Backends B and C are waiting on row
> > Backend A releases row
>
> > The problem is we cannot determine the order that B and C will wake up
> > in, which makes doing a diff against a standard case difficult.
>
> Exactly.
>

I've been wondering about the following (this is pseudocode, with the
main loop unrolled):

LOG("A.out","step 1");
LOG("B.out","step 1");
LOG("C.out","step 1");
PQsendQuery(A, ' whatever '); // Only executed if schedule lists one
PQsendQuery(B, ' whatever 2 ');
PQsendQuery(C, ' whatever 3 ');
sleep(n); // To allow queries to run. [See below for alternative]
PQconsumeInput(A);
if ! PQisbusy(A) {
    LOG("A.out",PQgetResult(A)); // Actually a while(PQgetResult)
 ... etc. for B and C
LOG("A.out","step 2");
LOG("B.out","step 2");
LOG("C.out","step 2");
PQsendQuery(A, ' whatever next ');
... and so on.

This means that if A is holding a lock and B and C are waiting, their
step 1 output will be empty -the "step 1" marker will be logged,
followed by the "step 2" marker. If the next instruction in the schedule
file is a query which results in A releasing the lock, B and C's output
will be picked up and logged (in order of backend connection) between
the "step 2" and "step 3" log markers. Obviously, the schedule file
lists no queries for B and C in step 2 for this to work.

This means that the schedule file must have step markers for each state
that the system should be in -as that would be the only way to guarantee
that backend A had a particular command before backend (C). The sleep
statement above is a little crude (i.e. it might have to be set rather
long). In practice we could just loop until one backend returned, then
wait n seconds before testing all the backends again and moving to the
next stage. n could be much shorter in this case (representing the
likely difference in execution times between the backends) and would
probably be specified as a parameter of the test step.

If I get a chance I might try to write something on these lines.

> > We don't actually want to serialize the commands as that changes the
> > test.
>
> Good point.  Maybe what we need is not so much emphasis on getting an
> exactly predetermined output, as a way of understanding the allowed
> variations in output order and making the tool able to complain just
> when unexpected variation occurs.  In other words, something smarter
> than diff to check the results with.
>

This is another possibility, of course -and might allow for better
analysis of the results.

Regards

John
--
John Gray
Azuli IT
www.azuli.co.uk



Re: Visibility regression test

From
John Gray
Date:
On Fri, 2002-08-30 at 13:22, John Gray wrote:

> I've been wondering about the following (this is pseudocode, with the
> main loop unrolled):
>
> LOG("A.out","step 1");
> LOG("B.out","step 1");
> LOG("C.out","step 1");
> PQsendQuery(A, ' whatever '); // Only executed if schedule lists one
> PQsendQuery(B, ' whatever 2 ');
> PQsendQuery(C, ' whatever 3 ');
> sleep(n); // To allow queries to run. [See below for alternative]
> PQconsumeInput(A);
> if ! PQisbusy(A) {
>     LOG("A.out",PQgetResult(A)); // Actually a while(PQgetResult)
>  ... etc. for B and C
> LOG("A.out","step 2");
> LOG("B.out","step 2");
> LOG("C.out","step 2");
> PQsendQuery(A, ' whatever next ');
> ... and so on.

Following the philosophy of "release early, release often" (a convenient
excuse to release poor code upon humanity) I present the following. This
is my first shot at using libPQ, so I'm sure it has various problems.
However, it does appear to work. It takes a schedule file (also
attached) and demonstrates tuple visibility and a deadlock. Obviously,
that's not very exciting, but it is an example of how such a thing might
work.

The current schedule uses two backends, but there is no limit in
principle to the number of backends involved.

I compiled with:
gcc -Wall -I/usr/local/pgsql/include -L/usr/local/pgsql/lib -lpq -o
vistest vistest.c

and then ran
./vistest sqltest example_schedule examp

(having first created a database called sqltest). Two output files,
examp_0 and examp_1 will be produced, representing the output from the
two backends. Note that comments and step markers are logged to both
files.

This is all a bit ragged and is only intended as a proof of concept
idea.

Regards

John



--
John Gray
Azuli IT
www.azuli.co.uk


Attachment