Thread: Slightly bogus regression test for contrib/dblink

Slightly bogus regression test for contrib/dblink

From
Tom Lane
Date:
Lines 509-512 of contrib/dblink/expected/dblink.out read:

-- this should fail because there is no open transaction
SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo');
ERROR:  sql error
DETAIL:  ERROR:  cursor "xact_test" already exists

The error message is not consistent with what the comment claims.
Looking at the test case in total, I think the code is responding
correctly given the rules stated in your last commit message:

2005-10-17 22:55  joe
* contrib/dblink/: dblink.c, expected/dblink.out, sql/dblink.sql:When a cursor is opened using dblink_open, only start
atransactionif there isn't one already open. Upon dblink_close, only commit theopen transaction if it was started by
dblink_open,and only thenwhen all cursors opened by dblink_open are closed. The transactionaccounting is done
individuallyfor all named connections, plus thepersistent unnamed connection.
 

However, this comment is wrong and so is the preceding one, and I think
maybe you want to alter the test case so it does actually exercise
closing the transaction completely.

BTW, I was led to notice this while examining the current buildfarm
failure report from osprey,
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16
It looks to me like the diffs are consistent with the idea that the test
is using a copy of dblink that predates this patch ... do you agree?
If so, anyone have an idea how that could happen?  I thought we'd fixed
all the rpath problems, and anyway osprey wasn't failing like this
before today.
        regards, tom lane


Re: Slightly bogus regression test for contrib/dblink

From
Joe Conway
Date:
Tom Lane wrote:
> Lines 509-512 of contrib/dblink/expected/dblink.out read:
>
> -- this should fail because there is no open transaction
> SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo');
> ERROR:  sql error
> DETAIL:  ERROR:  cursor "xact_test" already exists
>
> The error message is not consistent with what the comment claims.
> Looking at the test case in total, I think the code is responding

Actually the comment was correct, but there was a bug which caused the
automatically opened transaction to not get closed.

I was really expecting a "DECLARE CURSOR may only be used in transaction
blocks" error there, but didn't notice that I was actually getting a
different error message :-(.

The bug can be reproduced by using dblink_open to automatically open a
transaction, and then using dblink_exec to manually ABORT it:

-- open a test connection
SELECT dblink_connect('myconn','dbname=contrib_regression');
  dblink_connect
----------------
  OK
(1 row)

-- open a cursor incorrectly; this bumps up the refcount
contrib_regression=# SELECT
dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foobar',false);
NOTICE:  sql error
DETAIL:  ERROR:  relation "foobar" does not exist

  dblink_open
-------------
  ERROR
(1 row)

-- manually abort remote transaction; does not maintain refcount
SELECT dblink_exec('myconn','ABORT');
  dblink_exec
-------------
  ROLLBACK
(1 row)

-- test automatic transaction; this bumps up the refcount
SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foo');
  dblink_open
-------------
  OK
(1 row)

-- this should commit the automatically opened transaction
-- but it doesn't because the refcount is wrong
SELECT dblink_close('myconn','rmt_foo_cursor');
  dblink_close
--------------
  OK
(1 row)

-- this should fail because there is no open transaction
-- but it doesn't because dblink_close did not commit
SELECT dblink_exec('myconn','DECLARE xact_test2 CURSOR FOR SELECT * FROM
foo');
   dblink_exec
----------------
  DECLARE CURSOR
(1 row)

I think the attached patch does the trick in a minimally invasive way.
If there are no objections I'll commit to head and 8.1 stable branches.
The problem doesn't exist before 8.1.

> BTW, I was led to notice this while examining the current buildfarm
> failure report from osprey,
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16
> It looks to me like the diffs are consistent with the idea that the test
> is using a copy of dblink that predates this patch ... do you agree?
> If so, anyone have an idea how that could happen?  I thought we'd fixed
> all the rpath problems, and anyway osprey wasn't failing like this
> before today.

I haven't really looked at the buildfarm before -- I might be blind, but
I couldn't figure out how to see the regression.diff file.

Joe


? .deps
? .regression.diffs.swp
? current.diff
? dblink.sql
? libdblink.so.0.0
? results
Index: dblink.c
===================================================================
RCS file: /cvsroot/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.55
diff -c -r1.55 dblink.c
*** dblink.c    30 May 2006 22:12:12 -0000    1.55
--- dblink.c    20 Jun 2006 16:28:01 -0000
***************
*** 361,366 ****
--- 361,373 ----
              DBLINK_RES_INTERNALERROR("begin error");
          PQclear(res);
          rconn->newXactForCursor = TRUE;
+         /*
+          * Since transaction state was IDLE, we force cursor count to
+          * initially be 0. This is needed as a previous ABORT might
+          * have wiped out our transaction without maintaining the
+          * cursor count for us.
+          */
+         rconn->openCursorCount = 0;
      }

      /* if we started a transaction, increment cursor count */
Index: expected/dblink.out
===================================================================
RCS file: /cvsroot/pgsql/contrib/dblink/expected/dblink.out,v
retrieving revision 1.16
diff -c -r1.16 dblink.out
*** expected/dblink.out    18 Oct 2005 02:55:49 -0000    1.16
--- expected/dblink.out    20 Jun 2006 16:28:01 -0000
***************
*** 509,515 ****
  -- this should fail because there is no open transaction
  SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo');
  ERROR:  sql error
! DETAIL:  ERROR:  cursor "xact_test" already exists

  -- reset remote transaction state
  SELECT dblink_exec('myconn','ABORT');
--- 509,515 ----
  -- this should fail because there is no open transaction
  SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo');
  ERROR:  sql error
! DETAIL:  ERROR:  DECLARE CURSOR may only be used in transaction blocks

  -- reset remote transaction state
  SELECT dblink_exec('myconn','ABORT');

Re: Slightly bogus regression test for contrib/dblink

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> BTW, I was led to notice this while examining the current buildfarm
>> failure report from osprey,
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16

> I haven't really looked at the buildfarm before -- I might be blind, but 
> I couldn't figure out how to see the regression.diff file.

It's on the cited page, if you scroll down far enough.

Note to Andrew: would it make sense for the larger log files to be split
out as linked pages in a buildfarm report?
        regards, tom lane


Re: Slightly bogus regression test for contrib/dblink

From
Andrew Dunstan
Date:
Tom Lane wrote:
>
> Note to Andrew: would it make sense for the larger log files to be split
> out as linked pages in a buildfarm report?
>
>             regards, tom lane
>
>   
I will put it on the TODO list.

cheers

andrew


Re: Slightly bogus regression test for contrib/dblink

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
> 
>>I haven't really looked at the buildfarm before -- I might be blind, but 
>>I couldn't figure out how to see the regression.diff file.
> 
> It's on the cited page, if you scroll down far enough.

OK, I'm officially blind (so much for that lasik work I had done ;-)), I 
had scrolled right on past the diff the first time.
> It looks to me like the diffs are consistent with the idea that the> test is using a copy of dblink that predates
thispatch ... do you> agree? If so, anyone have an idea how that could happen?  I thought> we'd fixed all the rpath
problems,and anyway osprey wasn't failing> like this before today.
 

I would think that the diffs would be significantly larger if that were 
the case. In fact, when was PG_MODULE_MAGIC first made mandatory?

Joe


Re: Slightly bogus regression test for contrib/dblink

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>>> It looks to me like the diffs are consistent with the idea that the
>>> test is using a copy of dblink that predates this patch ...

> I would think that the diffs would be significantly larger if that were 
> the case. In fact, when was PG_MODULE_MAGIC first made mandatory?

Good point.  So then why the failure?
        regards, tom lane


Re: Slightly bogus regression test for contrib/dblink

From
Tom Lane
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Joe Conway <mail@joeconway.com> writes:
>> Tom Lane wrote:
>>> It looks to me like the diffs are consistent with the idea that the
>>> test is using a copy of dblink that predates this patch ...

>> I would think that the diffs would be significantly larger if that were 
>> the case. In fact, when was PG_MODULE_MAGIC first made mandatory?

> Good point.  So then why the failure?

What's even more interesting is that there are now three later runs of
HEAD on osprey, and none of them failed.  So unless Remi's been fooling
with the environment on that machine, this was a one-shot irreproducible
failure.  That's disturbing in a different way ...

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=osprey&br=HEAD
        regards, tom lane


Re: Slightly bogus regression test for contrib/dblink

From
Joe Conway
Date:
Tom Lane wrote:
> What's even more interesting is that there are now three later runs of
> HEAD on osprey, and none of them failed.  So unless Remi's been fooling
> with the environment on that machine, this was a one-shot irreproducible
> failure.  That's disturbing in a different way ...
> 
> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=osprey&br=HEAD
> 

I'm trying to imagine how this diff could come about, and all I can 
think is that somehow in this sequence...


--[next line is line 453]------------------------------
-- this should not commit the transaction because the client opened it
SELECT dblink_close('myconn','rmt_foo_cursor'); dblink_close
-------------- OK
(1 row)

-- this should succeed because we have an open transaction
SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM 
foo');  dblink_exec
---------------- DECLARE CURSOR
(1 row)
--[last line is line 465]------------------------------

... if dblink_close() actually (incorrectly) committed the transaction, 
I think you would get exactly the diff we got (both hunks).

Now, why that would happen just once, I'm not sure.

That would imply that rconn->newXactForCursor was somehow TRUE, which in 
turn implies that in this previous sequence...

--[next line is line 439]------------------------------
-- test opening cursor in a transaction
SELECT dblink_exec('myconn','BEGIN'); dblink_exec
------------- BEGIN
(1 row)

-- an open transaction will prevent dblink_open() from opening its own
SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foo'); dblink_open
------------- OK
(1 row)
--[last line is line 451]------------------------------

...the "BEGIN" statement returned successfully as usual, but for some 
reason left (PQtransactionStatus(conn) != PQTRANS_IDLE), causing 
dblink_open() to start a transaction and later complete it on line 454.

Is that somehow possible?

Joe


Re: Slightly bogus regression test for contrib/dblink

From
Joe Conway
Date:
Joe Conway wrote:
> 
> ...the "BEGIN" statement returned successfully as usual, but for some 
> reason left (PQtransactionStatus(conn) != PQTRANS_IDLE), causing 
> dblink_open() to start a transaction and later complete it on line 454.
> 

Oops, I meant "... some reason left (PQtransactionStatus(conn) == 
PQTRANS_IDLE), causing ..."

Joe