Thread: Slightly bogus regression test for contrib/dblink
Lines 509-512 of contrib/dblink/expected/dblink.out read: -- this should fail because there is no open transaction SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo'); ERROR: sql error DETAIL: ERROR: cursor "xact_test" already exists The error message is not consistent with what the comment claims. Looking at the test case in total, I think the code is responding correctly given the rules stated in your last commit message: 2005-10-17 22:55 joe * contrib/dblink/: dblink.c, expected/dblink.out, sql/dblink.sql:When a cursor is opened using dblink_open, only start atransactionif there isn't one already open. Upon dblink_close, only commit theopen transaction if it was started by dblink_open,and only thenwhen all cursors opened by dblink_open are closed. The transactionaccounting is done individuallyfor all named connections, plus thepersistent unnamed connection. However, this comment is wrong and so is the preceding one, and I think maybe you want to alter the test case so it does actually exercise closing the transaction completely. BTW, I was led to notice this while examining the current buildfarm failure report from osprey, http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16 It looks to me like the diffs are consistent with the idea that the test is using a copy of dblink that predates this patch ... do you agree? If so, anyone have an idea how that could happen? I thought we'd fixed all the rpath problems, and anyway osprey wasn't failing like this before today. regards, tom lane
Tom Lane wrote: > Lines 509-512 of contrib/dblink/expected/dblink.out read: > > -- this should fail because there is no open transaction > SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo'); > ERROR: sql error > DETAIL: ERROR: cursor "xact_test" already exists > > The error message is not consistent with what the comment claims. > Looking at the test case in total, I think the code is responding Actually the comment was correct, but there was a bug which caused the automatically opened transaction to not get closed. I was really expecting a "DECLARE CURSOR may only be used in transaction blocks" error there, but didn't notice that I was actually getting a different error message :-(. The bug can be reproduced by using dblink_open to automatically open a transaction, and then using dblink_exec to manually ABORT it: -- open a test connection SELECT dblink_connect('myconn','dbname=contrib_regression'); dblink_connect ---------------- OK (1 row) -- open a cursor incorrectly; this bumps up the refcount contrib_regression=# SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foobar',false); NOTICE: sql error DETAIL: ERROR: relation "foobar" does not exist dblink_open ------------- ERROR (1 row) -- manually abort remote transaction; does not maintain refcount SELECT dblink_exec('myconn','ABORT'); dblink_exec ------------- ROLLBACK (1 row) -- test automatic transaction; this bumps up the refcount SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foo'); dblink_open ------------- OK (1 row) -- this should commit the automatically opened transaction -- but it doesn't because the refcount is wrong SELECT dblink_close('myconn','rmt_foo_cursor'); dblink_close -------------- OK (1 row) -- this should fail because there is no open transaction -- but it doesn't because dblink_close did not commit SELECT dblink_exec('myconn','DECLARE xact_test2 CURSOR FOR SELECT * FROM foo'); dblink_exec ---------------- DECLARE CURSOR (1 row) I think the attached patch does the trick in a minimally invasive way. If there are no objections I'll commit to head and 8.1 stable branches. The problem doesn't exist before 8.1. > BTW, I was led to notice this while examining the current buildfarm > failure report from osprey, > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16 > It looks to me like the diffs are consistent with the idea that the test > is using a copy of dblink that predates this patch ... do you agree? > If so, anyone have an idea how that could happen? I thought we'd fixed > all the rpath problems, and anyway osprey wasn't failing like this > before today. I haven't really looked at the buildfarm before -- I might be blind, but I couldn't figure out how to see the regression.diff file. Joe ? .deps ? .regression.diffs.swp ? current.diff ? dblink.sql ? libdblink.so.0.0 ? results Index: dblink.c =================================================================== RCS file: /cvsroot/pgsql/contrib/dblink/dblink.c,v retrieving revision 1.55 diff -c -r1.55 dblink.c *** dblink.c 30 May 2006 22:12:12 -0000 1.55 --- dblink.c 20 Jun 2006 16:28:01 -0000 *************** *** 361,366 **** --- 361,373 ---- DBLINK_RES_INTERNALERROR("begin error"); PQclear(res); rconn->newXactForCursor = TRUE; + /* + * Since transaction state was IDLE, we force cursor count to + * initially be 0. This is needed as a previous ABORT might + * have wiped out our transaction without maintaining the + * cursor count for us. + */ + rconn->openCursorCount = 0; } /* if we started a transaction, increment cursor count */ Index: expected/dblink.out =================================================================== RCS file: /cvsroot/pgsql/contrib/dblink/expected/dblink.out,v retrieving revision 1.16 diff -c -r1.16 dblink.out *** expected/dblink.out 18 Oct 2005 02:55:49 -0000 1.16 --- expected/dblink.out 20 Jun 2006 16:28:01 -0000 *************** *** 509,515 **** -- this should fail because there is no open transaction SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo'); ERROR: sql error ! DETAIL: ERROR: cursor "xact_test" already exists -- reset remote transaction state SELECT dblink_exec('myconn','ABORT'); --- 509,515 ---- -- this should fail because there is no open transaction SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo'); ERROR: sql error ! DETAIL: ERROR: DECLARE CURSOR may only be used in transaction blocks -- reset remote transaction state SELECT dblink_exec('myconn','ABORT');
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> BTW, I was led to notice this while examining the current buildfarm >> failure report from osprey, >> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=osprey&dt=2006-06-17%2004:00:16 > I haven't really looked at the buildfarm before -- I might be blind, but > I couldn't figure out how to see the regression.diff file. It's on the cited page, if you scroll down far enough. Note to Andrew: would it make sense for the larger log files to be split out as linked pages in a buildfarm report? regards, tom lane
Tom Lane wrote: > > Note to Andrew: would it make sense for the larger log files to be split > out as linked pages in a buildfarm report? > > regards, tom lane > > I will put it on the TODO list. cheers andrew
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>I haven't really looked at the buildfarm before -- I might be blind, but >>I couldn't figure out how to see the regression.diff file. > > It's on the cited page, if you scroll down far enough. OK, I'm officially blind (so much for that lasik work I had done ;-)), I had scrolled right on past the diff the first time. > It looks to me like the diffs are consistent with the idea that the> test is using a copy of dblink that predates thispatch ... do you> agree? If so, anyone have an idea how that could happen? I thought> we'd fixed all the rpath problems,and anyway osprey wasn't failing> like this before today. I would think that the diffs would be significantly larger if that were the case. In fact, when was PG_MODULE_MAGIC first made mandatory? Joe
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >>> It looks to me like the diffs are consistent with the idea that the >>> test is using a copy of dblink that predates this patch ... > I would think that the diffs would be significantly larger if that were > the case. In fact, when was PG_MODULE_MAGIC first made mandatory? Good point. So then why the failure? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Joe Conway <mail@joeconway.com> writes: >> Tom Lane wrote: >>> It looks to me like the diffs are consistent with the idea that the >>> test is using a copy of dblink that predates this patch ... >> I would think that the diffs would be significantly larger if that were >> the case. In fact, when was PG_MODULE_MAGIC first made mandatory? > Good point. So then why the failure? What's even more interesting is that there are now three later runs of HEAD on osprey, and none of them failed. So unless Remi's been fooling with the environment on that machine, this was a one-shot irreproducible failure. That's disturbing in a different way ... http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=osprey&br=HEAD regards, tom lane
Tom Lane wrote: > What's even more interesting is that there are now three later runs of > HEAD on osprey, and none of them failed. So unless Remi's been fooling > with the environment on that machine, this was a one-shot irreproducible > failure. That's disturbing in a different way ... > > http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=osprey&br=HEAD > I'm trying to imagine how this diff could come about, and all I can think is that somehow in this sequence... --[next line is line 453]------------------------------ -- this should not commit the transaction because the client opened it SELECT dblink_close('myconn','rmt_foo_cursor'); dblink_close -------------- OK (1 row) -- this should succeed because we have an open transaction SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM foo'); dblink_exec ---------------- DECLARE CURSOR (1 row) --[last line is line 465]------------------------------ ... if dblink_close() actually (incorrectly) committed the transaction, I think you would get exactly the diff we got (both hunks). Now, why that would happen just once, I'm not sure. That would imply that rconn->newXactForCursor was somehow TRUE, which in turn implies that in this previous sequence... --[next line is line 439]------------------------------ -- test opening cursor in a transaction SELECT dblink_exec('myconn','BEGIN'); dblink_exec ------------- BEGIN (1 row) -- an open transaction will prevent dblink_open() from opening its own SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foo'); dblink_open ------------- OK (1 row) --[last line is line 451]------------------------------ ...the "BEGIN" statement returned successfully as usual, but for some reason left (PQtransactionStatus(conn) != PQTRANS_IDLE), causing dblink_open() to start a transaction and later complete it on line 454. Is that somehow possible? Joe
Joe Conway wrote: > > ...the "BEGIN" statement returned successfully as usual, but for some > reason left (PQtransactionStatus(conn) != PQTRANS_IDLE), causing > dblink_open() to start a transaction and later complete it on line 454. > Oops, I meant "... some reason left (PQtransactionStatus(conn) == PQTRANS_IDLE), causing ..." Joe