Thread: parallel regression test failure

parallel regression test failure

From
Bruce Momjian
Date:
I am seeing the following parallel regression test failures.  Any idea
on the cause?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
*** ./expected/constraints.out    Fri Jul 25 17:36:36 2003
--- ./results/constraints.out    Fri Jul 25 17:37:07 2003
***************
*** 80,102 ****
  CREATE TABLE CHECK2_TBL (x int, y text, z int,
      CONSTRAINT SEQUENCE_CON
      CHECK (x > 3 and y <> 'check failed' and z < 8));
  INSERT INTO CHECK2_TBL VALUES (4, 'check ok', -2);
  INSERT INTO CHECK2_TBL VALUES (1, 'x check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (5, 'z check failed', 10);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (0, 'check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (6, 'check failed', 11);
! ERROR:  new row for relation "check2_tbl" violates CHECK constraint "sequence_con"
  INSERT INTO CHECK2_TBL VALUES (7, 'check ok', 7);
  SELECT '' AS two, * from CHECK2_TBL;
!  two | x |    y     | z
! -----+---+----------+----
!      | 4 | check ok | -2
!      | 7 | check ok |  7
! (2 rows)
!
  --
  -- Check constraints on INSERT
  --
--- 80,100 ----
  CREATE TABLE CHECK2_TBL (x int, y text, z int,
      CONSTRAINT SEQUENCE_CON
      CHECK (x > 3 and y <> 'check failed' and z < 8));
+ ERROR:  cache lookup failed for relation 126262
  INSERT INTO CHECK2_TBL VALUES (4, 'check ok', -2);
+ ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (1, 'x check failed', -2);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (5, 'z check failed', 10);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (0, 'check failed', -2);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (6, 'check failed', 11);
! ERROR:  relation "check2_tbl" does not exist
  INSERT INTO CHECK2_TBL VALUES (7, 'check ok', 7);
+ ERROR:  relation "check2_tbl" does not exist
  SELECT '' AS two, * from CHECK2_TBL;
! ERROR:  relation "check2_tbl" does not exist
  --
  -- Check constraints on INSERT
  --

======================================================================

*** ./expected/triggers.out    Fri Jul 25 12:38:34 2003
--- ./results/triggers.out    Fri Jul 25 17:37:06 2003
***************
*** 91,96 ****
--- 91,97 ----
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;
+ ERROR:  cache lookup failed for relation 122552
  DROP TABLE fkeys2;
  -- -- I've disabled the funny_dup17 test because the new semantics
  -- -- of AFTER ROW triggers, which get now fired at the end of a

======================================================================

*** ./expected/sanity_check.out    Wed May 28 12:04:02 2003
--- ./results/sanity_check.out    Fri Jul 25 17:37:14 2003
***************
*** 15,20 ****
--- 15,21 ----
   bt_name_heap        | t
   bt_txt_heap         | t
   fast_emp4000        | t
+  fkeys               | t
   func_index_heap     | t
   hash_f8_heap        | t
   hash_i4_heap        | t
***************
*** 62,68 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (52 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have
--- 63,69 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (53 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================

*** ./expected/misc.out    Fri Jul 25 17:36:36 2003
--- ./results/misc.out    Fri Jul 25 17:37:17 2003
***************
*** 580,586 ****
   c
   c_star
   char_tbl
-  check2_tbl
   check_seq
   check_tbl
   circle_tbl
--- 580,585 ----
***************
*** 598,603 ****
--- 597,603 ----
   equipment_r
   f_star
   fast_emp4000
+  fkeys
   float4_tbl
   float8_tbl
   func_index_heap

======================================================================


Re: parallel regression test failure

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I am seeing the following parallel regression test failures.  Any idea
> on the cause?

I don't see it here, on either of two different architectures.  Maybe
you need a make distclean and rebuild?
        regards, tom lane


Re: parallel regression test failure

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am seeing the following parallel regression test failures.  Any idea
> > on the cause?
> 
> I don't see it here, on either of two different architectures.  Maybe
> you need a make distclean and rebuild?

I do (I run tools/pgtest), and see the failure regularly.  It is a
dual-cpu Xeon machine.  I run it every night and it fails 25% of the
time.



Re: parallel regression test failure

From
Robert Creager
Date:
On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am seeing the following parallel regression test failures.  Any
> > idea on the cause?
> 
> I don't see it here, on either of two different architectures.  Maybe
> you need a make distclean and rebuild?
> 

I was just able to get some problems on my dual Athlon machine
(after about 10 runs) with a clean cvs download.

Linux thunder.mshome.net 2.4.21-0.13_test #35 SMP Wed Apr 9 07:29:10 MDT
2003 i686 unknown unknown GNU/Linux

gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)

<./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs>
<sh src/tools/pgtest>
<sh src/tools/pgtest -n>

*** ./expected/triggers.out    Thu Jul 24 11:52:50 2003
--- ./results/triggers.out    Fri Jul 25 21:20:34 2003
***************
*** 92,97 ****
--- 92,98 ----
  DROP TABLE pkeys;
  DROP TABLE fkeys;
  DROP TABLE fkeys2;
+ ERROR:  could not open relation with OID 119498
  -- -- I've disabled the funny_dup17 test because the new semantics
  -- -- of AFTER ROW triggers, which get now fired at the end of a
  -- -- query always, cause funny_dup17 to enter an endless loop.

======================================================================

*** ./expected/sanity_check.out    Wed May 28 10:04:02 2003
--- ./results/sanity_check.out    Fri Jul 25 21:20:37 2003
***************
*** 15,20 ****
--- 15,21 ----
   bt_name_heap        | t
   bt_txt_heap         | t
   fast_emp4000        | t
+  fkeys2              | t
   func_index_heap     | t
   hash_f8_heap        | t
   hash_i4_heap        | t
***************
*** 62,68 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (52 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have
--- 63,69 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (53 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================

*** ./expected/misc.out    Fri Jul 25 21:14:51 2003
--- ./results/misc.out    Fri Jul 25 21:20:39 2003
***************
*** 598,603 ****
--- 598,604 ----
   equipment_r
   f_star
   fast_emp4000
+  fkeys2
   float4_tbl
   float8_tbl
   func_index_heap
***************
*** 660,666 ****
   toyemp
   varchar_tbl
   xacttest
! (96 rows)

  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');
--- 661,667 ----
   toyemp
   varchar_tbl
   xacttest
! (97 rows)

  --SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))) AS equip_name;
  SELECT hobbies_by_name('basketball');

======================================================================


-- 21:23:44 up 8 days,  1:24,  2 users,  load average: 0.11, 1.04, 1.31

Re: parallel regression test failure

From
Robert Creager
Date:
On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am seeing the following parallel regression test failures.  Any
> > idea on the cause?
> 
> I don't see it here, on either of two different architectures.  Maybe
> you need a make distclean and rebuild?
> 

And another failure:

*** ./expected/constraints.out    Fri Jul 25 21:14:51 2003
--- ./results/constraints.out    Fri Jul 25 21:34:09 2003
***************
*** 212,244 ****
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
  INSERT INTO tmp VALUES (null, 'Y', null);
  INSERT INTO tmp VALUES (5, '!check failed', null);
  INSERT INTO tmp VALUES (null, 'try again', null);
  INSERT INTO INSERT_TBL(y) select yd from tmp;
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x |       y       | z
! -------+---+---------------+----
!        | 4 | Y             | -4
!        | 5 | !check failed | -5
!        | 6 | try again     | -6
! (3 rows)

  INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  new row for relation "insert_tbl" violates CHECK constraint "insert_con"
  SELECT '' AS four, * FROM INSERT_TBL;
   four | x |       y       | z
! ------+---+---------------+----
!       | 4 | Y             | -4
!       | 5 | !check failed | -5
!       | 6 | try again     | -6
!       |   | try again     |
!       | 7 | try again     | -7
! (5 rows)

  DROP TABLE tmp;
  --
  -- Check constraints on UPDATE
  --
--- 212,244 ----
  DROP SEQUENCE INSERT_SEQ;
  CREATE SEQUENCE INSERT_SEQ START 4;
  CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
+ ERROR:  relation 126260 deleted while still in use
  INSERT INTO tmp VALUES (null, 'Y', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (5, '!check failed', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO tmp VALUES (null, 'try again', null);
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y) select yd from tmp;
+ ERROR:  relation "tmp" does not exist
  SELECT '' AS three, * FROM INSERT_TBL;
   three | x | y | z
! -------+---+---+---
! (0 rows)

  INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
+ ERROR:  relation "tmp" does not exist
  INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  relation "tmp" does not exist
  SELECT '' AS four, * FROM INSERT_TBL;
   four | x | y | z
! ------+---+---+---
! (0 rows)

  DROP TABLE tmp;
+ ERROR:  table "tmp" does not exist
  --
  -- Check constraints on UPDATE
  --
***************
*** 246,261 ****
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
- ERROR:  new row for relation "insert_tbl" violates CHECK constraint "insert_con"
  SELECT * FROM INSERT_TBL;
   x |       y       | z
! ---+---------------+----
!  4 | Y             | -4
!    | try again     |
!  7 | try again     | -7
!  5 | !check failed |
!  6 | try again     | -6
! (5 rows)

  -- DROP TABLE INSERT_TBL;
  --
--- 246,255 ----
  UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
  UPDATE INSERT_TBL SET x = -z, z = -x;
  UPDATE INSERT_TBL SET x = z, z = x;
  SELECT * FROM INSERT_TBL;
   x | y | z
! ---+---+---
! (0 rows)

  -- DROP TABLE INSERT_TBL;
  --

======================================================================



-- 21:34:48 up 8 days,  1:35,  2 users,  load average: 0.89, 0.65, 0.85

Re: parallel regression test failure

From
Robert Creager
Date:
On Fri, 25 Jul 2003 19:57:04 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am seeing the following parallel regression test failures.  Any
> > idea on the cause?
>
> I don't see it here, on either of two different architectures.  Maybe
> you need a make distclean and rebuild?
>

I've attached a little Perl script which runs pgtest over and over
(with -n option), checking for failures and saving the output
(runX.out) and the diffs (failX.diffs) in /tmp for each failing run.

Run it from the top level (as you would pgtest).

Later,
Rob

--
 22:25:11 up 8 days,  2:26,  2 users,  load average: 2.40, 1.61, 1.57

Attachment
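
The attachment itself is not reproduced in this archive. As a rough illustration only, a loop of the kind described (run the test command repeatedly, keep the output of each failing run, and print a running failure count) might look like the Python sketch below. The function name `run_repeatedly` is hypothetical, the `runX.out` naming is borrowed from the message, and treating a non-zero exit status as a failed run is an assumption; the original script was written in Perl and may have detected failures differently.

```python
import subprocess
from pathlib import Path


def run_repeatedly(cmd, runs, outdir):
    """Run cmd `runs` times, save the output of failing runs, return the failure count.

    Assumption: a failing run is signalled by a non-zero exit status.
    """
    outdir = Path(outdir)
    failed = 0
    for i in range(1, runs + 1):
        # Capture stdout and stderr together so a failing run can be saved whole.
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True)
        if proc.returncode != 0:
            failed += 1
            (outdir / f"run{i}.out").write_text(proc.stdout)
        pct = round(100 * failed / i)
        print(f"{i:5d} of {runs:5d} - failed {failed:5d} ({pct}%)")
    return failed
```

Called from the top of the source tree, something like `run_repeatedly(["sh", "src/tools/pgtest", "-n"], 25, "/tmp")` would correspond to the usage described in the message, assuming pgtest reports failure through its exit status.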

Re: parallel regression test failure

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I run it every night and it fails 25% of the time.

When did you start seeing the problem?

I just wasted an hour running eighty-some iterations of "make check"
on two different machines/OSes/architectures.  Zero failures.  I also
eyeballed recent changes in the relcache/catcache area, which seems to
be what's unhappy, without finding anything.

I think it's up to yunz as are seeing misbehavior to roll up your
sleeves and debug the problem.  There's nothing more I can do.
        regards, tom lane


Re: parallel regression test failure

From
Bruce Momjian
Date:
Let me get the patch queue applied, then use CVS to backtrack and find
the date it started failing.  I think you need a dual cpu machine to see
the failures.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I run it every night and it fails 25% of the time.
> 
> When did you start seeing the problem?
> 
> I just wasted an hour running eighty-some iterations of "make check"
> on two different machines/OSes/architectures.  Zero failures.  I also
> eyeballed recent changes in the relcache/catcache area, which seems to
> be what's unhappy, without finding anything.
> 
> I think it's up to yunz as are seeing misbehavior to roll up your
> sleeves and debug the problem.  There's nothing more I can do.
> 
>             regards, tom lane
> 



Re: parallel regression test failure

From
Robert Creager
Date:
On Sat, 26 Jul 2003 01:00:46 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I run it every night and it fails 25% of the time.
> 
> When did you start seeing the problem?
> 
> I just wasted an hour running eighty-some iterations of "make check"
> on two different machines/OSes/architectures.  Zero failures.  I also
> eyeballed recent changes in the relcache/catcache area, which seems to
> be what's unhappy, without finding anything.
> 
> I think it's up to yunz as are seeing misbehavior to roll up your
> sleeves and debug the problem.  There's nothing more I can do.
> 

Any suggestions for those of us who are not pg developers how I might
help figure out what's up?

    1 of    25 - failed     0 (0%)
    2 of    25 - failed     0 (0%)
    3 of    25 - failed     0 (0%)
    4 of    25 - failed     0 (0%)
    5 of    25 - failed     0 (0%)
    6 of    25 - failed     0 (0%)
    7 of    25 - failed     0 (0%)
    8 of    25 - failed     0 (0%)
    9 of    25 - failed     0 (0%)
   10 of    25 - failed     0 (0%)
   11 of    25 - failed     1 (9%)
   12 of    25 - failed     2 (17%)
   13 of    25 - failed     2 (15%)
   14 of    25 - failed     2 (14%)
   15 of    25 - failed     3 (20%)
   16 of    25 - failed     3 (19%)
   17 of    25 - failed     3 (18%)
   18 of    25 - failed     4 (22%)
   19 of    25 - failed     4 (21%)
   20 of    25 - failed     4 (20%)
   21 of    25 - failed     5 (24%)
   22 of    25 - failed     6 (27%)
   23 of    25 - failed     6 (26%)
   24 of    25 - failed     7 (29%)
   25 of    25 - failed     8 (32%)
constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times


-- 08:21:18 up 8 days, 12:22,  2 users,  load average: 0.08, 0.65, 1.58

Re: parallel regression test failure

From
Bruce Momjian
Date:
I am going to use cvs -d to pull an older CVS and see if that fails, so
we can track down the date it started failing.

---------------------------------------------------------------------------

Robert Creager wrote:
-- Start of PGP signed section.
> 
> On Sat, 26 Jul 2003 01:00:46 -0400
> Tom Lane <tgl@sss.pgh.pa.us> said something like:
> 
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > I run it every night and it fails 25% of the time.
> > 
> > When did you start seeing the problem?
> > 
> > I just wasted an hour running eighty-some iterations of "make check"
> > on two different machines/OSes/architectures.  Zero failures.  I also
> > eyeballed recent changes in the relcache/catcache area, which seems to
> > be what's unhappy, without finding anything.
> > 
> > I think it's up to yunz as are seeing misbehavior to roll up your
> > sleeves and debug the problem.  There's nothing more I can do.
> > 
> 
> Any suggestions for those of us who are not pg developers how I might
> help figure out what's up?
> 
>     1 of    25 - failed     0 (0%)
>     2 of    25 - failed     0 (0%)
>     3 of    25 - failed     0 (0%)
>     4 of    25 - failed     0 (0%)
>     5 of    25 - failed     0 (0%)
>     6 of    25 - failed     0 (0%)
>     7 of    25 - failed     0 (0%)
>     8 of    25 - failed     0 (0%)
>     9 of    25 - failed     0 (0%)
>    10 of    25 - failed     0 (0%)
>    11 of    25 - failed     1 (9%)
>    12 of    25 - failed     2 (17%)
>    13 of    25 - failed     2 (15%)
>    14 of    25 - failed     2 (14%)
>    15 of    25 - failed     3 (20%)
>    16 of    25 - failed     3 (19%)
>    17 of    25 - failed     3 (18%)
>    18 of    25 - failed     4 (22%)
>    19 of    25 - failed     4 (21%)
>    20 of    25 - failed     4 (20%)
>    21 of    25 - failed     5 (24%)
>    22 of    25 - failed     6 (27%)
>    23 of    25 - failed     6 (26%)
>    24 of    25 - failed     7 (29%)
>    25 of    25 - failed     8 (32%)
> constraints failed 1 times
> cluster failed 1 times
> foreign_key failed 1 times
> misc failed 6 times
> sanity_check failed 3 times
> inherit failed 2 times
> triggers failed 4 times
> 
> 
> -- 
>  08:21:18 up 8 days, 12:22,  2 users,  load average: 0.08, 0.65, 1.58
-- End of PGP section, PGP failed!



Re: parallel regression test failure

From
Bruce Momjian
Date:
If you would like to do the cvs -d testing yourself instead of me, let
me know.  It will take me a few hours to get to it anyway.

---------------------------------------------------------------------------

Robert Creager wrote:
-- Start of PGP signed section.
> 
> On Sat, 26 Jul 2003 01:00:46 -0400
> Tom Lane <tgl@sss.pgh.pa.us> said something like:
> 
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > I run it every night and it fails 25% of the time.
> > 
> > When did you start seeing the problem?
> > 
> > I just wasted an hour running eighty-some iterations of "make check"
> > on two different machines/OSes/architectures.  Zero failures.  I also
> > eyeballed recent changes in the relcache/catcache area, which seems to
> > be what's unhappy, without finding anything.
> > 
> > I think it's up to yunz as are seeing misbehavior to roll up your
> > sleeves and debug the problem.  There's nothing more I can do.
> > 
> 
> Any suggestions for those of us who are not pg developers how I might
> help figure out what's up?
> 
>     1 of    25 - failed     0 (0%)
>     2 of    25 - failed     0 (0%)
>     3 of    25 - failed     0 (0%)
>     4 of    25 - failed     0 (0%)
>     5 of    25 - failed     0 (0%)
>     6 of    25 - failed     0 (0%)
>     7 of    25 - failed     0 (0%)
>     8 of    25 - failed     0 (0%)
>     9 of    25 - failed     0 (0%)
>    10 of    25 - failed     0 (0%)
>    11 of    25 - failed     1 (9%)
>    12 of    25 - failed     2 (17%)
>    13 of    25 - failed     2 (15%)
>    14 of    25 - failed     2 (14%)
>    15 of    25 - failed     3 (20%)
>    16 of    25 - failed     3 (19%)
>    17 of    25 - failed     3 (18%)
>    18 of    25 - failed     4 (22%)
>    19 of    25 - failed     4 (21%)
>    20 of    25 - failed     4 (20%)
>    21 of    25 - failed     5 (24%)
>    22 of    25 - failed     6 (27%)
>    23 of    25 - failed     6 (26%)
>    24 of    25 - failed     7 (29%)
>    25 of    25 - failed     8 (32%)
> constraints failed 1 times
> cluster failed 1 times
> foreign_key failed 1 times
> misc failed 6 times
> sanity_check failed 3 times
> inherit failed 2 times
> triggers failed 4 times
> 
> 
> -- 
>  08:21:18 up 8 days, 12:22,  2 users,  load average: 0.08, 0.65, 1.58
-- End of PGP section, PGP failed!



Re: parallel regression test failure

From
Robert Creager
Date:
On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

> 
> If you would like to do the cvs -d testing yourself instead of me, let
> me know.  It will take me a few hours to get to it anyway.
> 

I will start pulling down old versions (once I figure out the -d
syntax).  Do you recall how long you may have been seeing this?

Thanks,
Rob

-- 08:54:59 up 8 days, 12:55,  2 users,  load average: 2.38, 1.12, 1.14

Re: parallel regression test failure

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think you need a dual cpu machine to see the failures.

I was wondering about that myself, but we shouldn't fixate on that
assumption without more evidence.  There could be some other factor
explaining why I can't reproduce it.  A couple of questions for both
of you:
  - what configure options are you using?
  - can you reproduce the problem with serial tests (make installcheck)?
  - exactly how repeatable is it --- when it fails, is it always at the
    same places, or do the failures move around?

It would also be good to find out exactly where the failures are coming
from.  Please try running the tests with LOG_ERROR_VERBOSITY set to
VERBOSE (probably the easiest way to hack this in make check's temp
installation is to modify src/backend/utils/misc/postgresql.conf.sample).
Then the postmaster log file created by make check will show the elog
calls' locations.
        regards, tom lane


Re: parallel regression test failure

From
Robert Creager
Date:
On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

> 
> If you would like to do the cvs -d testing yourself instead of me, let
> me know.  It will take me a few hours to get to it anyway.
> 

Just to make sure I've got this right:

cvs update -D yyyy-mm-dd
make maintainer-clean
./configure
make
test

-- 09:05:56 up 8 days, 13:06,  2 users,  load average: 2.59, 2.90, 2.14

Re: parallel regression test failure

From
Bruce Momjian
Date:
Robert Creager wrote:
-- Start of PGP signed section.
> On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
> Bruce Momjian <pgman@candle.pha.pa.us> said something like:
> 
> > 
> > If you would like to do the cvs -d testing yourself instead of me, let
> > me know.  It will take me a few hours to get to it anyway.
> > 
> 
> I will start pulling down old versions (once I figure out the -d
> syntax).  Do you recall how long you may have been seeing this?

I think you just take a CVS checkout and do:
cvs update -D '2003-05-01 00:00:00 GMT' pgsql

and keep changing the dates to find the date it started breaking.
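
Because the failure is intermittent, a single clean run at a given date proves little. The Python sketch below shows one way to organize the search: a bisection over dates, plus an estimate of how many clean runs are needed before a date is treated as good. Both function names are hypothetical, `is_bad` stands for whatever update, rebuild and repeated-test procedure is used, and the estimate assumes independent runs that each fail with the same probability once the bug is present.

```python
import math
from datetime import date, timedelta


def runs_needed(fail_rate, confidence=0.95):
    """Clean runs needed before calling a date 'good' with the given confidence.

    Assumes independent runs, each failing with probability fail_rate
    when the bug is present.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


def first_bad_date(good, bad, is_bad):
    """Bisect between a known-good and a known-bad date.

    is_bad(d) is a caller-supplied check for date d (hypothetical here);
    it should repeat the tests enough times to trust a clean result.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad
```

Under these assumptions, a per-run failure rate of about 25% (the figure mentioned earlier in the thread) gives `runs_needed(0.25) == 11`, that is, eleven consecutive clean runs before a date is treated as good.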



Re: parallel regression test failure

From
Bruce Momjian
Date:
Yep, I think that is it, though the last one is pgtest or whatever you
are using for testing.

---------------------------------------------------------------------------

Robert Creager wrote:
-- Start of PGP signed section.
> On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
> Bruce Momjian <pgman@candle.pha.pa.us> said something like:
> 
> > 
> > If you would like to do the cvs -d testing yourself instead of me, let
> > me know.  It will take me a few hours to get to it anyway.
> > 
> 
> Just to make sure I've got this right:
> 
> cvs update -D yyyy-mm-dd
> make maintainer-clean
> ./configure
> make
> test
> 
> -- 
>  09:05:56 up 8 days, 13:06,  2 users,  load average: 2.59, 2.90, 2.14
-- End of PGP section, PGP failed!



Re: parallel regression test failure

From
Bruce Momjian
Date:
Robert Creager wrote:
-- Start of PGP signed section.
> On Sat, 26 Jul 2003 10:47:12 -0400 (EDT)
> Bruce Momjian <pgman@candle.pha.pa.us> said something like:
> 
> > 
> > If you would like to do the cvs -d testing yourself instead of me, let
> > me know.  It will take me a few hours to get to it anyway.
> > 
> 
> I will start pulling down old versions (once I figure out the -d
> syntax).  Do you recall how long you may have been seeing this?

Since it is random, I hadn't noticed when it started, and originally
suspected my hardware.  I recently upgraded my hardware, around May 1, I
think.



Re: parallel regression test failure

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think you need a dual cpu machine to see the failures.
> 
> I was wondering about that myself, but we shouldn't fixate on that
> assumption without more evidence.  There could be some other factor
> explaining why I can't reproduce it.  A couple of questions for both
> of you:
>   - what configure options are you using?

configure \
    --with-x \
    --with-threads \
    --with-tcl \
    --with-perl \
    --with-python \
    --enable-pltcl-unknown \
    --with-tclconfig=/u/lib \
    --with-tkconfig=/u/lib \
    --enable-cassert \
    --with-includes="/usr/local/include/readline /usr/contrib/include" \
    --with-libraries="/usr/local/lib /usr/contrib/lib" \
    --enable-locale \
    --enable-multibyte \
    --with-recode \
    --with-openssl


>   - can you reproduce the problem with serial tests (make installcheck)?

No, I have never seen a serial failure, and when I get a parallel
failure, I run the serial to make sure it is just the parallel test, and
serial always passes.

>   - exactly how repeatable is it --- when it fails, is it always at the
>     same places, or do the failures move around?

No, different, as reported by Robert, but it usually has to do with the
constraint, trigger, and sanity tests.  I assumed we just had a dependency
in the parallel regression tests and only needed an adjustment,
but looking at the diffs more closely, I see it is more serious.

> It would also be good to find out exactly where the failures are coming
> from.  Please try running the tests with LOG_ERROR_VERBOSITY set to
> VERBOSE (probably the easiest way to hack this in make check's temp
> installation is to modify src/backend/utils/misc/postgresql.conf.sample).
> Then the postmaster log file created by make check will show the elog
> calls' locations.

OK.



Re: parallel regression test failure

From
Tom Lane
Date:
Robert Creager <Robert_Creager@LogicalChaos.org> writes:
> Just to make sure I've got this right:

> cvs update -D yyyy-mm-dd
> make maintainer-clean
> ./configure
> make
> test

I'd do the "make maintainer-clean" before cvs update'ing, but otherwise
probably right.  Watch the output the first couple times and make sure
cvs is actually willing to replace files in both the forward and
backward directions.
        regards, tom lane
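
The ordering suggested above (clean first, then update, then rebuild and test) can be written down as a small helper that only builds the command list for one date. This is a sketch under assumptions: the function name `commands_for_date` is hypothetical, and `make check` as the final step is an assumption that should be replaced by whatever test driver is actually in use.

```python
from datetime import date


def commands_for_date(day, test_cmd=("make", "check")):
    """Return the per-date command sequence, cleaning before updating.

    test_cmd is an assumption; substitute the test driver actually used.
    """
    return [
        ["make", "maintainer-clean"],              # clean first, as suggested above
        ["cvs", "update", "-D", day.isoformat()],  # then move the tree to the target date
        ["./configure"],
        ["make"],
        list(test_cmd),
    ]
```

Each entry could be passed to `subprocess.run(..., check=True)` in turn, watching the cvs output on the first few dates to confirm that files are replaced in both directions.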


Re: parallel regression test failure

From
Robert Creager
Date:
./configure --with-pgport=5433 --prefix=/usr/local/pgsql_cvs

The failure moves around (out of 25 tests):

constraints failed 1 times
cluster failed 1 times
foreign_key failed 1 times
misc failed 6 times
sanity_check failed 3 times
inherit failed 2 times
triggers failed 4 times

Have not tried install check yet.

On Sat, 26 Jul 2003 11:06:21 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

>   - what configure options are you using?
>   - can you reproduce the problem with serial tests (make installcheck)?
>   - exactly how repeatable is it --- when it fails, is it always at the
>     same places, or do the failures move around?
> 

-- 09:22:25 up 8 days, 13:23,  2 users,  load average: 1.36, 1.26, 1.70

Re: parallel regression test failure

From
Robert Creager
Date:
On Sat, 26 Jul 2003 11:22:21 -0400
Tom Lane <tgl@sss.pgh.pa.us> said something like:

> Robert Creager <Robert_Creager@LogicalChaos.org> writes:
> > Just to make sure I've got this right:
> 
> > cvs update -D yyyy-mm-dd
> > make maintainer-clean
> > ./configure
> > make
> > test
> 
> I'd do the "make maintainer-clean" before cvs update'ing, but
> otherwise probably right.  Watch the output the first couple times and
> make sure cvs is actually willing to replace files in both the forward
> and backward directions.
> 

Yeah, and yeah, it just removed src/tools/pgtest when I went back to
April...

-- 09:36:18 up 8 days, 13:37,  2 users,  load average: 0.08, 0.86, 1.54

Re: parallel regression test failure

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think you need a dual cpu machine to see the failures.
> 
> I was wondering about that myself, but we shouldn't fixate on that
> assumption without more evidence.  There could be some other factor
> explaining why I can't reproduce it.  A couple of questions for both
> of you:
>   - what configure options are you using?
>   - can you reproduce the problem with serial tests (make installcheck)?
>   - exactly how repeatable is it --- when it fails, is it always at the
>     same places, or do the failures move around?
> 
> It would also be good to find out exactly where the failures are coming
> from.  Please try running the tests with LOG_ERROR_VERBOSITY set to
> VERBOSE (probably the easiest way to hack this in make check's temp
> installation is to modify src/backend/utils/misc/postgresql.conf.sample).
> Then the postmaster log file created by make check will show the elog
> calls' locations.

OK, I got a failure with verbose output.  Error was:

*** ./expected/triggers.out    Fri Jul 25 12:38:34 2003
--- ./results/triggers.out    Sat Jul 26 12:52:02 2003
***************
*** 66,71 ****
--- 66,72 ----
  ERROR:  tuple references non-existent key
  DETAIL:  Trigger "check_fkeys2_pkey_exist" found tuple referencing non-existent key in "pkeys".
  insert into fkeys values (10, '1', 2);
+ ERROR:  could not open relation with OID 119980
  insert into fkeys values (30, '3', 3);
  insert into fkeys values (40, '4', 2);
  insert into fkeys values (50, '5', 2);
***************
*** 87,93 ****
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
  update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;
--- 88,94 ----
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
  ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
  update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1';
! NOTICE:  check_pkeys_fkey_cascade: 0 tuple(s) of fkeys are deleted
  NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
  DROP TABLE pkeys;
  DROP TABLE fkeys;

======================================================================

and logs show:

ERROR:  23514: new row for relation "check_tbl" violates CHECK
constraint "check_con"
LOCATION:  ExecConstraints, execMain.c:1698
ERROR:  09000: tuple references non-existent key
DETAIL:  Trigger "check_fkeys2_pkey_exist" found tuple referencing
non-existent key in "pkeys".
LOCATION:  check_primary_key, refint.c:214
ERROR:  23514: new row for relation "check_tbl" violates CHECK
constraint "check_con"
LOCATION:  ExecConstraints, execMain.c:1698

ERROR:  XX000: could not open relation with OID 119980
LOCATION:  relation_open, heapam.c:459

ERROR:  23502: null value for attribute "aa" violates NOT NULL
constraint
LOCATION:  ExecConstraints, execMain.c:1686
ERROR:  09000: tuple references non-existent key



Re: parallel regression test failure

From
Bruce Momjian
Date:
That is a very good guess.  All the errors seem related to the parser.

---------------------------------------------------------------------------

Robert Creager wrote:
-- Start of PGP signed section.
> 
> Could the failures have something to do with bison level?  2003-02-01
> would not compile with 1.875, but compiles with 1.5.  Which is running
> now...
> 
> Later,
> Rob
> 
> On Sat, 26 Jul 2003 14:12:35 -0400 (EDT)
> Bruce Momjian <pgman@candle.pha.pa.us> said something like:
> 
> > 
> > I just reproduced the same failure for the same date.  Let me try
> > another date here.
> > 
> > ---------------------------------------------------------------------------
> > 
> > Robert Creager wrote:
> > -- Start of PGP signed section.
> > > On Sat, 26 Jul 2003 11:09:54 -0400 (EDT)
> > > Bruce Momjian <pgman@candle.pha.pa.us> said something like:
> > > 
> > > > 
> > > > I think you just take a CVS checkout and do:
> > > > 
> > > >     cvs update -D '2003-05-01 00:00:00 GMT' pgsql
> > > > 
> > > > and keep changing the dates to find the date it started breaking.
> > > > 
> > > 
> > > I just want to make sure I'm not chasing my tail.
> > > 
> > > I just went to 2002-12-01 in an empty directory, and had the
> > > following failures:
> > > 
> > > *** ./expected/strings.out    Sun Sep 22 11:27:25 2002
> > > --- ./results/strings.out    Sat Jul 26 11:20:22 2003
> > > ***************
> > > *** 18,24 ****
> > >   ' - next line' /* this comment is not allowed here */
> > >   ' - third line'
> > >       AS "Illegal comment within continuation";
> > > ! ERROR:  parser: parse error at or near "' - third line'" at character 75
> > >   --
> > >   -- test conversions between various string types
> > >   -- E021-10 implicit casting among the character data types
> > > --- 18,24 ----
> > >   ' - next line' /* this comment is not allowed here */
> > >   ' - third line'
> > >       AS "Illegal comment within continuation";
> > > ! ERROR:  parser: syntax error at or near "' - third line'" at character 75
> > >   --
> > >   -- test conversions between various string types
> > >   -- E021-10 implicit casting among the character data types
> > > 
> > > ======================================================================
> > > 
> > > *** ./expected/geometry.out    Fri Nov  8 13:09:55 2002
> > > --- ./results/geometry.out    Sat Jul 26 11:20:23 2003
> > > ***************
> > > *** 258,281 ****
> > >    twenty |                               rotation
> > >   --------+----------------------------------------------------------------------
> > >           | (0,-0),(-0.2,-0.2)
> > > -         | (-0.1,-0.1),(-0.3,-0.3)
> > > -         | (-0.25,-0.25),(-0.25,-0.35)
> > > -         | (-0.3,-0.3),(-0.3,-0.3)
> > >           | (0.08,-0),(0,-0.56)
> > > -         | (0.12,-0.28),(0.04,-0.84)
> > > -         | (0.26,-0.7),(0.1,-0.82)
> > > -         | (0.12,-0.84),(0.12,-0.84)
> > >           | (0.0651176557644,0),(0,-0.0483449262493)
> > > -         | (0.0976764836466,-0.0241724631247),(0.0325588278822,-0.072517389374)
> > > -         | (0.109762715209,-0.0562379754329),(0.0813970697055,-0.0604311578117)
> > > -         | (0.0976764836466,-0.072517389374),(0.0976764836466,-0.072517389374)
> > >           | (-0,0.0828402366864),(-0.201183431953,0)
> > > -         | (-0.100591715976,0.12426035503),(-0.301775147929,0.0414201183432)
> > > -         | (-0.251479289941,0.103550295858),(-0.322485207101,0.0739644970414)
> > > -         | (-0.301775147929,0.12426035503),(-0.301775147929,0.12426035503)
> > >           | (0.2,0),(0,0)
> > >           | (0.3,0),(0.1,0)
> > >           | (0.3,0.05),(0.25,0)
> > >           | (0.3,0),(0.3,0)
> > >   (20 rows)
> > >   
> > > --- 258,281 ----
> > >    twenty |                               rotation
> > >   --------+----------------------------------------------------------------------
> > >           | (0,-0),(-0.2,-0.2)
> > >           | (0.08,-0),(0,-0.56)
> > >           | (0.0651176557644,0),(0,-0.0483449262493)
> > >           | (-0,0.0828402366864),(-0.201183431953,0)
> > >           | (0.2,0),(0,0)
> > > +         | (-0.1,-0.1),(-0.3,-0.3)
> > > +         | (0.12,-0.28),(0.04,-0.84)
> > > +         | (0.0976764836466,-0.0241724631247),(0.0325588278822,-0.072517389374)
> > > +         | (-0.100591715976,0.12426035503),(-0.301775147929,0.0414201183432)
> > >           | (0.3,0),(0.1,0)
> > > +         | (-0.25,-0.25),(-0.25,-0.35)
> > > +         | (0.26,-0.7),(0.1,-0.82)
> > > +         | (0.109762715209,-0.0562379754329),(0.0813970697055,-0.0604311578117)
> > > +         | (-0.251479289941,0.103550295858),(-0.322485207101,0.0739644970414)
> > >           | (0.3,0.05),(0.25,0)
> > > +         | (-0.3,-0.3),(-0.3,-0.3)
> > > +         | (0.12,-0.84),(0.12,-0.84)
> > > +         | (0.0976764836466,-0.072517389374),(0.0976764836466,-0.072517389374)
> > > +         | (-0.301775147929,0.12426035503),(-0.301775147929,0.12426035503)
> > >           | (0.3,0),(0.3,0)
> > >   (20 rows)
> > >   
> > > 
> > > ======================================================================
> > > 
> > > *** ./expected/create_function_1.out    Sat Jul 26 11:19:18 2003
> > > --- ./results/create_function_1.out    Sat Jul 26 11:20:24 2003
> > > ***************
> > > *** 51,57 ****
> > >   ERROR:  return type mismatch in function: declared to return integer, returns "unknown"
> > >   CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
> > >       AS 'not even SQL';
> > > ! ERROR:  parser: parse error at or near "not" at character 1
> > >   CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
> > >       AS 'SELECT 1, 2, 3;';
> > >   ERROR:  function declared to return integer returns multiple columns in final SELECT
> > > --- 51,57 ----
> > >   ERROR:  return type mismatch in function: declared to return integer, returns "unknown"
> > >   CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
> > >       AS 'not even SQL';
> > > ! ERROR:  parser: syntax error at or near "not" at character 1
> > >   CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
> > >       AS 'SELECT 1, 2, 3;';
> > >   ERROR:  function declared to return integer returns multiple columns in final SELECT
> > > 
> > > ======================================================================
> > > 
> > > *** ./expected/constraints.out    Sat Jul 26 11:19:18 2003
> > > --- ./results/constraints.out    Sat Jul 26 11:20:26 2003
> > > ***************
> > > *** 45,56 ****
> > >   -- syntax errors
> > >   --  test for extraneous comma
> > >   CREATE TABLE error_tbl (i int DEFAULT (100, ));
> > > ! ERROR:  parser: parse error at or near "," at character 43
> > >   --  this will fail because gram.y uses b_expr not a_expr for defaults,
> > >   --  to avoid a shift/reduce conflict that arises from NOT NULL being
> > >   --  part of the column definition syntax:
> > >   CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
> > > ! ERROR:  parser: parse error at or near "IN" at character 43
> > >   --  this should work, however:
> > >   CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2)));
> > >   DROP TABLE error_tbl;
> > > --- 45,56 ----
> > >   -- syntax errors
> > >   --  test for extraneous comma
> > >   CREATE TABLE error_tbl (i int DEFAULT (100, ));
> > > ! ERROR:  parser: syntax error at or near "," at character 43
> > >   --  this will fail because gram.y uses b_expr not a_expr for defaults,
> > >   --  to avoid a shift/reduce conflict that arises from NOT NULL being
> > >   --  part of the column definition syntax:
> > >   CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
> > > ! ERROR:  parser: syntax error at or near "IN" at character 43
> > >   --  this should work, however:
> > >   CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2)));
> > >   DROP TABLE error_tbl;
> > > 
> > > ======================================================================
> > > 
> > > *** ./expected/errors.out    Mon Sep  2 00:05:16 2002
> > > --- ./results/errors.out    Sat Jul 26 11:20:28 2003
> > > ***************
> > > *** 22,34 ****
> > >    
> > >   -- missing relation name 
> > >   select;
> > > ! ERROR:  parser: parse error at or near ";" at character 7
> > >   -- no such relation 
> > >   select * from nonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > >   -- missing target list
> > >   select from pg_database;
> > > ! ERROR:  parser: parse error at or near "from" at character 8
> > >   -- bad name in target list
> > >   select nonesuch from pg_database;
> > >   ERROR:  Attribute "nonesuch" not found
> > > --- 22,34 ----
> > >    
> > >   -- missing relation name 
> > >   select;
> > > ! ERROR:  parser: syntax error at or near ";" at character 7
> > >   -- no such relation 
> > >   select * from nonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > >   -- missing target list
> > >   select from pg_database;
> > > ! ERROR:  parser: syntax error at or near "from" at character 8
> > >   -- bad name in target list
> > >   select nonesuch from pg_database;
> > >   ERROR:  Attribute "nonesuch" not found
> > > ***************
> > > *** 40,46 ****
> > >   ERROR:  Attribute "nonesuch" not found
> > >   -- bad select distinct on syntax, distinct attribute missing
> > >   select distinct on (foobar) from pg_database;
> > > ! ERROR:  parser: parse error at or near "from" at character 29
> > >   -- bad select distinct on syntax, distinct attribute not in target list
> > >   select distinct on (foobar) * from pg_database;
> > >   ERROR:  Attribute "foobar" not found
> > > --- 40,46 ----
> > >   ERROR:  Attribute "nonesuch" not found
> > >   -- bad select distinct on syntax, distinct attribute missing
> > >   select distinct on (foobar) from pg_database;
> > > ! ERROR:  parser: syntax error at or near "from" at character 29
> > >   -- bad select distinct on syntax, distinct attribute not in target list
> > >   select distinct on (foobar) * from pg_database;
> > >   ERROR:  Attribute "foobar" not found
> > > ***************
> > > *** 49,55 ****
> > >    
> > >   -- missing relation name (this had better not wildcard!) 
> > >   delete from;
> > > ! ERROR:  parser: parse error at or near ";" at character 12
> > >   -- no such relation 
> > >   delete from nonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > > --- 49,55 ----
> > >    
> > >   -- missing relation name (this had better not wildcard!) 
> > >   delete from;
> > > ! ERROR:  parser: syntax error at or near ";" at character 12
> > >   -- no such relation 
> > >   delete from nonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > > ***************
> > > *** 58,64 ****
> > >    
> > >   -- missing relation name (this had better not wildcard!) 
> > >   drop table;
> > > ! ERROR:  parser: parse error at or near ";" at character 11
> > >   -- no such relation 
> > >   drop table nonesuch;
> > >   ERROR:  table "nonesuch" does not exist
> > > --- 58,64 ----
> > >    
> > >   -- missing relation name (this had better not wildcard!) 
> > >   drop table;
> > > ! ERROR:  parser: syntax error at or near ";" at character 11
> > >   -- no such relation 
> > >   drop table nonesuch;
> > >   ERROR:  table "nonesuch" does not exist
> > > ***************
> > > *** 68,74 ****
> > >   -- relation renaming 
> > >   -- missing relation name 
> > >   alter table rename;
> > > ! ERROR:  parser: parse error at or near ";" at character 19
> > >   -- no such relation 
> > >   alter table nonesuch rename to newnonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > > --- 68,74 ----
> > >   -- relation renaming 
> > >   -- missing relation name 
> > >   alter table rename;
> > > ! ERROR:  parser: syntax error at or near ";" at character 19
> > >   -- no such relation 
> > >   alter table nonesuch rename to newnonesuch;
> > >   ERROR:  Relation "nonesuch" does not exist
> > > ***************
> > > *** 122,131 ****
> > >    
> > >   -- missing index name 
> > >   drop index;
> > > ! ERROR:  parser: parse error at or near ";" at character 11
> > >   -- bad index name 
> > >   drop index 314159;
> > > ! ERROR:  parser: parse error at or near "314159" at character 12
> > >   -- no such index 
> > >   drop index nonesuch;
> > >   ERROR:  index "nonesuch" does not exist
> > > --- 122,131 ----
> > >    
> > >   -- missing index name 
> > >   drop index;
> > > ! ERROR:  parser: syntax error at or near ";" at character 11
> > >   -- bad index name 
> > >   drop index 314159;
> > > ! ERROR:  parser: syntax error at or near "314159" at character 12
> > >   -- no such index 
> > >   drop index nonesuch;
> > >   ERROR:  index "nonesuch" does not exist
> > > ***************
> > > *** 134,146 ****
> > >    
> > >   -- missing aggregate name 
> > >   drop aggregate;
> > > ! ERROR:  parser: parse error at or near ";" at character 15
> > >   -- missing aggregate type
> > >   drop aggregate newcnt1;
> > > ! ERROR:  parser: parse error at or near ";" at character 23
> > >   -- bad aggregate name 
> > >   drop aggregate 314159 (int);
> > > ! ERROR:  parser: parse error at or near "314159" at character 16
> > >   -- bad aggregate type
> > >   drop aggregate newcnt (nonesuch);
> > >   ERROR:  Type "nonesuch" does not exist
> > > --- 134,146 ----
> > >    
> > >   -- missing aggregate name 
> > >   drop aggregate;
> > > ! ERROR:  parser: syntax error at or near ";" at character 15
> > >   -- missing aggregate type
> > >   drop aggregate newcnt1;
> > > ! ERROR:  parser: syntax error at or near ";" at character 23
> > >   -- bad aggregate name 
> > >   drop aggregate 314159 (int);
> > > ! ERROR:  parser: syntax error at or near "314159" at character 16
> > >   -- bad aggregate type
> > >   drop aggregate newcnt (nonesuch);
> > >   ERROR:  Type "nonesuch" does not exist
> > > ***************
> > > *** 155,164 ****
> > >    
> > >   -- missing function name 
> > >   drop function ();
> > > ! ERROR:  parser: parse error at or near "(" at character 15
> > >   -- bad function name 
> > >   drop function 314159();
> > > ! ERROR:  parser: parse error at or near "314159" at character 15
> > >   -- no such function 
> > >   drop function nonesuch();
> > >   ERROR:  RemoveFunction: function nonesuch() does not exist
> > > --- 155,164 ----
> > >    
> > >   -- missing function name 
> > >   drop function ();
> > > ! ERROR:  parser: syntax error at or near "(" at character 15
> > >   -- bad function name 
> > >   drop function 314159();
> > > ! ERROR:  parser: syntax error at or near "314159" at character 15
> > >   -- no such function 
> > >   drop function nonesuch();
> > >   ERROR:  RemoveFunction: function nonesuch() does not exist
> > > ***************
> > > *** 167,176 ****
> > >    
> > >   -- missing type name 
> > >   drop type;
> > > ! ERROR:  parser: parse error at or near ";" at character 10
> > >   -- bad type name 
> > >   drop type 314159;
> > > ! ERROR:  parser: parse error at or near "314159" at character 11
> > >   -- no such type 
> > >   drop type nonesuch;
> > >   ERROR:  Type "nonesuch" does not exist
> > > --- 167,176 ----
> > >    
> > >   -- missing type name 
> > >   drop type;
> > > ! ERROR:  parser: syntax error at or near ";" at character 10
> > >   -- bad type name 
> > >   drop type 314159;
> > > ! ERROR:  parser: syntax error at or near "314159" at character 11
> > >   -- no such type 
> > >   drop type nonesuch;
> > >   ERROR:  Type "nonesuch" does not exist
> > > ***************
> > > *** 179,200 ****
> > >    
> > >   -- missing everything 
> > >   drop operator;
> > > ! ERROR:  parser: parse error at or near ";" at character 14
> > >   -- bad operator name 
> > >   drop operator equals;
> > > ! ERROR:  parser: parse error at or near ";" at character 21
> > >   -- missing type list 
> > >   drop operator ===;
> > > ! ERROR:  parser: parse error at or near ";" at character 18
> > >   -- missing parentheses 
> > >   drop operator int4, int4;
> > > ! ERROR:  parser: parse error at or near "," at character 19
> > >   -- missing operator name 
> > >   drop operator (int4, int4);
> > > ! ERROR:  parser: parse error at or near "(" at character 15
> > >   -- missing type list contents 
> > >   drop operator === ();
> > > ! ERROR:  parser: parse error at or near ")" at character 20
> > >   -- no such operator 
> > >   drop operator === (int4);
> > >   ERROR:  parser: argument type missing (use NONE for unary operators)
> > > --- 179,200 ----
> > >    
> > >   -- missing everything 
> > >   drop operator;
> > > ! ERROR:  parser: syntax error at or near ";" at character 14
> > >   -- bad operator name 
> > >   drop operator equals;
> > > ! ERROR:  parser: syntax error at or near ";" at character 21
> > >   -- missing type list 
> > >   drop operator ===;
> > > ! ERROR:  parser: syntax error at or near ";" at character 18
> > >   -- missing parentheses 
> > >   drop operator int4, int4;
> > > ! ERROR:  parser: syntax error at or near "," at character 19
> > >   -- missing operator name 
> > >   drop operator (int4, int4);
> > > ! ERROR:  parser: syntax error at or near "(" at character 15
> > >   -- missing type list contents 
> > >   drop operator === ();
> > > ! ERROR:  parser: syntax error at or near ")" at character 20
> > >   -- no such operator 
> > >   drop operator === (int4);
> > >   ERROR:  parser: argument type missing (use NONE for unary operators)
> > > ***************
> > > *** 206,212 ****
> > >   ERROR:  parser: argument type missing (use NONE for unary operators)
> > >   -- no such type1
> > >   drop operator = ( , int4);
> > > ! ERROR:  parser: parse error at or near "," at character 19
> > >   -- no such type1 
> > >   drop operator = (nonesuch, int4);
> > >   ERROR:  Type "nonesuch" does not exist
> > > --- 206,212 ----
> > >   ERROR:  parser: argument type missing (use NONE for unary operators)
> > >   -- no such type1
> > >   drop operator = ( , int4);
> > > ! ERROR:  parser: syntax error at or near "," at character 19
> > >   -- no such type1 
> > >   drop operator = (nonesuch, int4);
> > >   ERROR:  Type "nonesuch" does not exist
> > > ***************
> > > *** 215,239 ****
> > >   ERROR:  Type "nonesuch" does not exist
> > >   -- no such type2 
> > >   drop operator = (int4, );
> > > ! ERROR:  parser: parse error at or near ")" at character 24
> > >   --
> > >   -- DROP RULE
> > >    
> > >   -- missing rule name 
> > >   drop rule;
> > > ! ERROR:  parser: parse error at or near ";" at character 10
> > >   -- bad rule name 
> > >   drop rule 314159;
> > > ! ERROR:  parser: parse error at or near "314159" at character 11
> > >   -- no such rule 
> > >   drop rule nonesuch on noplace;
> > >   ERROR:  Relation "noplace" does not exist
> > >   -- bad keyword 
> > >   drop tuple rule nonesuch;
> > > ! ERROR:  parser: parse error at or near "tuple" at character 6
> > >   -- no such rule 
> > >   drop instance rule nonesuch on noplace;
> > > ! ERROR:  parser: parse error at or near "instance" at character 6
> > >   -- no such rule 
> > >   drop rewrite rule nonesuch;
> > > ! ERROR:  parser: parse error at or near "rewrite" at character 6
> > > --- 215,239 ----
> > >   ERROR:  Type "nonesuch" does not exist
> > >   -- no such type2 
> > >   drop operator = (int4, );
> > > ! ERROR:  parser: syntax error at or near ")" at character 24
> > >   --
> > >   -- DROP RULE
> > >    
> > >   -- missing rule name 
> > >   drop rule;
> > > ! ERROR:  parser: syntax error at or near ";" at character 10
> > >   -- bad rule name 
> > >   drop rule 314159;
> > > ! ERROR:  parser: syntax error at or near "314159" at character 11
> > >   -- no such rule 
> > >   drop rule nonesuch on noplace;
> > >   ERROR:  Relation "noplace" does not exist
> > >   -- bad keyword 
> > >   drop tuple rule nonesuch;
> > > ! ERROR:  parser: syntax error at or near "tuple" at character 6
> > >   -- no such rule 
> > >   drop instance rule nonesuch on noplace;
> > > ! ERROR:  parser: syntax error at or near "instance" at character 6
> > >   -- no such rule 
> > >   drop rewrite rule nonesuch;
> > > ! ERROR:  parser: syntax error at or near "rewrite" at character 6
> > > 
> > > ======================================================================
> > > 
> > > 
> > > 
> > > -- 
> > >  11:22:06 up 8 days, 15:22,  2 users,  load average: 0.30, 0.51, 0.53
> > -- End of PGP section, PGP failed!
> > 
> > -- 
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 359-1001
> >   +  If your life is a hard drive,     |  13 Roberts Road
> >   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> > 
> > 
> 
> 
> -- 
>  12:41:22 up 8 days, 16:42,  2 users,  load average: 2.58, 1.05, 0.97
-- End of PGP section, PGP failed!



Re: parallel regression test failure

From
Robert Creager
Date:
On Sat, 26 Jul 2003 16:40:27 -0400 (EDT)
Bruce Momjian <pgman@candle.pha.pa.us> said something like:

> 
> That is a very good guess.  All the errors seem related to the parser.
>

Everyone gets lucky now and then ;-)

I'm now using bison 1.5

2003-01-22 did not fail in 50 tests.
2003-01-26 has not failed yet in 33 of 50 tests.

2003-01-28 and 2003-02-15 are compiled and waiting...

2003-02-01 fails, but only 2 times in 50 tests:

*** ./expected/domain.out    Sat Jul 26 12:24:18 2003
--- ./results/domain.out    Sat Jul 26 12:56:01 2003
***************
*** 263,269 ****
  insert into domcontest values (5);
  alter domain con drop constraint t;
  insert into domcontest values (-5); --fails
! ERROR:  ExecEvalConstraintTest: Domain con constraint $1 failed
  insert into domcontest values (42);
  -- cleanup
  drop domain ddef1 restrict;
--- 263,269 ----
  insert into domcontest values (5);
  alter domain con drop constraint t;
  insert into domcontest values (-5); --fails
! ERROR:  ExecEvalConstraintTest: Domain con constraint  failed
  insert into domcontest values (42);
  -- cleanup
  drop domain ddef1 restrict;

======================================================================


-- 14:52:02 up 8 days, 18:52,  2 users,  load average: 3.69, 3.40, 2.57
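
Because the failure shows up in only a few runs out of fifty, counting failures over many runs is the practical way to compare snapshots. A minimal sketch in shell follows. The function name count_failures is hypothetical, and the sketch assumes the test command exits non-zero when any regression test fails.

```shell
#!/bin/sh
# count_failures RUNS COMMAND [ARGS...]
# Run COMMAND the given number of times and print how many runs failed
# (hypothetical helper, not part of the source tree).
count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Output is discarded; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, from a configured and built source tree:
#   count_failures 50 make check
```

Since the output is thrown away, a failing run leaves no diff behind in this form; anyone wanting the diffs would have to save them between iterations.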

Re: parallel regression test failure

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> That is a very good guess.  All the errors seem related to the parser.

No, I don't think bison's got anything to do with it.  AFAICS all the
reported failures look more like syscache-level problems.  I'm betting
on a locking issue.  It'll be easier to find once you guys home in on
the date we broke it.
        regards, tom lane


Re: parallel regression test failure

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I am seeing the following parallel regression test failures.  Any idea
> on the cause?

For the record, I believe this is explained by the bug I just fixed in
_bt_search().

The bug occurs only when one backend is trying to search a btree index
at the same time another backend is doing the first page split in that
index (that is, when the aboriginal root-and-leaf page gets split into
two leaf pages).  In the present form of the parallel regression tests,
pg_class_oid_index and pg_type_oid_index suffer that split during the
third group of parallel tests, which is why the failures were bunched
in constraints/triggers/vacuum.

My guess is that the reason different vintages of CVS show or don't show
the problem is that modifications of the test scripts have caused more
or fewer pg_class and pg_type entries to get created, possibly moving
the critical split point before or after that set of parallel tests.
If the split occurs during a sequential test step then we'd never see
a failure.  This may explain why we've not become aware of the bug till
now, even though it's certainly been there a long time.

We need to think about whether this bug is serious enough to justify a
quick 7.3.5 release.  I'm leaning to the idea that it is not, because
if it were, we'd have heard about it from the field before now.  In
pre-7.4 code there is only one instant in the lifespan of an index where
the bug could occur, and then only if the index is created empty.
        regards, tom lane


Re: parallel regression test failure

From
Bruce Momjian
Date:
Tom Lane wrote:
> We need to think about whether this bug is serious enough to justify a
> quick 7.3.5 release.  I'm leaning to the idea that it is not, because
> if it were, we'd have heard about it from the field before now.  In
> pre-7.4 code there is only one instant in the lifespan of an index where
> the bug could occur, and then only if the index is created empty.

Agreed, I don't think 7.3.5 is warranted, but it would have been nice to
get this in 7.3.4.  Let's keep our eyes open for maybe a 7.3.5 later.



Re: parallel regression test failure

From
Kurt Roeckx
Date:
On Fri, Jul 25, 2003 at 05:47:50PM -0400, Bruce Momjian wrote:
> I am seeing the following parallel regression test failures.  Any idea
> on the cause?

I think I saw about the same thing once, but I ran the test again
and it didn't show up anymore at all.  I'm not sure exactly what it
was, but it looked a bit similar to yours.


Kurt