Intermittent stats test failures on buildfarm - Mailing list pgsql-hackers

From Tom Lane
Subject Intermittent stats test failures on buildfarm
Date
Msg-id 17253.1125376253@sss.pgh.pa.us
Responses Re: Intermittent stats test failures on buildfarm  (Kris Jurka <books@ejurka.com>)
List pgsql-hackers
I just spent a tedious hour digging through the buildfarm results
to see what I could learn about the intermittent failures we're seeing
in the stats regression test, such as here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=ferret&dt=2005-05-29%2018:25:09
This is seen in both Check and InstallCheck steps.  A variant pathology
is seen here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=gerbil&dt=2005-07-22%2007:58:01
Notice that only the heap stats columns are wrong in this case, not the
index stats.  I think that this variant behavior may have been fixed by
this patch:

2005-07-23 20:33  tgl
* src/backend/postmaster/pgstat.c: Fix some failures to initialize
table entries induced by recent autovacuum integration. Not clear
this explains recent stats problems, but it's definitely wrong.

but it's not certain since nobody traced through the code to exhibit
why those uninitialized table entries would have led to this particular
visible symptom.  But with no occurrences of that behavior since the
patch went in, I suspect it's fixed.

What we are left with turns out to be multiple occurrences of the first
pathology on exactly three buildfarm members:
ferret        Cygwin
kudu          Solaris 9, x86
dragonfly     Solaris 9, x86

There are no occurrences of the failure on the native-Windows machines,
nor on buzzard (Solaris 10, SPARC), nor on gerbil (Solaris 9, SPARC)
(though gerbil has one old occurrence of the second pathology, so maybe
that observation should be taken with a grain of salt).  And none
whatever on any other buildfarm member.

The same three machines are showing the failure in the 8.0 branch, too,
so it's not a recently-introduced issue.

And one thing more: kudu and dragonfly are actually the same machine,
same OS, different compilers.

So what to make of this?  Dunno, but it is clearly a very
platform-specific behavior.  Anyone see a connection between Cygwin
and Solaris?
        regards, tom lane

