Re: Intermittent stats test failures on buildfarm - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Intermittent stats test failures on buildfarm
Date
Msg-id 21191.1125409811@sss.pgh.pa.us
In response to Re: Intermittent stats test failures on buildfarm  (Kris Jurka <books@ejurka.com>)
List pgsql-hackers
Kris Jurka <books@ejurka.com> writes:
> On Tue, 30 Aug 2005, Tom Lane wrote:
>> What we are left with turns out to be multiple occurrences of the first
>> pathology on exactly three buildfarm members:
>> 
>> ferret       Cygwin
>> kudu         Solaris 9, x86
>> dragonfly    Solaris 9, x86
>> 
>> So what to make of this?  Dunno, but it is clearly a very
>> platform-specific behavior.  Anyone see a connection between Cygwin
>> and Solaris?

> One thing to note about kudu and dragonfly is that they are running under 
> vmware.  This, combined with cygwin's reputation, makes me suspect that 
> the connection is that they are both struggling under load.  Although 
> canary (NetBSD 1.6 x86) is set up in the same fashion and has shown no such
> failures.

Hmm.  One pretty obvious explanation of the failure is simply that the
machine is so loaded that the stats collector doesn't get to run for a
few seconds.  I had dismissed this idea because I figured the buildfarm
machine owners would schedule the tests to run at relatively low-load
times of day ... but maybe that's not true on these two machines?

We could try increasing the delay in the stats test, say from two
seconds to five.  If it is just a matter of load, that should result
in a very large drop in the frequency of the failure.
        regards, tom lane
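
A rough illustration of the reasoning above, not the actual regression test:
if the failure is only a matter of the stats collector lagging under load,
then a two-second wait misses a collector that needs three seconds, while a
five-second wait catches it.  Everything here is hypothetical (the names
wait_for, FakeClock, and stats_visible, and the three-second lag), and the
sketch polls with a timeout rather than sleeping for a fixed time, so it is a
variation on the proposal, with the timeout standing in for the delay.

```python
import time


def wait_for(condition, timeout, interval=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true or `timeout` seconds have passed.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


class FakeClock:
    """Simulated clock so the example runs instantly and deterministically."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def run_with_timeout(timeout, collector_lag=3.0):
    """Assumed scenario: stats become visible `collector_lag` seconds in."""
    fake = FakeClock()

    def stats_visible():
        return fake.now >= collector_lag

    return wait_for(stats_visible, timeout,
                    clock=fake.time, sleep=fake.sleep)


two_second_result = run_with_timeout(2.0)   # False: gave up before the lag
five_second_result = run_with_timeout(5.0)  # True: waited long enough
```

With a polling wait like this, a lightly loaded machine returns as soon as the
stats show up, so raising the timeout costs time only where the collector is
actually slow.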

