Intermittent "make check" failures on hyena - Mailing list pgsql-hackers

From Tom Lane
Subject Intermittent "make check" failures on hyena
Date
Msg-id 24155.1154877339@sss.pgh.pa.us
List pgsql-hackers
I'm noticing that buildfarm member hyena sometimes fails the parallel
regression tests, for instance
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=hyena&dt=2006-07-19%2009:20:00

The symptom is always that one of the tests fails entirely because
psql couldn't connect:

psql: could not connect to server: Connection refused
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.55678"?
 

It's a different test failing in each occurrence.  Sometimes there are
ensuing failures in subsequent tests that expect the side-effects
of the one that failed, but there's clearly a common cause here.

AFAIK it is not possible for Postgres itself to cause a "connection
refused" failure --- that's a kernel-level errno.  So what's going on
here?  The only idea that comes to mind is that this version of Solaris
has some very low limit on SOMAXCONN, and when the timing is just so
it's bouncing connection requests because several of them arrive faster
than the postmaster can fork off children.  Googling suggests that there
are versions of Solaris with SOMAXCONN as low as 5 :-( ... but other
pages say that the default is higher, so this theory might be wrong.
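
For anyone who wants to poke at the backlog theory on the box itself, here is a rough Python sketch. It is not part of the regression suite, and the function name probe_backlog and the socket file name are made up for illustration. It listens on a Unix-domain socket with a small backlog, never calls accept(), and counts how many connection attempts succeed, fail with "connection refused", or fail some other way. What happens once the queue fills is platform-dependent (some kernels refuse, others make the client wait or return EAGAIN), so the counts only describe the local kernel and prove nothing about Solaris by themselves.

```python
import os
import socket
import tempfile


def probe_backlog(backlog=5, attempts=20):
    """Connect `attempts` times to a listener that never calls accept().

    Returns a dict counting how each attempt ended. The names here are
    hypothetical; nothing in this sketch comes from the PostgreSQL tree.
    """
    results = {"connected": 0, "refused": 0, "other_error": 0}
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "probe.sock")  # made-up socket file name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    clients = []
    try:
        server.bind(path)
        # The kernel may silently cap this value at its own maximum.
        server.listen(backlog)
        for _ in range(attempts):
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            # A short timeout keeps the probe from hanging on kernels
            # that block the client instead of refusing it.
            client.settimeout(0.2)
            clients.append(client)  # keep it open so it stays queued
            try:
                client.connect(path)
                results["connected"] += 1
            except ConnectionRefusedError:
                results["refused"] += 1
            except OSError:
                # Timeouts, EAGAIN, and anything else land here.
                results["other_error"] += 1
    finally:
        for client in clients:
            client.close()
        server.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)
    return results


if __name__ == "__main__":
    print(probe_backlog(backlog=5, attempts=20))
```

If the "refused" count is nonzero with a backlog of 5 and drops to zero with a larger one, that would at least be consistent with the theory above.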

What is SOMAXCONN set to on that box, anyway?  If it's tiny, I suggest
you increase SOMAXCONN to something saner, or if you can't, run "make
check" with MAX_CONNECTIONS=5 added to the make command.  Does the
buildfarm script have provisions for site-local settings of this
parameter?
        regards, tom lane

