Re: Intermittent "make check" failures on hyena - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Intermittent "make check" failures on hyena
Date
Msg-id 44D6153F.5000607@dunslane.net
In response to Intermittent "make check" failures on hyena  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Intermittent "make check" failures on hyena
List pgsql-hackers

Tom Lane wrote:

>I'm noticing that buildfarm member hyena sometimes fails the parallel
>regression tests, for instance
>http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=hyena&dt=2006-07-19%2009:20:00
>
>The symptom is always that one of the tests fails entirely because
>psql couldn't connect:
>
>psql: could not connect to server: Connection refused
>    Is the server running locally and accepting
>    connections on Unix domain socket "/tmp/.s.PGSQL.55678"?
>
>It's a different test failing in each occurrence.  Sometimes there are
>ensuing failures in subsequent tests that expect the side-effects
>of the one that failed, but there's clearly a common cause here.
>
>AFAIK it is not possible for Postgres itself to cause a "connection
>refused" failure --- that's a kernel-level errno.  So what's going on
>here?  The only idea that comes to mind is that this version of Solaris
>has some very low limit on SOMAXCONN, and when the timing is just so
>it's bouncing connection requests because several of them arrive faster
>than the postmaster can fork off children.  Googling suggests that there
>are versions of Solaris with SOMAXCONN as low as 5 :-( ... but other
>pages say that the default is higher, so this theory might be wrong.
>
>What is SOMAXCONN set to on that box, anyway?  If it's tiny, I suggest
>you increase SOMAXCONN to something saner, or if you can't, run "make
>check" with MAX_CONNECTIONS=5 added to the make command.  Does the
>buildfarm script have provisions for site-local settings of this
>parameter?
>


Yes, it sure does.

This is the box that Sun donated, btw.

I get: ndd /dev/tcp tcp_conn_req_max_q => 128

Is that the Solaris equivalent of SOMAXCONN? That's low, maybe, but not 
impossibly low.
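
For comparison, here is a minimal sketch (Python, not PostgreSQL code) of where a backlog value enters the picture: it is the argument to listen(), and the kernel is free to cap it. The helper name open_listener and the socket file name are made up for illustration. socket.SOMAXCONN is only the constant from the system headers that Python was built against, so it may not match a runtime tunable such as the ndd value above.

```python
import os
import socket
import tempfile


def open_listener(path: str, backlog: int) -> socket.socket:
    """Bind a Unix-domain stream socket at `path` and start listening."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    # `backlog` is only a request; the kernel may silently cap it.
    srv.listen(backlog)
    return srv


# Header constant seen by this Python build; the runtime kernel limit can differ.
print("SOMAXCONN (compile-time):", socket.SOMAXCONN)

# Hypothetical socket path in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    srv = open_listener(os.path.join(tmp, "demo.sock"), socket.SOMAXCONN)
    srv.close()
```

Whether a full queue shows up as "Connection refused" or as a blocked connect() varies by platform, so this sketch does not try to reproduce the failure itself.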

I don't have root on the box, though. For now I have set MAX_CONNECTIONS 
to 8, to provide a modest limit on parallelism. I will see if I can 
coordinate with Robert and Josh to increase the OS limits.
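
To illustrate what a limit of 8 means for a larger parallel group, here is a rough sketch in Python. It is not the regression driver's actual code, and the test names are placeholders; it only shows the idea of never starting more than MAX_CONNECTIONS tests at once by working through the group in slices.

```python
from typing import Iterator, List


def batches(tests: List[str], max_connections: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `tests`, each at most `max_connections` long."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    for start in range(0, len(tests), max_connections):
        yield tests[start:start + max_connections]


# Hypothetical parallel group of ten tests; the names are placeholders.
group = [f"test_{i:02d}" for i in range(10)]
for batch in batches(group, 8):
    print(len(batch), batch)
```

With ten tests and a limit of 8, the group is split into one slice of eight and one of two, so at most eight connection attempts arrive together.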

Thanks for the diagnosis.


cheers

andrew

