Jason Tishler <Jason.Tishler@dothill.com> writes:
> I just figured out what error 10061 is -- it is WSAECONNREFUSED, Winsock's
> version of ECONNREFUSED. I just submitted a patch to Cygwin that maps
> getsockopt optval's from the Winsock versions to their corresponding
> errno values.
Ah so. Sounds good.
> If my Cygwin patch is accepted, I'll let the list know. At that time, I
> think that the fe-connect.c change should be backed out.
My feeling is that we should leave it in place for 7.1 in any case.
Once there's a shipping Cygwin version that maps the error number
correctly, we can back out the patch so that Cygwin is treated more
like other platforms.
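
For the archives, here is roughly the kind of translation I understand
Jason to be describing. This is only a sketch, not the actual Cygwin
patch: the function name and the SKETCH_ macros are made up for
illustration, and the Winsock numbers are written out as literals
(10061 being the one reported above) so that it compiles anywhere.

```c
#include <errno.h>

/* Winsock error numbers, spelled as literals so no Windows header is needed. */
#define SKETCH_WSAECONNRESET    10054
#define SKETCH_WSAETIMEDOUT     10060
#define SKETCH_WSAECONNREFUSED  10061

/*
 * Hypothetical helper: translate a Winsock error code, such as the value
 * returned by getsockopt(SO_ERROR), into the corresponding errno value.
 * Codes not listed here are passed through unchanged.
 */
int
sketch_map_winsock_error(int wsa_err)
{
    switch (wsa_err)
    {
        case SKETCH_WSAECONNRESET:
            return ECONNRESET;
        case SKETCH_WSAETIMEDOUT:
            return ETIMEDOUT;
        case SKETCH_WSAECONNREFUSED:
            return ECONNREFUSED;
        default:
            return wsa_err;
    }
}
```

With something like that underneath, client code could test for plain
ECONNREFUSED on Cygwin just as it does elsewhere.
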
> In digging some more through the MSDN, I found out the backlog limit
> on NT 4.0 Workstation and Server is 5 and 200, respectively.
This page only talks about NT; what of other flavors of Windows? Cygwin
runs on more than NT, doesn't it?
Interesting point here: a copy of Postgres compiled on NT WS would
presumably see SOMAXCONN = 5 in the system headers. If the executable
is then moved to NT Server, it would fail to take advantage of the
higher queue limit. Do we need to hardwire a hack to use the larger
value always on Windows?
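
To make the question concrete, the hack might amount to something like
the sketch below. The names are hypothetical, the 200 is just the NT
Server figure quoted above, and I am assuming that listen() quietly
clamps an oversized backlog on the smaller systems rather than failing,
which would need to be checked.

```c
/* Assumed ceiling: the NT 4.0 Server backlog limit quoted above. */
#define SKETCH_WIN_BACKLOG 200

/*
 * Hypothetical helper: pick the backlog to pass to listen().
 * header_somaxconn is the compile-time SOMAXCONN; on_windows is nonzero
 * when building for Windows.  On Windows we never go below the Server
 * limit, so a binary built on Workstation is not stuck at 5.
 */
int
sketch_listen_backlog(int header_somaxconn, int on_windows)
{
    if (on_windows && header_somaxconn < SKETCH_WIN_BACKLOG)
        return SKETCH_WIN_BACKLOG;
    return header_somaxconn;
}
```

A binary built against headers saying 5 would then ask for 200 on
Windows and leave it to the system to cut that back where necessary.
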
> When running the parallel_schedule, as many as 18 psql's are trying to
> connect to postmaster. Isn't it conceivable that more than 6 are trying
> to connect concurrently?
Yes (although that's still hypothesis, not the proven cause of failure).
I still suspect there's something else going on here, anyway. SOMAXCONN
is nominally 5 on quite a lot of Unixen, but we've only heard reports of
transient "make check" connect failures on Windows. Why is Windows so
much more prone to show this problem?
regards, tom lane