Re: Cygwin PostgreSQL Regression Test Problems (Revisited) - Mailing list pgsql-ports

From Jason Tishler
Subject Re: Cygwin PostgreSQL Regression Test Problems (Revisited)
Msg-id 20010331220722.A2591@dothill.com
In response to Re: Cygwin PostgreSQL Regression Test Problems (Revisited)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Cygwin PostgreSQL Regression Test Problems (Revisited)  (Jason Tishler <Jason.Tishler@dothill.com>)
RE: Cygwin PostgreSQL Regression Test Problems (Revisited)  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
Re: Cygwin PostgreSQL Regression Test Problems (Revisited)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-ports
Tom,

On Sat, Mar 31, 2001 at 05:45:45PM -0500, Tom Lane wrote:
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Oh I found the same description yesterday though I've had no time
> > to test it. If your patch resolves *hang*, it may be the right solution
> > at least for cygwin port.
>
> It seems clear that it's a good idea for fe-misc.c to check the
> exceptfds bit as well as read/write ready --- I'm surprised we have not
> seen problems associated with that on other platforms.  I think it
> should check exceptfds all the time, regardless of whether we are
> waiting to read or to write.

I'm glad that you agree.  Please post to the list when the change is in
CVS and I will test that this solves the Cygwin regression test (i.e.,
psql) hangs.

BTW, this will also solve the problem of Cygwin psql hanging when no
postmaster is running which I stumbled across when enabling Unix domain
socket support.  Previously, I thought that it was a Cygwin problem but
now I know that it is caused by the same pqWait() problem.

> I'm inclined to also accept Jason's change to do the connect() in
> blocking mode on Cygwin.

Actually, the blocking connect() change for Cygwin is obviated by the
pqWait() fix.  So I no longer recommend making the blocking connect()
change for Cygwin, unless you make it for the other Unixes too.

> If we do both of those things, have we
> resolved the issue on Cygwin, or is there still a problem?

If you make both of these changes, then the pqWait() fix will never be
triggered under Cygwin.  When I tested my hacky patch to pqWait(), I had
to back out my blocking connect() patch in order for the pqWait() changes
to take effect.  The regression test still did not hang -- although I
continued to have spurious failures due to connection refused conditions.

On Sat, Mar 31, 2001 at 10:15:08AM +0900, Hiroshi Inoue wrote:
> BTW I've never passed the parallel regression test without a hang or
> refusal (with your previous patch applied) under my cygwin
> environment. I added one more connect() call after the refusal and
> passed all regression tests successfully. Hmm, it may be a more
> preferable solution.

I'm wondering whether it makes sense to add a simple connection retry
policy as suggested above by Hiroshi?  Otherwise, make check will
generate false negatives due to connection refused conditions.
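
For illustration, a simple retry policy of the kind Hiroshi describes
could look like the sketch below.  This is not a patch against libpq;
the names connect_with_retry(), connect_attempt_fn, and flaky_attempt()
are hypothetical, and the attempt callback stands in for whatever code
creates the socket and calls connect().  Only ECONNREFUSED is retried;
any other error is returned immediately.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* One connection attempt: returns a descriptor >= 0, or -1 with errno set. */
typedef int (*connect_attempt_fn)(void *ctx);

/*
 * Hypothetical helper: try up to max_attempts times, sleeping delay_ms
 * between attempts, but only when the failure was ECONNREFUSED.
 */
int connect_with_retry(connect_attempt_fn attempt, void *ctx,
                       int max_attempts, long delay_ms)
{
    for (int i = 0; i < max_attempts; i++) {
        int fd = attempt(ctx);

        if (fd >= 0)
            return fd;              /* connected */
        if (errno != ECONNREFUSED)
            return -1;              /* a different error: do not retry */

        if (i + 1 < max_attempts && delay_ms > 0) {
            struct timespec ts;

            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        }
    }
    errno = ECONNREFUSED;           /* every attempt was refused */
    return -1;
}

/*
 * Stand-in for a real connect() call, for demonstration only: refuses
 * the first `refusals` calls, then "succeeds" with descriptor 3.
 */
struct flaky_server {
    int refusals;
    int calls;
};

int flaky_attempt(void *ctx)
{
    struct flaky_server *s = ctx;

    s->calls++;
    if (s->calls <= s->refusals) {
        errno = ECONNREFUSED;
        return -1;
    }
    return 3;
}
```

Hiroshi's experiment corresponds to max_attempts = 2.  Whether a delay
between attempts is needed at all is an open question; the sketch only
makes it configurable.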

If it is considered too late in the release cycle for such a change,
then I offer the following suggestions:

1. Change make check to use the serial_schedule or at least allow it to
be easily selected via a make variable (e.g., make schedule=serial_schedule
check).

2. Change the backlog parameter to listen() in src/backend/libpq/pqcomm.c
to a number that will "ensure" that the parallel_schedule version of the
regression test does not generate connection refused conditions.  Note
that I'm not even sure this will really work on all (or any) platforms.
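
To show what suggestion 2 amounts to, here is a small standalone sketch
of a listener whose backlog is requested explicitly.  It is not the
pqcomm.c code; clamp_backlog() and listen_with_backlog() are
hypothetical names, and clamping to SOMAXCONN is an assumption about
what is sensible, since the kernel is free to reduce (or otherwise
reinterpret) the value it is given.  That is also why a larger number
cannot really "ensure" anything.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Keep the requested backlog within 1..SOMAXCONN. */
int clamp_backlog(int wanted)
{
    if (wanted < 1)
        return 1;
    if (wanted > SOMAXCONN)
        return SOMAXCONN;
    return wanted;
}

/*
 * Hypothetical helper: open a TCP listener on the loopback interface
 * with the requested backlog.  Port 0 lets the kernel pick a free port.
 * Returns the listening descriptor, or -1 on error.
 */
int listen_with_backlog(unsigned short port, int wanted_backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, clamp_backlog(wanted_backlog)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```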

Thanks,
Jason

--
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com
