Thread: RE: Re: Cygwin PostgreSQL CVS Patch

RE: Re: Cygwin PostgreSQL CVS Patch

From
Horák Daniel
Date:
> Another issue you might be interested in is that of Unix domain sockets.
> I understand that they now exist in Cygwin, so you might want to refine
> this snippet in src/include/config.h[.in]:
>
> /*
>  * Define this if your operating system supports AF_UNIX family sockets.
>  */
> #if !defined(__CYGWIN__) && !defined(__QNX__) && !defined(__BEOS__)
> # define HAVE_UNIX_SOCKETS 1
> #endif

I tried AF_UNIX sockets with Cygwin a few months ago. The code compiled
OK but did not work. The special file in Cygwin that represents the
socket was somehow corrupted (there was no <!socket> string). But the
situation may be better with the latest Cygwin release.

        Dan
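
For illustration, one way the quoted config.h test could be refined is to
drop __CYGWIN__ from the exclusion list. This is only a sketch under the
assumption that current Cygwin releases handle AF_UNIX correctly; it is not
the patch discussed in this thread, and the added comment text is
hypothetical.

```c
/*
 * Define this if your operating system supports AF_UNIX family sockets.
 * Sketch only: __CYGWIN__ has been dropped from the exclusion list on the
 * assumption that current Cygwin releases handle AF_UNIX correctly.
 */
#if !defined(__QNX__) && !defined(__BEOS__)
# define HAVE_UNIX_SOCKETS 1
#endif
```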

Re: Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Dan,

On Mon, Jan 15, 2001 at 02:01:35PM +0100, Horák Daniel wrote:
> > Another issue you might be interested in is that of Unix domain sockets.
> > I understand that they now exist in Cygwin, so you might want to refine
> > this snippet in src/include/config.h[.in]:
> >
> > /*
> >  * Define this if your operating system supports AF_UNIX family sockets.
> >  */
> > #if !defined(__CYGWIN__) && !defined(__QNX__) && !defined(__BEOS__)
> > # define HAVE_UNIX_SOCKETS 1
> > #endif
>
> I tried AF_UNIX sockets with Cygwin a few months ago. The code compiled
> OK but did not work. The special file in Cygwin that represents the
> socket was somehow corrupted (there was no <!socket> string). But the
> situation may be better with the latest Cygwin release.

I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
seemed to work just fine -- at least all of the regression tests passed.
I will be submitting a patch shortly.

Jason

--
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com

Re: Cygwin PostgreSQL CVS Patch

From
Tom Lane
Date:
Jason Tishler <Jason.Tishler@dothill.com> writes:
> I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
> seemed to work just fine -- at least all of the regression tests passed.
> I will be submitting a patch shortly.

Please first check that the failure cases also behave reasonably ---
connecting to a nonexistent socket, no postmaster, etc.

            regards, tom lane

Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Tom,

On Mon, Jan 15, 2001 at 10:08:43AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
> > seemed to work just fine -- at least all of the regression tests passed.
> > I will be submitting a patch shortly.
>
> Please first check that the failure cases also behave reasonably ---
> connecting to a nonexistent socket, no postmaster, etc.

I've already tested the no postmaster case.  Please be more explicit
regarding "connecting to a nonexistent socket" and etc.  I'm afraid that
if I guess then I'll miss something.

Thanks,
Jason


Re: Cygwin PostgreSQL CVS Patch

From
Tom Lane
Date:
Jason Tishler <Jason.Tishler@dothill.com> writes:
> I've already tested the no postmaster case.  Please be more explicit
> regarding "connecting to a nonexistent socket" and etc.  I'm afraid that
> if I guess then I'll miss something.

The cases I was thinking of were (a) no socket file (normal case if no
postmaster) and (b) socket file present but no postmaster attached ---
you can get that by kill -9'ing the postmaster ...

            regards, tom lane
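
The two failure cases above can be reproduced without a postmaster. The
following is a minimal sketch, not code from PostgreSQL or libpq: the helper
names (probe_fill_addr, probe_unix_socket, make_stale_socket) are
hypothetical. It reports the errno that connect() returns for (a) a missing
socket file and (b) a socket file that no process is listening on. On Linux
these are typically ENOENT and ECONNREFUSED; other platforms may differ.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Build an AF_UNIX address for path; returns -1 if the path does not fit. */
static int probe_fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(addr != NULL && path != NULL);
    if (strlen(path) >= sizeof(addr->sun_path))
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Try to connect to a Unix-domain socket path.
 * Returns 0 on success, otherwise the errno of the failing call. */
static int probe_unix_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, err = 0;

    if (probe_fill_addr(&addr, path) < 0)
        return ENAMETOOLONG;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        err = errno;            /* save before close() can change it */
    close(fd);
    return err;
}

/* Leave a socket file behind with no listener, as kill -9 would:
 * bind() creates the file, and close() without unlink() leaves it. */
static int make_stale_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, err = 0;

    if (probe_fill_addr(&addr, path) < 0)
        return ENAMETOOLONG;
    unlink(path);               /* remove any leftover from a previous run */
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        err = errno;
    close(fd);
    return err;
}
```

Calling probe_unix_socket() on a path that does not exist exercises case (a);
calling make_stale_socket() first and then probing the same path exercises
case (b). A client that hangs instead of getting an error in case (b) would
point at the platform's socket layer.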

Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Tom,

On Mon, Jan 15, 2001 at 05:23:20PM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I've already tested the no postmaster case.  Please be more explicit
> > regarding "connecting to a nonexistent socket" and etc.  I'm afraid that
> > if I guess then I'll miss something.
>
> The cases I was thinking of were (a) no socket file (normal case if no
> postmaster) and

Case a behaves correctly.

> (b) socket file present but no postmaster attached ---
> you can get that by kill -9'ing the postmaster ...

Unfortunately, case b causes psql to hang.  Using gdb, I was able to
trace that psql hangs calling select() in pqWait() (i.e.,
src/interfaces/libpq/fe-misc.c line 739).

I'm pretty sure that this is a Cygwin bug.  I will try to formulate
a minimal test for submission to the Cygwin list but I'm not that
experienced with sockets.  Would anyone like to assist me with this
endeavor?

I will hold off submitting the patch until this is fixed.

Thanks,
Jason
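
Since the hang is reported inside select(), a general way for a client to
bound such a wait is to pass a timeout. The sketch below is not libpq's
pqWait(); wait_readable and demo_wait_readable are hypothetical names, and
the 50 ms timeout is an arbitrary choice for the demonstration.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set by select).
 * A fuller version would retry when select() is interrupted (EINTR). */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Demonstration on a connected AF_UNIX pair: the first wait times out
 * because nothing has been written; the second sees the pending byte.
 * Returns 1 if both waits behaved that way, 0 if not, -1 on setup error. */
static int demo_wait_readable(void)
{
    int sv[2];
    int before, after;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    before = wait_readable(sv[0], 50);
    if (write(sv[1], "x", 1) != 1)
    {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    after = wait_readable(sv[0], 50);
    close(sv[0]);
    close(sv[1]);
    return (before == 0 && after == 1) ? 1 : 0;
}
```

A timeout only limits how long the client waits; it does not explain why the
platform fails to report the dead peer, which is the part that would still
need a report to the Cygwin list.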


Re: Cygwin PostgreSQL CVS Patch

From
Tom Lane
Date:
Jason Tishler <Jason.Tishler@dothill.com> writes:
>> (b) socket file present but no postmaster attached ---
>> you can get that by kill -9'ing the postmaster ...

> Unfortunately, case b causes psql to hang.  Using gdb, I was able to
> trace that psql hangs calling select() in pqWait() (i.e.,
> src/interfaces/libpq/fe-misc.c line 739).

> I'm pretty sure that this is a Cygwin bug.  I will try to formulate
> a minimal test for submission to the Cygwin list but I'm not that
> experienced with sockets.  Would anyone like to assist me with this
> endeavor?

You should be able to do something like just creating an unconnected
socket file and then trying to cat(1) from it ...

> I will hold off submitting the patch until this is fixed.

That might be an overreaction --- this does sound like a Cygwin bug
that should be reported and fixed, but the scenario won't happen in
normal operations.  Furthermore, if we wait for a confirmed Cygwin
fix, we'll likely miss the 7.1 release.  I'd suggest going ahead and
enabling AF_UNIX socket support for Cygwin; worst case is that we
warn people to be wary of it in Cygwin versions < something-or-other.

            regards, tom lane

Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Tom,

On Tue, Jan 16, 2001 at 12:04:57AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> >> (b) socket file present but no postmaster attached ---
> >> you can get that by kill -9'ing the postmaster ...
>
> > Unfortunately, case b causes psql to hang.  Using gdb, I was able to
> > trace that psql hangs calling select() in pqWait() (i.e.,
> > src/interfaces/libpq/fe-misc.c line 739).
>
> > I'm pretty sure that this is a Cygwin bug.  I will try to formulate
> > a minimal test for submission to the Cygwin list but I'm not that
> > experienced with sockets.  Would anyone like to assist me with this
> > endeavor?
>
> You should be able to do something like just creating an unconnected
> socket file and then trying to cat(1) from it ...

I just tried the above on Linux (just to eliminate the Cygwin factor) and
I get the following whether or not postmaster is running:

    $ cat /tmp/.s.PGSQL.5432
    cat: /tmp/.s.PGSQL.5432: Invalid argument

What am I missing?

> > I will hold off submitting the patch until this is fixed.
>
> That might be an overreaction --- this does sound like a Cygwin bug
> that should be reported and fixed, but the scenario won't happen in
> normal operations.  Furthermore, if we wait for a confirmed Cygwin
> fix, we'll likely miss the 7.1 release.  I'd suggest going ahead and
> enabling AF_UNIX socket support for Cygwin; worst case is that we
> warn people to be wary of it in Cygwin versions < something-or-other.

The patch has been submitted.  Unfortunately, I just found two more
issues:

1. postmaster will start up without complaining about an already existing
unconnected socket file.  It should exit with the following error:

    $ postmaster
    FATAL: StreamServerPort: bind() failed: Address already in use
            Is another postmaster already running on that port?
            If not, remove socket node (/tmp/.s.PGSQL.5432) and retry.
    /usr/local/pgsql/bin/postmaster: cannot create UNIX stream port

2. A second postmaster will exit as appropriate but will not display the
above error message.  Instead it displays the following:

    $ postmaster
    Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
    Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?
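
Issue 1 above depends on bind() refusing a path where a socket file already
exists. The following sketch shows that behavior in isolation; it is not the
postmaster's StreamServerPort code, and bind_unix_listener is a hypothetical
helper. On Linux a second bind() to the same path is expected to fail with
EADDRINUSE, even after the first listener has closed its descriptor without
unlinking the file.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a listening AF_UNIX stream socket bound to path.
 * On success returns 0 and stores the descriptor in *fd_out.
 * On failure returns the errno of the failing call and sets *fd_out to -1. */
static int bind_unix_listener(const char *path, int *fd_out)
{
    struct sockaddr_un addr;
    int fd, err;

    assert(path != NULL && fd_out != NULL);
    *fd_out = -1;
    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0)
    {
        err = errno;            /* save before close() can change it */
        close(fd);
        return err;
    }
    *fd_out = fd;
    return 0;
}
```

If the same sequence succeeds twice on a given platform, a server relying on
bind() alone cannot detect a leftover socket file there.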

With all of these issues, does it still make sense to enable UNIX domain
sockets for Cygwin?

Jason


Re: Cygwin PostgreSQL CVS Patch

From
Tom Lane
Date:
Jason Tishler <Jason.Tishler@dothill.com> writes:
>> You should be able to do something like just creating an unconnected
>> socket file and then trying to cat(1) from it ...

> I just tried the above on Linux (just to eliminate the Cygwin factor) and
> I get the following whether or not postmaster is running:

>     $ cat /tmp/.s.PGSQL.5432
>     cat: /tmp/.s.PGSQL.5432: Invalid argument

> What am I missing?

Nothing ... I hadn't actually tried that, but now that I do, I get

$ cat /tmp/.s.PGSQL.5432
cat: Cannot open /tmp/.s.PGSQL.5432: Operation not supported

so apparently you can't open a socket file except by using bind()
and so forth.  Sorry for the misinformation.

> 2. A second postmaster will exit as appropriate but will not display the
> above error message.  Instead it displays the following:

>     $ postmaster
>     Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
>     Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?

That does not look like a bug; the data directory lockfile is created
first.  To test this properly, you'll need two data directories set up
so that you can start two postmasters (use the -D switch to direct each
one to the right place; you'll also need -D for initdb).  They should
start if given different port numbers (-p) or fail if the same port.
Might want to try different combinations of -i and not -i while you are
at it.

            regards, tom lane
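
To illustrate why a lock-file check that runs first reports before any
socket error, here is a minimal sketch of an exclusive PID lock file. It is
not PostgreSQL's implementation; acquire_pid_lockfile is a hypothetical name,
and the file format (the PID on one line) is an assumption made for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Create path exclusively and record this process's PID in it.
 * Returns 0 on success, EEXIST if the file is already there,
 * or another errno value on failure. */
static int acquire_pid_lockfile(const char *path)
{
    char buf[32];
    int fd, len, err;
    ssize_t n;

    assert(path != NULL);
    /* O_EXCL makes creation fail if the file already exists. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno;

    len = snprintf(buf, sizeof(buf), "%ld\n", (long) getpid());
    n = write(fd, buf, (size_t) len);
    if (n != (ssize_t) len)
    {
        err = (n < 0) ? errno : EIO;
        close(fd);
        unlink(path);           /* do not leave a partial lock file behind */
        return err;
    }
    close(fd);
    return 0;
}
```

A server that takes this lock before binding its socket exits on EEXIST
without ever reaching bind(), which matches the ordering described above.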

Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Tom,

On Tue, Jan 16, 2001 at 10:57:29AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > 2. A second postmaster will exit as appropriate but will not display the
> > above error message.  Instead it displays the following:
>
> >     $ postmaster
> >     Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
> >     Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?
>
> That does not look like a bug;

My point was that the error message generated by the second postmaster
is *different* than the one generated for the same test case on a real
UNIX (i.e., Linux) platform.

Jason


Re: Cygwin PostgreSQL CVS Patch

From
Tom Lane
Date:
Jason Tishler <Jason.Tishler@dothill.com> writes:
>> That does not look like a bug;

> My point was that the error message generated by the second postmaster
> is *different* than the one generated for the same test case on a real
> UNIX (i.e., Linux) platform.

It is?  I get exactly that message (modulo the obvious path and PID
differences) when I try it.  Are you comparing to 7.0.* on Linux
perhaps?  IIRC, 7.0 makes these tests in a different order than 7.1
does.

            regards, tom lane

Re: Cygwin PostgreSQL CVS Patch

From
Jason Tishler
Date:
Tom,

On Wed, Jan 17, 2001 at 11:19:10AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> >> That does not look like a bug;
>
> > My point was that the error message generated by the second postmaster
> > is *different* than the one generated for the same test case on a real
> > UNIX (i.e., Linux) platform.
>
> It is?  I get exactly that message (modulo the obvious path and PID
> differences) when I try it.  Are you comparing to 7.0.* on Linux
> perhaps?  IIRC, 7.0 makes these tests in a different order than 7.1
> does.

Bingo!  I *am* (still) using 7.0.3 on Linux.  I assumed that this aspect
would not have changed between 7.0.3 and 7.1.  There I go getting myself
in trouble again by assuming...

Thanks,
Jason
