Thread: RE: Re: Cygwin PostgreSQL CVS Patch
> Another issue you might be interested in is that of Unix domain
> sockets. I understand that they now exist in Cygwin, so you might
> want to refine this snippet in src/include/config.h[.in]:
>
> /*
>  * Define this if your operating system supports AF_UNIX family sockets.
>  */
> #if !defined(__CYGWIN__) && !defined(__QNX__) && !defined(__BEOS__)
> # define HAVE_UNIX_SOCKETS 1
> #endif

I tried AF_UNIX sockets with Cygwin a few months ago. The code compiled
OK but did not work. The special file that Cygwin uses to represent the
socket was somehow corrupted (it did not contain the <!socket> string).
The situation may be better with the latest Cygwin release.

Dan

Dan,

On Mon, Jan 15, 2001 at 02:01:35PM +0100, Horák Daniel wrote:
> > Another issue you might be interested in is that of Unix domain
> > sockets. I understand that they now exist in Cygwin, so you might
> > want to refine this snippet in src/include/config.h[.in]:
> >
> > /*
> >  * Define this if your operating system supports AF_UNIX family sockets.
> >  */
> > #if !defined(__CYGWIN__) && !defined(__QNX__) && !defined(__BEOS__)
> > # define HAVE_UNIX_SOCKETS 1
> > #endif
>
> I tried AF_UNIX sockets with Cygwin a few months ago. The code compiled
> OK but did not work. The special file that Cygwin uses to represent the
> socket was somehow corrupted (it did not contain the <!socket> string).
> The situation may be better with the latest Cygwin release.

I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
seemed to work just fine -- at least all of the regression tests passed.
I will be submitting a patch shortly.

Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com

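For readers following along, the refinement being discussed could look roughly like the fragment below. This is only a sketch under the assumption that the Cygwin exclusion is simply dropped from the condition; the patch actually submitted to the list is not shown in this thread and may differ. The function have_unix_sockets() is a hypothetical helper added purely so the result of the preprocessor test can be inspected.

```c
/*
 * Define this if your operating system supports AF_UNIX family sockets.
 *
 * Assumption: the __CYGWIN__ test is removed and the other two
 * exclusions are left alone. The real patch may differ.
 */
#if !defined(__QNX__) && !defined(__BEOS__)
# define HAVE_UNIX_SOCKETS 1
#endif

/* Hypothetical helper: reports whether the macro ended up defined. */
int have_unix_sockets(void)
{
#ifdef HAVE_UNIX_SOCKETS
    return 1;
#else
    return 0;
#endif
}
```

With this condition, a Cygwin build would take the same AF_UNIX code paths as any other Unix-like platform, which is why the failure cases raised later in the thread matter.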
Jason Tishler <Jason.Tishler@dothill.com> writes:
> I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
> seemed to work just fine -- at least all of the regression tests passed.
> I will be submitting a patch shortly.

Please first check that the failure cases also behave reasonably ---
connecting to a nonexistent socket, no postmaster, etc.

regards, tom lane

Tom,

On Mon, Jan 15, 2001 at 10:08:43AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I just tried AF_UNIX sockets with Cygwin 1.1.7 last night and everything
> > seemed to work just fine -- at least all of the regression tests passed.
> > I will be submitting a patch shortly.
>
> Please first check that the failure cases also behave reasonably ---
> connecting to a nonexistent socket, no postmaster, etc.

I've already tested the no postmaster case. Please be more explicit
regarding "connecting to a nonexistent socket" and "etc." I'm afraid that
if I guess then I'll miss something.

Thanks,
Jason

Jason Tishler <Jason.Tishler@dothill.com> writes:
> I've already tested the no postmaster case. Please be more explicit
> regarding "connecting to a nonexistent socket" and "etc." I'm afraid that
> if I guess then I'll miss something.

The cases I was thinking of were (a) no socket file (normal case if no
postmaster) and (b) socket file present but no postmaster attached ---
you can get that by kill -9'ing the postmaster ...

regards, tom lane

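The two cases above can be exercised outside of PostgreSQL with a few lines of socket code. The following is a minimal sketch, not code from libpq: try_unix_connect() and make_stale_socket() are hypothetical helper names, and the errno values mentioned are those seen on Linux (ENOENT for case (a), ECONNREFUSED for case (b)); other platforms may report something else.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to connect to the Unix-domain socket at `path`.
 * Returns 0 on success, otherwise the errno of the call that failed.
 */
int try_unix_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd, err = 0;

    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* Save errno before close() has a chance to overwrite it. */
    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        err = errno;

    close(fd);
    return err;
}

/*
 * Hypothetical helper: leave behind a socket file with nothing attached,
 * imitating what a kill -9'd postmaster leaves in /tmp.
 * Returns 0 on success, otherwise errno.
 */
int make_stale_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, err = 0;

    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);               /* remove any leftover from a previous run */

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        err = errno;

    close(fd);                  /* the socket file stays on disk */
    return err;
}
```

Calling try_unix_connect() on a path that does not exist covers case (a); calling it on a path prepared with make_stale_socket() covers case (b). In both cases a well-behaved platform returns promptly with an error rather than blocking.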
Tom,

On Mon, Jan 15, 2001 at 05:23:20PM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I've already tested the no postmaster case. Please be more explicit
> > regarding "connecting to a nonexistent socket" and "etc." I'm afraid that
> > if I guess then I'll miss something.
>
> The cases I was thinking of were (a) no socket file (normal case if no
> postmaster) and

Case a behaves correctly.

> (b) socket file present but no postmaster attached ---
> you can get that by kill -9'ing the postmaster ...

Unfortunately, case b causes psql to hang. Using gdb, I was able to
trace that psql hangs calling select() in pqWait() (i.e.,
src/interfaces/libpq/fe-misc.c line 739).

I'm pretty sure that this is a Cygwin bug. I will try to formulate
a minimal test for submission to the Cygwin list but I'm not that
experienced with sockets. Would anyone like to assist me with this
endeavor?

I will hold off submitting the patch until this is fixed.

Thanks,
Jason

Jason Tishler <Jason.Tishler@dothill.com> writes:
>> (b) socket file present but no postmaster attached ---
>> you can get that by kill -9'ing the postmaster ...

> Unfortunately, case b causes psql to hang. Using gdb, I was able to
> trace that psql hangs calling select() in pqWait() (i.e.,
> src/interfaces/libpq/fe-misc.c line 739).

> I'm pretty sure that this is a Cygwin bug. I will try to formulate
> a minimal test for submission to the Cygwin list but I'm not that
> experienced with sockets. Would anyone like to assist me with this
> endeavor?

You should be able to do something like just creating an unconnected
socket file and then trying to cat(1) from it ...

> I will hold off submitting the patch until this is fixed.

That might be an overreaction --- this does sound like a Cygwin bug
that should be reported and fixed, but the scenario won't happen in
normal operations. Furthermore, if we wait for a confirmed Cygwin
fix, we'll likely miss the 7.1 release. I'd suggest going ahead and
enabling AF_UNIX socket support for Cygwin; worst case is that we
warn people to be wary of it in Cygwin versions < something-or-other.

regards, tom lane

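A standalone test for the hang could also bound the wait with a timeout so that the test program itself never hangs. The sketch below is one possible shape for such a probe. It is not taken from libpq and does not reproduce pqWait(); probe_unix_socket() is a hypothetical name, and the choice of a non-blocking connect() followed by select() is an assumption made only to get a call sequence that ends in select().

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical probe: non-blocking connect to `path`, then wait in
 * select() for at most `timeout_sec` seconds.
 * Returns 0 if connected, ETIMEDOUT if select() never reported the
 * socket writable, otherwise the errno describing the failure.
 */
int probe_unix_socket(const char *path, int timeout_sec)
{
    struct sockaddr_un addr;
    struct timeval tv;
    fd_set wfds;
    socklen_t len;
    int fd, flags, rc, err = 0;

    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
    {
        err = errno;
        close(fd);
        return err;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
    {
        err = errno;
        if (err == EINPROGRESS)
        {
            /* Connection still pending: wait, but never forever. */
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            tv.tv_sec = timeout_sec;
            tv.tv_usec = 0;

            rc = select(fd + 1, NULL, &wfds, NULL, &tv);
            if (rc == 0)
                err = ETIMEDOUT;
            else if (rc < 0)
                err = errno;
            else
            {
                /* Writable: fetch the final outcome of the connect. */
                len = sizeof(err);
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
                    err = errno;
            }
        }
    }

    close(fd);
    return err;
}
```

On Linux, pointing this probe at a socket file with no listener returns ECONNREFUSED straight from connect(). A result of ETIMEDOUT on another platform would be a compact way to show the kind of hang described above.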
Tom,

On Tue, Jan 16, 2001 at 12:04:57AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> >> (b) socket file present but no postmaster attached ---
> >> you can get that by kill -9'ing the postmaster ...
>
> > Unfortunately, case b causes psql to hang. Using gdb, I was able to
> > trace that psql hangs calling select() in pqWait() (i.e.,
> > src/interfaces/libpq/fe-misc.c line 739).
>
> > I'm pretty sure that this is a Cygwin bug. I will try to formulate
> > a minimal test for submission to the Cygwin list but I'm not that
> > experienced with sockets. Would anyone like to assist me with this
> > endeavor?
>
> You should be able to do something like just creating an unconnected
> socket file and then trying to cat(1) from it ...

I just tried the above on Linux (just to eliminate the Cygwin factor) and
I get the following whether or not postmaster is running:

    $ cat /tmp/.s.PGSQL.5432
    cat: /tmp/.s.PGSQL.5432: Invalid argument

What am I missing?

> > I will hold off submitting the patch until this is fixed.
>
> That might be an overreaction --- this does sound like a Cygwin bug
> that should be reported and fixed, but the scenario won't happen in
> normal operations. Furthermore, if we wait for a confirmed Cygwin
> fix, we'll likely miss the 7.1 release. I'd suggest going ahead and
> enabling AF_UNIX socket support for Cygwin; worst case is that we
> warn people to be wary of it in Cygwin versions < something-or-other.

The patch has been submitted. Unfortunately, I just found two more issues:

1. postmaster will start up without complaining about an already
   existing unconnected socket file. It should exit with the following
   error:

    $ postmaster
    FATAL: StreamServerPort: bind() failed: Address already in use
            Is another postmaster already running on that port?
            If not, remove socket node (/tmp/.s.PGSQL.5432) and retry.
    /usr/local/pgsql/bin/postmaster: cannot create UNIX stream port

2. A second postmaster will exit as appropriate but will not display the
   above error message. Instead it displays the following:

    $ postmaster
    Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
    Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?

With all of these issues, does it still make sense to enable UNIX domain
sockets for Cygwin?

Jason

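The "Address already in use" text in the first error message is the standard description of EADDRINUSE, which bind() returns on Unix systems when the socket file already exists, whether or not any process is still attached to it. The sketch below shows that behaviour in isolation; bind_unix_socket() is a hypothetical helper and not the postmaster's StreamServerPort() code.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_UNIX stream socket bound to `path`.
 * Returns the descriptor, or -1 with errno set. On Linux errno is
 * EADDRINUSE when the socket file is already present.
 */
int bind_unix_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    if (strlen(path) >= sizeof(addr.sun_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
    {
        saved = errno;          /* keep bind()'s errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Binding to a fresh path, closing the descriptor without unlinking the file, and binding again should fail the second time. A platform on which the second bind() succeeds would explain a postmaster that starts up without complaining.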
Jason Tishler <Jason.Tishler@dothill.com> writes:
>> You should be able to do something like just creating an unconnected
>> socket file and then trying to cat(1) from it ...

> I just tried the above on Linux (just to eliminate the Cygwin factor) and
> I get the following whether or not postmaster is running:

>     $ cat /tmp/.s.PGSQL.5432
>     cat: /tmp/.s.PGSQL.5432: Invalid argument

> What am I missing?

Nothing ... I hadn't actually tried that, but now that I do, I get

    $ cat /tmp/.s.PGSQL.5432
    cat: Cannot open /tmp/.s.PGSQL.5432: Operation not supported

so apparently you can't open a socket file except by using bind() and
so forth. Sorry for the misinformation.

> 2. A second postmaster will exit as appropriate but will not display the
>    above error message. Instead it displays the following:

>     $ postmaster
>     Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
>     Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?

That does not look like a bug; the data directory lockfile is created
first. To test this properly, you'll need two data directories set up
so that you can start two postmasters (use the -D switch to direct each
one to the right place; you'll also need -D for initdb). They should
start if given different port numbers (-p) or fail if the same port.
Might want to try different combinations of -i and not -i while you are
at it.

regards, tom lane

Tom,

On Tue, Jan 16, 2001 at 10:57:29AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > 2. A second postmaster will exit as appropriate but will not display the
> >    above error message. Instead it displays the following:
> >
> >     $ postmaster
> >     Lock file "/usr/local/pgsql/data/postmaster.pid" already exists.
> >     Is another postmaster (pid 385) running in "/usr/local/pgsql/data"?
>
> That does not look like a bug;

My point was that the error message generated by the second postmaster
is *different* than the one generated for the same test case on a real
UNIX (i.e., Linux) platform.

Jason

Jason Tishler <Jason.Tishler@dothill.com> writes:
>> That does not look like a bug;

> My point was that the error message generated by the second postmaster
> is *different* than the one generated for the same test case on a real
> UNIX (i.e., Linux) platform.

It is? I get exactly that message (modulo the obvious path and PID
differences) when I try it. Are you comparing to 7.0.* on Linux
perhaps? IIRC, 7.0 makes these tests in a different order than 7.1
does.

regards, tom lane

Tom,

On Wed, Jan 17, 2001 at 11:19:10AM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> >> That does not look like a bug;
>
> > My point was that the error message generated by the second postmaster
> > is *different* than the one generated for the same test case on a real
> > UNIX (i.e., Linux) platform.
>
> It is? I get exactly that message (modulo the obvious path and PID
> differences) when I try it. Are you comparing to 7.0.* on Linux
> perhaps? IIRC, 7.0 makes these tests in a different order than 7.1
> does.

Bingo! I *am* (still) using 7.0.3 on Linux. I assumed that this aspect
would not have changed between 7.0.3 and 7.1. There I go getting myself
in trouble again by assuming...

Thanks,
Jason