Thread: Socket problem using beta2 on Windows-XP

Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Hi,
I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box. 
It runs fine but I get repeated messages like this in the log:
  2005-09-29 00:41:09 FATAL:  could not duplicate socket 1880 for use in backend: error code 10038

and for each message printed, a new postgres process is created. To make 
things worse, those processes do not die when I stop the service.

I use sysinternals tcpview to monitor my sockets. I know that no other 
process is using 1880. Each started postgres process will occupy two, 
seemingly random ports that apparently form a loop somehow. This is a 
typical entry:
  <non-existent>:3136    TCP    127.0.0.1:1554    127.0.0.1:1555    ESTABLISHED
  <non-existent>:3136    TCP    127.0.0.1:1555    127.0.0.1:1554    ESTABLISHED

The weird thing is that there is no process with pid 3136 (hence the 
name <non-existent>). There is a postgres process with another pid in my 
process listing. If I kill that, the <non-existent> entries go away.

Looks like pid 3136 is talking to itself. A pipe() followed by failure 
to start the new process perhaps?

Regards,
Thomas Hallgren



Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
Do you by any chance run any antivirus or firewall software? If so, can
you try removing it (note! actual uninstall, not just disabling it!)

//Magnus


Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Nope, no anti-virus and no firewall (other than the box that fronts my 
home-network to the outside world).

- thomas



Re: Socket problem using beta2 on Windows-XP

From
Martijn van Oosterhout
Date:
On Thu, Sep 29, 2005 at 08:50:30AM +0200, Thomas Hallgren wrote:
> Hi,
> I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box.
> It runs fine but I get repeated messages like this in the log:
>
>   2005-09-29 00:41:09 FATAL:  could not duplicate socket 1880 for use
> in backend: error code 10038

That's from postmaster.c:write_inheritable_socket(). Error 10038 is
WSAENOTSOCK. Very odd, time to get out the debugger? Get a backtrace at
least.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
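
For reference, the numeric code in the log line can be decoded with a small
lookup table. This is only a minimal sketch: the function name
wsa_error_name and the table are made up for illustration, and the table
lists just a handful of common Winsock codes, including the 10038
(WSAENOTSOCK) reported above.

```c
#include <stddef.h>

/* A few well-known Winsock error codes (WSABASEERR is 10000). */
struct wsa_error_entry {
    int code;
    const char *name;
};

static const struct wsa_error_entry wsa_error_table[] = {
    {10035, "WSAEWOULDBLOCK"},
    {10038, "WSAENOTSOCK"},      /* operation attempted on a non-socket */
    {10048, "WSAEADDRINUSE"},
    {10054, "WSAECONNRESET"},
    {10055, "WSAENOBUFS"},
    {10061, "WSAECONNREFUSED"},
};

/* Return the symbolic name for a Winsock error code, or "unknown". */
const char *wsa_error_name(int code)
{
    size_t i;

    for (i = 0; i < sizeof(wsa_error_table) / sizeof(wsa_error_table[0]); i++) {
        if (wsa_error_table[i].code == code)
            return wsa_error_table[i].name;
    }
    return "unknown";
}
```

Calling wsa_error_name(10038) yields "WSAENOTSOCK", i.e. the handle passed
to the socket call was not recognised as a socket.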

Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
Hmm. Bummer.

Anyway. The netstat indicates that the pipe() call works. The order is
pretty much:

parent: create socket pair, connected to each other.
parent: Duplicate socket [this is what fails]
parent: close own copy of socket
child: recreate socket from structure [this is never called, thus the
new socket is never "attached" to a process]

Now *why* it's doing this, I have no idea.

Questions:
1) Does it actually work ;-) and just log the error anyway?
2) Does this happen on *every* connection?
3) Can you reproduce this on a different machine, or just one?

//Magnus
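
The duplicate/recreate steps listed above follow the usual Winsock pattern
for handing a socket to another process. The sketch below is not the
PostgreSQL source; export_socket and import_socket are hypothetical names
used only for illustration. It builds only on Windows (link with ws2_32)
and assumes WSAStartup has already been called.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <stdio.h>

/*
 * Parent side: ask Winsock for a description of socket s that the
 * process with id child_pid can use to recreate it.
 * Returns 0 on success, -1 on failure.
 */
int export_socket(SOCKET s, DWORD child_pid, WSAPROTOCOL_INFO *info)
{
    if (WSADuplicateSocket(s, child_pid, info) != 0) {
        /* This is the step that fails with 10038 in the report above. */
        fprintf(stderr, "could not duplicate socket %d: error code %d\n",
                (int) s, WSAGetLastError());
        return -1;
    }
    return 0;
}

/*
 * Child side: rebuild a usable socket from the structure filled in by
 * the parent. Returns INVALID_SOCKET on failure.
 */
SOCKET import_socket(WSAPROTOCOL_INFO *info)
{
    return WSASocket(FROM_PROTOCOL_INFO, FROM_PROTOCOL_INFO,
                     FROM_PROTOCOL_INFO, info, 0, 0);
}
#endif /* _WIN32 */
```

If the parent closes its own copy after a failed export_socket, and the
child never calls import_socket, the connection is left without an owning
process, which would match the <non-existent> entries in tcpview.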



Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Hi,
I'm sorry, time was short today. To answer your questions:
1. I can run psql and other client programs. Everything works fine. 
But while doing it, I get a lot of zombies in the tcpview and 
eventually, I think I run out of ports. Psql just hangs when I try to 
connect. When that happens, I have two choices; a) Stop the service and 
then kill off all processes by hand (there's now a *lot* of them), or b) 
reboot.
2. It happens while the postmaster is idle. If I leave it idle for a 
while and then come back, I'll have a whole bunch of new processes in my 
task-manager and zombies in tcpview.
3. I don't have another machine handy for this right now.

It sounds like you know where it happens. Martijn requested a 
stack trace. Do you still need that? If you do, I'll try to find some time 
this weekend.

Regards,
Thomas Hallgren



Re: Socket problem using beta2 on Windows-XP

From
Alvaro Herrera
Date:
On Thu, Sep 29, 2005 at 11:43:37PM +0200, Thomas Hallgren wrote:

> 2. It happens while the postmaster is idle. If I leave it idle for a 
> while and then come back, I'll have a whole bunch of new processes in my 
> task-manager and zombies in tcpview.

Hmm ... how many processes?  Did you enable autovacuum perchance?  If
so, does the number of processes correspond approximately to the
"autovacuum_naptime"?

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"La espina, desde que nace, ya pincha" (Proverbio africano)


Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> > 2. It happens while the postmaster is idle. If I leave it
> idle for a
> > while and then come back, I'll have a whole bunch of new
> processes in
> > my task-manager and zombies in tcpview.
>
> Hmm ... how many processes?  Did you enable autovacuum
> perchance?  If so, does the number of processes correspond
> approximately to the "autovacuum_naptime"?

IIRC, the win32 installer will enable autovacuum by default. And yes,
autovacuum was my first thought as well after Thomas's last mail - that
would be a good explanation for why it happens when the postmaster is
idle.

//Magnus


Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Magnus Hagander wrote:

>IIRC, the win32 installer will enable autovacuum by default. And yes,
>autovacuum was my first thought as well after Thomas last mail - that
>would be a good explanation to why it happens when the postmaster is
>idle.
>  
>
I used the win32 installer defaults, so autovacuum being enabled is 
probably a safe assumption.

- thomas




Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> >IIRC, the win32 installer will enable autovacuum by default.
> And yes,
> >autovacuum was my first thought as well after Thomas last
> mail - that
> >would be a good explanation to why it happens when the postmaster is
> >idle.
> >
> >
> I used  the win32 installer defaults so autovacuum is
> probably a safe assumption.

Right. Please try turning it off and see if the problem goes away.

//Magnus


Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Magnus Hagander wrote:

>Right. Please try turning it off and see if the problem goes away.
>  
>
It does (go away).

- thomas





Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Magnus Hagander wrote:

>Right. Please try turning it off and see if the problem goes away.
>  
>
No, wait! It does *not* go away. Do I need to do anything more than 
setting this in my postgresql.conf file:

autovacuum = false            # enable autovacuum subprocess?

and restart the service?

The two zombie entries occur directly when I start the service, then 
there are two new entries popping up every minute.

- thomas




Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> >Right. Please try turning it off and see if the problem goes away.
> >
> >
> No, wait! It does *not* go away. Do I need to do anything
> more than setting this in my postgresql.conf file:
>
> autovacuum = false            # enable autovacuum subprocess?
>
> and restart the service?
>
> The two zombie entries occurs directly when I start the
> service, then there's two new entries popping up every minute.

Yes, that should be enough.

Hmm. Weird!

If you can get a backtrace from the point where the error msg shows up,
that certainly would help - this means it's not coming from where we
thought it was coming from :-(

//Magnus


Re: Socket problem using beta2 on Windows-XP

From
Alvaro Herrera
Date:
On Fri, Sep 30, 2005 at 08:29:07AM +0200, Thomas Hallgren wrote:
> Magnus Hagander wrote:
> 
> >Right. Please try turning it off and see if the problem goes away.
> > 
> >
> No, wait! It does *not* go away. Do I need to do anything more than 
> setting this in my postgresql.conf file:
> 
> autovacuum = false            # enable autovacuum subprocess?
> 
> and restart the service?
> 
> The two zombie entries occurs directly when I start the service, then 
> there's two new entries popping up every minute.

If it's two zombies per minute, then I bet it's the stat collector and
stat bufferer.  They are restarted by the postmaster if not found to be
running.

The weird thing is that the postmaster _should_ call wait() for them if
it detects that they died (when receiving a SIGCHLD signal AFAIR).  If
it doesn't, maybe it indicates there's a problem with the signal
handling on Win32.

-- 
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 17.7", W 73º 14' 26.8"
"We are who we choose to be", sang the goldfinch
when the sun is high (Sandman)
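
On a Unix-like system, the "call wait() when SIGCHLD arrives" behaviour
described above looks roughly like the sketch below. It is a POSIX
illustration only; it does not show how the Win32 port emulates signals,
and the names install_reaper and spawn_and_reap_one are invented for the
example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler: collect every child that has exited, without blocking. */
static void reap_children(int signo)
{
    int status;

    (void) signo;
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped_children++;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_reaper(void)
{
    struct sigaction sa;

    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Fork a child that exits at once and wait until the handler has reaped
 * it. Returns the number of children reaped, or -1 if fork failed.
 */
int spawn_and_reap_one(void)
{
    sigset_t block, old, wait_mask;
    int before;
    pid_t pid;

    /* Hold SIGCHLD back so it cannot arrive before we start waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    before = reaped_children;
    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);               /* child: terminate immediately */

    while (reaped_children == before)
        sigsuspend(&wait_mask); /* atomically unblock SIGCHLD and sleep */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped_children - before;
}
```

A child that exits but is never collected this way stays around as a
zombie, which is the situation being suspected here.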


Re: Socket problem using beta2 on Windows-XP

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> If it's two zombies per minute, then I bet it's the stat collector and
> stat bufferer.  They are restarted by the postmaster if not found to be
> running.

That would make some sense, because the stat processes need to set up new
sockets (for the pipe between them).  The autovacuum theory didn't hold
any water in my eyes because autovacuum doesn't create any new sockets.

However, why two zombies?  That would mean that the grandchild process
started, which should mean that the pipe was already created ...

Does Windows have any equivalent of strace whereby we could watch what's
happening during stats process launch?
        regards, tom lane


Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> > If it's two zombies per minute, then I bet it's the stat
> collector and
> > stat bufferer.  They are restarted by the postmaster if not
> found to
> > be running.
>
> That would make some sense, because the stat processes need
> to set up new sockets (for the pipe between them).  The
> autovacuum theory didn't hold any water in my eyes because
> autovacuum doesn't create any new sockets.
>
> However, why two zombies?  That would mean that the
> grandchild process started, which should mean that the pipe
> was already created ...
>
> Does Windows have any equivalent of strace whereby we could
> watch what's happening during stats process launch?


First of all, I won't be able to dig into this any more until next week
- sorry about that. But others are always free to :-)

There is no strace equivalent built in, but you can get an add-on from
http://www.bindview.com/Services/RAZOR/Utilities/Windows/strace_readme.cfm.
Don't put it on a production box permanently, though; it tends to
cause BSODs in some cases.

//Magnus


Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Tom Lane wrote:

>However, why two zombies?  That would mean that the grandchild process
>started, which should mean that the pipe was already created ...
>  
>
To clarify, I am talking about the tcpview window and connections, and thus 
zombie connections. They both belong to the same pid and seem to point 
to each other. The actual process no longer exists (it can't be viewed 
anywhere).

Regards,
Thomas Hallgren




Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
Martijn van Oosterhout wrote:

>On Thu, Sep 29, 2005 at 08:50:30AM +0200, Thomas Hallgren wrote:
>  
>
>>Hi,
>>I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box. 
>>It runs fine but I get repeated messages like this in the log:
>>
>>  2005-09-29 00:41:09 FATAL:  could not duplicate socket 1880 for use 
>>in backend: error code 10038
>>    
>>
>
>That's from postmaster.c:write_inheritable_socket(). Error 10038 is
>WSAENOTSOCK. Very odd, time to get out the debugger? Get a backtrace at
>least.
>  
>
I finally managed to debug the postmaster and I'm now pretty sure the 
message is not from the postmaster itself. I put a breakpoint where the 
message is printed (postmaster.c:3762) and in errstart() where elevel >= 
ERROR (elog.c:152) but I never get there although the message is 
printed. I know that my debugger works because if I put a break on 
elog.c:194 it stops for other messages.

Regards,
Thomas Hallgren




Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
I added some traces to the code. I know that the following happens when 
I start a postmaster.

StartupDatabase will call internal_fork_exec, it calls 
write_inheritable_socket 4 times and succeeds.

During the first iteration of ServerLoop:
 StartBackgroundWriter will call internal_fork_exec and succeed.
 pgstat_forkexec will call internal_fork_exec and succeed.

In the second iteration of ServerLoop, pgstat_forkexec will again call 
internal_fork_exec. This time it fails.
According to the log it fails on line:
   write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid);

i.e. on its second call to write_inheritable_socket. The failure is in a 
postgres.exe process, not postmaster.exe (and that's why I can't debug 
properly on Windoz).

Hope this helps.

Regards,
Thomas Hallgren




Re: Socket problem using beta2 on Windows-XP

From
Martijn van Oosterhout
Date:
On Sun, Oct 02, 2005 at 12:20:05PM +0200, Thomas Hallgren wrote:
> I added some traces to the code. I know that the following happens when
> I start a postmaster.

<snip>

> In the second iteration of ServerLoop, pgstat_forkexec will again call
> internal_fork_exec. This time it fails.
> According to the log it fails on line:
>
>    write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid);

Well, pgStatSock is the only SOCK_DGRAM socket; all the others are
SOCK_STREAM. Maybe that's the difference? It's also connected to
itself, although for DGRAM sockets that's not that special.

The documentation isn't totally clear about this. Also, the error thrown
should terminate the process, yet it obviously doesn't. Very odd.

Any Windows programmers with ideas?
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
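
To make "a DGRAM socket connected to itself" concrete, here is a minimal
sketch using the portable BSD socket calls (on Windows the Winsock
equivalents would be needed). It is an illustration under assumptions, not
the pgstat code; open_self_connected_udp and udp_round_trip are made-up
names.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to 127.0.0.1 on an ephemeral port and
 * connect it to its own address. Returns the descriptor, or -1.
 */
int open_self_connected_udp(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;          /* let the kernel pick a free port */

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0 ||
        connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Send one byte and read it back from the same socket, waiting at most
 * one second. Returns 1 if the byte came back intact, 0 otherwise.
 */
int udp_round_trip(int fd)
{
    char out = 'x', in = 0;
    fd_set readable;
    struct timeval timeout = {1, 0};

    if (send(fd, &out, 1, 0) != 1)
        return 0;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    if (select(fd + 1, &readable, NULL, NULL, &timeout) != 1)
        return 0;

    if (recv(fd, &in, 1, 0) != 1)
        return 0;
    return in == out;
}
```

Anything written to such a socket comes back on the same descriptor, so a
single handle serves as both ends of the channel.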

Re: Socket problem using beta2 on Windows-XP

From
Thomas Hallgren
Date:
With great help from Magnus, who advised me to use lspfix from cexx.org 
to list my LSPs, I found that I had gapsp.dll, "Neoteris DNS Provider", 
installed. Uninstalling the Neoteris software made this problem go away.

Regards,
Thomas Hallgren



Re: Socket problem using beta2 on Windows-XP

From
Alvaro Herrera
Date:
Thomas Hallgren wrote:

> With great help from Magnus, who advised me to use lspfix from cexx.org 
> to list my lsp's, I found that I had gapsp.dll, "Neoteris DNS Provider" 
> installed. An uninstall of the Neoteris software made this problem go away.

I guess the question is, why is a "DNS Provider" software blocking
socket creation?  Is there a way we could work around that?

-- 
Alvaro Herrera                         Architect, http://www.EnterpriseDB.com
"El destino baraja y nosotros jugamos" (A. Schopenhauer)


Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> > With great help from Magnus, who advised me to use lspfix from
> > cexx.org to list my lsp's, I found that I had gapsp.dll,
> "Neoteris DNS Provider"
> > installed. An uninstall of the Neoteris software made this
> problem go away.
>
> I guess the question is, why is a "DNS Provider" software
> blocking socket creation?  Is there a way we could work around that?
>

It's just another version of the "Broken LSP" that we've been having
problems with before. But before, it's only been AV and firewall stuff.

I guess they somehow put an LSP in there to intercept DNS packets or
something. Completely broken design IMHO, but that's a different thing
;-) And they apparently don't support socket inheritance. The only way
we can work around them breaking the concept of socket inheritance is to
stop using it. Which would mean going multithread instead of
multiprocess, which isn't very likely...

To reiterate the basic point: The broken LSP breaks a fundamental
promise in the sockets API that we absolutely require. The bug is
completely within the LSP.

//Magnus




Re: Socket problem using beta2 on Windows-XP

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> To reiterate the basic point: The broken LSP breaks a fundamental
> promise in the sockets API that we absolutely require. The bug is
> completely within the LSP.

ISTM that maybe what we have here is a documentation shortcoming.
I'm thinking that our Windows FAQ ought to suggest troubleshooting
socket-related problems by removing LSPs one at a time.
        regards, tom lane


Re: Socket problem using beta2 on Windows-XP

From
"Magnus Hagander"
Date:
> > To reiterate the basic point: The broken LSP breaks a fundamental
> > promise in the sockets API that we absolutely require. The bug is
> > completely within the LSP.
>
> ISTM that maybe what we have here is a documentation shortcoming.
> I'm thinking that our Windows FAQ ought to suggest
> troubleshooting socket-related problems by removing LSPs one
> at a time.

We used to have this, but we removed it when we added the code that fixed
the problem in 95% of the cases. It's probably a good idea to bring it
back :-(

//Magnus