Re: "incomplete startup packet" on SGI - Mailing list pgsql-general

From David Rysdam
Subject Re: "incomplete startup packet" on SGI
Date
Msg-id 43A1703F.2010505@ll.mit.edu
In response to Re: "incomplete startup packet" on SGI  (David Rysdam <drysdam@ll.mit.edu>)
List pgsql-general
David Rysdam wrote:

> David Rysdam wrote:
>
>> Tom Lane wrote:
>>
>>> David Rysdam <drysdam@ll.mit.edu> writes:
>>>
>>>
>>>> Just finished building and installing on *Sun* (also
>>>> "--without-readline", not that I think that could be the issue):
>>>> Works fine.  So it's something to do with the SGI build in particular.
>>>>
>>>
>>>
>>>
>>> More likely it's something to do with weird behavior of the SGI
>>> kernel's TCP stack.  I did a little googling for "transport endpoint
>>> is not connected" without turning up anything obviously related, but
>>> that or ENOTCONN is probably what you need to search on.
>>>
>>>             regards, tom lane
>>>
>>> ---------------------------(end of
>>> broadcast)---------------------------
>>> TIP 2: Don't 'kill -9' the postmaster
>>>
>>>
>>>
>>>
>> It's acting like a race condition or pointer problem.  When I add
>> random debug printfs/PQflushs to libpq it sometimes works.
>>
> Not a race condition: no threads.
> Not a memory error: Electric Fence reports nothing.  And it works when
> Electric Fence is running, whereas a binary that uses the same libpq
> without linking efence does not work.
>
I know nobody is interested in this, but I should document the
"solution" for anyone who finds this thread in the archives.  My theory
is that Irix cannot keep up with how fast the PostgreSQL client is
going, and that the debug statements and Electric Fence slow the client
down enough for Irix to catch up and make sure the socket really is
there, connected and working.  To that end, I inserted a sleep(1) in
fe-connect.c just before the pqPacketSend(...startpacket...) call.
It's stupid and hacky, but it gets me where I need to be, and maybe this
hint will inspire somebody who knows (and cares) about Irix to find a
real fix.







