Re: Suggestion for To Do List - Client timeout please. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Suggestion for To Do List - Client timeout please.
Date
Msg-id 9174.998323817@sss.pgh.pa.us
In response to Suggestion for To Do List - Client timeout please.  (Grant <grant@conprojan.com.au>)
List pgsql-hackers

Peter Eisentraut <peter_e@gmx.net> writes:
> I can observe something peculiar:
> [7.0.2 works different from current]

Interesting.  The psql that I exhibited my test with was in fact 7.0.2
[quick check ... yes, current sources act the same].  So it does seem
there's something Linux-specific here.

> The backtrace shows:

> #0  0x401d9a0e in __select () from /lib/libc.so.6
> #1  0x4002f3b0 in b2c3 () from /usr/lib/libpq.so.2.1
> #2  0x4002666e in pqFlush () from /usr/lib/libpq.so.2.1
> #3  0x40022bc2 in closePGconn () from /usr/lib/libpq.so.2.1
> #4  0x40022c67 in PQfinish () from /usr/lib/libpq.so.2.1
> #5  0x805167d in main ()

> I suspect that this may be because of the questionable TCP implementation
> in Linux that you argued about with Alan Cox et al. a while ago, though I
> don't pretend to fathom the details.  Apparently something in libpq
> changed in between, however.

You changed it.  I'll bet the difference you are seeing is that
closePGconn no longer tries to send an 'X' message when closing the
socket, if we haven't reached CONNECTION_OK state.  The hang is clearly
occurring while trying to flush out that extra byte.
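
A rough sketch of the guard described above, for illustration only.  The
types and the function name here (SketchConn, SketchStatus, sketch_close)
are hypothetical stand-ins, not libpq's actual definitions; the real
closePGconn differs in detail.  The point is only that the 'X' byte is
queued when the connection reached the OK state, so a half-open
connection never needs the flush that can block in select().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not libpq's real definitions. */
typedef enum {
    SKETCH_CONN_STARTED,   /* connect in progress, handshake not finished */
    SKETCH_CONN_OK,        /* connection fully established */
    SKETCH_CONN_BAD        /* closed or failed */
} SketchStatus;

typedef struct {
    SketchStatus status;
    char         outbuf[8];   /* pending output bytes */
    size_t       outcount;    /* number of bytes queued in outbuf */
} SketchConn;

/*
 * Queue the one-byte 'X' terminate message only if the connection was
 * fully established.  Returns true when the caller would have to flush
 * the output buffer (the step that can block), false otherwise.
 */
static bool sketch_close(SketchConn *conn)
{
    bool need_flush = false;

    assert(conn != NULL);

    if (conn->status == SKETCH_CONN_OK &&
        conn->outcount < sizeof(conn->outbuf)) {
        conn->outbuf[conn->outcount++] = 'X';
        need_flush = true;
    }

    /* Either way the connection is unusable afterwards. */
    conn->status = SKETCH_CONN_BAD;
    return need_flush;
}
```

With a connection still in the STARTED state, sketch_close() queues
nothing and reports that no flush is needed, which matches the behavior
I am guessing at for current sources.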

I would agree that this is evidence of a broken TCP stack, however ---
at worst you should incur a second timeout delay here, not an indefinite
hang.  Anyone want to file a bug report with the Linux TCP boys?
        regards, tom lane