Hi,
Today a kernel panic crashed our DB server, and all libpq clients (both C and
Perl) got stuck in poll() for hours, even after the server was back up,
i.e. far longer than any TCP timeout should allow:
#0 0x00002b2283b31c8f in poll () from /lib/libc.so.6
#1 0x00002b228446f4af in PQmblen () from /usr/lib/libpq.so.4
#2 0x00002b228446f590 in pqWaitTimed () from /usr/lib/libpq.so.4
#3 0x00002b228446ee72 in PQgetResult () from /usr/lib/libpq.so.4
#4 0x00002b228446ef4e in PQgetResult () from /usr/lib/libpq.so.4
#5 0x00002b2284341ffe in pg_st_prepare_statement ()
   from /usr/local/lib/perl/5.8.8/auto/DBD/Pg/Pg.so
#6 0x00002b228434eb25 in pg_st_execute ()
[...]
It seems that under Linux poll() is never notified that the connection has gone away
(https://lists.linux-foundation.org/pipermail/bugme-new/2003-April/008335.html -
a very old report; I can't find anything newer), so I am unsure how to
handle such a case gracefully. I think I am hitting the same problem as reported in
http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg105844.html
but that thread reaches no real conclusion. Any suggestions? Can libpq perhaps be
configured to use epoll or select instead? Is our libpq (8.1.19-0etch1) too old?
The server is version 8.4.4, and the clients connect over TCP (no SSL).
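For reference, one client-side mitigation that might apply here is TCP keepalive. A crashed host sends no FIN or RST, so a client that is only waiting to read has nothing in flight that could time out, and poll() can block indefinitely. Below is a minimal sketch, assuming Linux and an already-connected socket. The helper name enable_tcp_keepalive and its parameters are hypothetical, not part of libpq, and the example values are untested guesses rather than recommendations.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a libpq function): turn on TCP keepalive for an
 * already-created socket so the kernel probes an idle peer and reports an
 * error if it has silently disappeared.
 *
 *   idle_s     - seconds of idleness before the first probe (TCP_KEEPIDLE)
 *   interval_s - seconds between unanswered probes        (TCP_KEEPINTVL)
 *   count      - unanswered probes before giving up       (TCP_KEEPCNT)
 *
 * TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
int enable_tcp_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    /* Precondition: all three tunables must be positive. */
    assert(idle_s > 0 && interval_s > 0 && count > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

In a C client the descriptor would presumably come from PQsocket(conn) once the connection is established, e.g. enable_tcp_keepalive(PQsocket(conn), 60, 10, 5). With those example values a dead peer should surface as a socket error after roughly 60 + 10 * 5 seconds of silence. Whether setting options on libpq's socket from outside is acceptable for a given libpq version is an assumption to verify.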
Regards,
Marinos