============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================
Your name : Shez
Your email address : shez@nsl.net
Category : runtime: back-end
Severity : critical
Summary: Server hangs, with backends still running, on accept() failure
System Configuration
--------------------
Operating System : RH Linux 5.2(x86 on K6II)
PostgreSQL version : 6.4.2
Compiler used : gcc-2.7.2.3-14
Hardware:
---------
RH Linux 5.2(x86 on K6II)
Linux media.nsl.net 2.0.36 #1 Tue Oct 13 22:17:11 EDT 1998 i586 unknown
Versions of other tools:
------------------------
--------------------------------------------------------------------------
Problem Description:
--------------------
A lightly loaded server, with about 16 backends running at
any time, hung dead. New connection attempts simply hung, and
existing backend processes also hung. The last entry in the log was:
ERROR: postmaster: StreamConnection: accept: Invalid argument
This seems to have come up on the mailing list before,
but I could see no resolution there.
As far as I can see it happens if the accept call fails on
an inet connection.
I have a small test program (below) which causes a similar but
not identical crash.
--------------------------------------------------------------------------
Test Case:
----------
This connects to a given machine/port combination and then
attempts to kill itself before the operation can be
completed, resulting in the accept call on the server
returning:
StreamConnection: accept: Connection reset by peer
I have put it up on http://sheznet.nsl.net/crash.c rather
than try and paste it into this form.
usage: ./crash dbhost 5432
The crash program only works for me when run from a different
machine than the server.
Note that this is a serious security issue: anybody who
can send packets to a listening server is able to hang it.
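
For readers who cannot fetch crash.c, here is a minimal sketch
of one way to provoke a reset on a freshly opened connection.
It is NOT the crash.c program referenced above, and it swaps in
a different technique: instead of killing the client process, it
sets SO_LINGER with a zero timeout so that close() aborts the
connection with a reset rather than a normal shutdown. The
function name connect_and_reset is hypothetical. Whether the
server's accept() actually fails depends on timing and on the
server's operating system.

```c
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not the reporter's code):
 * connect to host:port over TCP, then abort the connection at once.
 * Returns 0 if the connection was opened and aborted, -1 otherwise.
 */
static int connect_and_reset(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    struct linger lg;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;

    /* Linger enabled with a zero timeout: close() discards the
     * connection and sends a reset instead of an orderly FIN. */
    lg.l_onoff = 1;
    lg.l_linger = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling connect_and_reset("dbhost", "5432") from a small main()
would mirror the usage line above.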
--------------------------------------------------------------------------
Solution:
---------
I believe that when accept() fails the backend should at
least shut the server down properly so that supervise programs
can restart it, or just silently ignore the failure and carry
on; the latter has worked for me in similar situations.
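
The two options above can be sketched as follows. This is an
illustration under stated assumptions, not PostgreSQL's actual
connection code: the names accept_error_is_transient and
accept_next_client are hypothetical, and the choice of which
errno values to ignore is a judgment call. Errors that concern
only the one pending connection are logged and skipped; anything
else (including EINVAL, "Invalid argument") is returned to the
caller, which can then exit cleanly and be restarted.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if an accept() failure affects only
 * the single pending connection, so the listener can keep running.
 * The list of errno values is an assumption, not taken from PostgreSQL.
 */
static int accept_error_is_transient(int err)
{
    switch (err) {
    case EINTR:         /* interrupted by a signal */
    case ECONNABORTED:  /* peer gave up before accept() completed */
    case ECONNRESET:    /* peer reset the connection, as in the report */
#ifdef EPROTO
    case EPROTO:        /* protocol error on the pending connection */
#endif
        return 1;
    default:
        return 0;       /* e.g. EINVAL, EBADF: the listener is unusable */
    }
}

/*
 * Hypothetical wrapper: keep accepting until a client arrives.
 * Transient failures are logged and ignored. On any other failure
 * it returns -1 with errno set, so the caller can shut down properly.
 * Assumes listen_fd is a blocking socket.
 */
static int accept_next_client(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd >= 0)
            return fd;
        if (accept_error_is_transient(errno)) {
            fprintf(stderr, "accept: %s (ignored)\n", strerror(errno));
            continue;
        }
        return -1;
    }
}
```

A caller that gets -1 back would log the error and exit with a
non-zero status rather than stay alive in a hung state.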
Sorry I don't have time to investigate further, but please
feel free to contact me if you have further questions.
Sincerely,
Shez
--------------------------------------------------------------------------