Re: SIGPIPE gripe - Mailing list pgsql-hackers

From Tom Lane
Subject Re: SIGPIPE gripe
Msg-id 4968.894298039@sss.pgh.pa.us
In response to Re: SIGPIPE gripe  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Re: SIGPIPE gripe
List pgsql-hackers
I said:
> The real question is what scenario is causing SIGPIPE to be delivered
> in the first place.  A search of the pgsql-hackers archives for
> "SIGPIPE" yields only a mention of seeing SIGPIPE some of the time
> (not always) when trying to connect to a nonexistent database.

OK, I've been able to reproduce this; I understand the problem and
I have a proposed fix.

Here's the scenario.  On the server side, this happens:

    Postmaster receives new connection request from client

    (possible authentication cycle here)

    Postmaster sends "AUTH OK" to client

    Postmaster forks backend

    Backend discovers that database name is invalid

    Backend sends error message

    Backend closes connection and exits

Meanwhile, once the client receives the "AUTH OK" it initiates
an empty query cycle (which is commented as intending to discover
whether the database exists!):

    ...

    Client receives "AUTH OK"

    Client sends "Q " query

    Client waits for response

The problem, of course, is that if the backend manages to exit
before the client gets to send its empty query, then the client
is writing on a closed connection.  Boom, SIGPIPE.
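
The race can be reproduced in miniature. The following is only a sketch under stated assumptions: it uses a Unix-domain socketpair in place of the real postmaster/backend connection, the function name and the message bytes are made up for illustration, and it is not the actual libpq code. Python ignores SIGPIPE by default, so the failed write surfaces as an exception rather than killing the process; a C client with the default signal disposition would be terminated instead.

```python
import socket


def send_query_after_backend_exit():
    """Simulate the client writing after the backend has already exited.

    Returns the name of the exception raised by the write, or "sent" if
    the write unexpectedly succeeded.
    """
    client, backend = socket.socketpair()  # stand-in for the FE/BE connection
    try:
        # Backend side: report the startup error, then close and exit.
        backend.sendall(b"Eno such database\0")  # illustrative message bytes
        backend.close()

        # Client side: it has seen "AUTH OK" and now sends its empty query.
        try:
            client.sendall(b"Q ")
        except ConnectionError as exc:
            # With SIGPIPE ignored, the write fails with EPIPE instead.
            return type(exc).__name__
        return "sent"
    finally:
        client.close()


if __name__ == "__main__":
    print(send_query_after_backend_exit())
```

Over a real TCP connection the first write after the peer closes may appear to succeed, which would fit the report that the SIGPIPE shows up only some of the time.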

I thought about hacking around this by having the postmaster check
the validity of the database name before it does the authorization
cycle.  But that's a bad idea; first because it allows unauthorized
users to probe the validity of database names, and second because
it only fixes this particular instance of the problem.  The general
problem is that the FE/BE protocol does not make provision for errors
reported by the backend during startup.  ISTM there are many ways in
which the BE might fail during startup, not all of which could
reasonably be checked in advance by the postmaster.

So ... since we're altering the protocol anyway ... the right fix is
to alter the protocol a little more.  Remember that "Z" message that
the backend is now sending at the end of every query cycle?  What
we ought to do is make the BE send "Z" at completion of startup,
as well.  (In other words, "Z" will really mean "Ready for Query"
rather than "Query Done".  This is actually easier to implement in
postgres.c than the other way.)  Now the client's startup procedure
looks like

    ...

    Client receives "AUTH OK"

    Client waits for "Z"; if it gets "E" instead, BE startup failed.

I suspect it's not really necessary to do an empty query after this,
but we may as well leave that in there for additional reliability.
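
The proposed client-side wait can be sketched as follows. This is a hedged illustration, not the libpq implementation: the framing (a single type byte, with "E" followed by NUL-terminated text) is a simplifying assumption, and `wait_for_ready`, `BackendStartupError`, and `simulate` are hypothetical names. The point is that the client only reads until it sees "Z" or "E", so it never writes on a connection the backend may already have closed.

```python
import socket


class BackendStartupError(Exception):
    """Raised when the backend reports failure instead of readiness."""


def wait_for_ready(sock):
    """Block until the backend sends "Z" (ready) or "E" (startup failed)."""
    msg_type = sock.recv(1)
    if msg_type == b"Z":
        return True
    if msg_type == b"E":
        # Assumed framing: error text runs up to a NUL byte or end of stream.
        text = bytearray()
        while True:
            ch = sock.recv(1)
            if ch in (b"", b"\0"):
                break
            text += ch
        raise BackendStartupError(text.decode("ascii", "replace"))
    if msg_type == b"":
        raise BackendStartupError("connection closed before ready")
    raise BackendStartupError(f"unexpected message type {msg_type!r}")


def simulate(backend_bytes):
    """Play a canned backend reply to the client and report the outcome."""
    client, backend = socket.socketpair()
    try:
        backend.sendall(backend_bytes)
        backend.close()  # the backend may exit right after replying
        try:
            wait_for_ready(client)
            return "ready"
        except BackendStartupError as exc:
            return f"failed: {exc}"
    finally:
        client.close()


if __name__ == "__main__":
    print(simulate(b"Z"))
    print(simulate(b"Eno such database\0"))
```

Because the client reads before it writes, the backend's error text reaches it even when the backend has already exited.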

            regards, tom lane
