libpq drops error messages received just before backend crash - Mailing list pgsql-hackers

From Tom Lane
Subject libpq drops error messages received just before backend crash
Msg-id 23527.934772190@sss.pgh.pa.us
List pgsql-hackers
While poking at the vacuum-induced coredump we were discussing on
Friday, I noticed that psql did not report
        ERROR:  vacuum: can't destroy lock file!
even though this message was showing up in the postmaster log.
Even more interesting, psql *did* report the
        NOTICE:  AbortTransaction and not in in-progress state
that the backend emitted *after* the elog(ERROR) and just before
coredumping.

It turns out that this is a libpq deficiency: it's got the error
message, but because PQexec() was used, it's waiting around for
a 'Z' ReadyForQuery message before it hands the error message
back to the application.  Since the backend crashes, of course
the 'Z' never comes ... and when libpq detects closure of the
connection, it wipes out the stored error message in its haste
to report
        pqReadData() -- backend closed the channel unexpectedly.
                This probably means the backend terminated abnormally
                before or while processing the request.
which is all that the user gets to see, unless he thinks to 
look in the postmaster log.  Boo hiss.

(The reason the NOTICE shows up is that it's just dumped to stderr
immediately upon receipt, rather than being queued to hand back
to the application.)
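
To make the sequence of events concrete, here is a rough simulation of
the behaviour described above.  It is not libpq code: the Msg struct
and wait_for_ready() are hypothetical stand-ins, and the real message
handling is considerably more involved.  It only shows why the NOTICE
gets out while the ERROR is lost when no 'Z' ever arrives.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified backend message: a type byte plus text. */
typedef struct {
    char        type;   /* 'E' error, 'N' notice, 'Z' ReadyForQuery */
    const char *text;
} Msg;

/*
 * Simulates a PQexec()-style wait over a fixed list of messages.
 * Returns 1 if 'Z' arrived, 0 if the stream ended first.  On return,
 * result holds the message the application would get to see.
 */
static int
wait_for_ready(const Msg *msgs, int n, char *result, size_t size)
{
    int i;

    result[0] = '\0';
    for (i = 0; i < n; i++)
    {
        switch (msgs[i].type)
        {
            case 'N':
                /* notices go straight to stderr on receipt */
                fprintf(stderr, "%s\n", msgs[i].text);
                break;
            case 'E':
                /* errors are held back until 'Z' shows up */
                snprintf(result, size, "%s", msgs[i].text);
                break;
            case 'Z':
                return 1;   /* stored error, if any, is handed back */
        }
    }
    /* connection closed before 'Z': the stored error is overwritten */
    snprintf(result, size, "%s", "backend closed the channel unexpectedly");
    return 0;
}
```

Feeding this an 'E' followed by an 'N' and then end of input prints the
NOTICE on stderr but leaves only the "backend closed the channel"
text in result, which matches what psql showed.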

I have a fix in mind for this: concatenate "backend closed the channel"
to the waiting error message, instead of wiping it out.  But I think
I will wait till after Michael Ansley's long-query changes have been
committed before I start hacking on libpq again.
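
A minimal sketch of the difference between the two approaches, with
the caveat that PendingError and both functions are made up for
illustration and do not reflect libpq's actual connection structure or
function names:

```c
#include <stdio.h>
#include <string.h>

#define ERRBUF_SIZE 1024

/* Hypothetical holder for an error received but not yet returned. */
typedef struct {
    char text[ERRBUF_SIZE];     /* always NUL-terminated */
} PendingError;

static const char CLOSED_MSG[] =
    "backend closed the channel unexpectedly.\n";

/* Behaviour described above: the waiting message is wiped out. */
static void
report_closure_overwrite(PendingError *err)
{
    snprintf(err->text, sizeof err->text, "%s", CLOSED_MSG);
}

/* Proposed behaviour: tack the closure notice onto what is there. */
static void
report_closure_append(PendingError *err)
{
    size_t used = strlen(err->text);

    /* snprintf truncates rather than overrunning if space is short */
    snprintf(err->text + used, sizeof err->text - used, "%s", CLOSED_MSG);
}
```

With the append version, an application that prints the error text
after a crash would see the backend's ERROR line first and the
"backend closed the channel" line after it.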

Anyway, if you want to know what really happened right before a
backend crash, you should look in the postmaster log until this
is fixed...
        regards, tom lane

