Thread: libpq drops error messages received just before backend crash

libpq drops error messages received just before backend crash

From
Tom Lane
Date:
While poking at the vacuum-induced coredump we were discussing on
Friday, I noticed that psql did not report ERROR:  vacuum: can't destroy lock file!
even though this message was showing up in the postmaster log.
Even more interesting, psql *did* report theNOTICE:  AbortTransaction and not in in-progress state 
that the backend emitted *after* the elog(ERROR) and just before
coredumping.

It turns out that this is a libpq deficiency: it's got the error
message, but because PQexec() was used, it's waiting around for
a 'Z' ReadyForQuery message before it hands the error message
back to the application.  Since the backend crashes, of course
the 'Z' never comes ... and when libpq detects closure of the
connection, it wipes out the stored error message in its haste
to reportpqReadData() -- backend closed the channel unexpectedly.        This probably means the backend terminated
abnormally       before or while processing the request.
 
which is all that the user gets to see, unless he thinks to 
look in the postmaster log.  Boo hiss.

(The reason the NOTICE shows up is that it's just dumped to stderr
immediately upon receipt, rather than being queued to hand back
to the application.)

I have a fix in mind for this: concatenate "backend closed the channel"
to the waiting error message, instead of wiping it out.  But I think
I will wait till after Michael Ansley's long-query changes have been
committed before I start hacking on libpq again.

Anyway, if you want to know what really happened right before a
backend crash, you should look in the postmaster log until this
is fixed...
        regards, tom lane


Re: libpq drops error messages received just before backend crash

From
Tom Lane
Date:
I wrote:
> It turns out that this is a libpq deficiency: it's got the error
> message, but because PQexec() was used, it's waiting around for
> a 'Z' ReadyForQuery message before it hands the error message
> back to the application.  Since the backend crashes, of course
> the 'Z' never comes ... and when libpq detects closure of the
> connection, it wipes out the stored error message in its haste
> to report
>     pqReadData() -- backend closed the channel unexpectedly.
>             This probably means the backend terminated abnormally
>             before or while processing the request.
> which is all that the user gets to see, unless he thinks to 
> look in the postmaster log.  Boo hiss.

Although I forgot to mention it in the commit log entry, this problem
is fixed in the libpq changes I just committed to the current branch.
        regards, tom lane