Re: segfault in pg 8.4, CurrentResourceOwner == NULL while processing SIGTERM - Mailing list pgsql-bugs

From Tom Lane
Subject Re: segfault in pg 8.4, CurrentResourceOwner == NULL while processing SIGTERM
Date
Msg-id 4948.1269027484@sss.pgh.pa.us
Whole thread Raw
In response to segfault in pg 8.4, CurrentResourceOwner == NULL while processing SIGTERM  (Dennis Koegel <dk@openit.de>)
List pgsql-bugs
Dennis Koegel <dk@openit.de> writes:
> [ crash when a SQL function is killed by SIGTERM ]

Frankly, I would expect this to fail with even higher probability in
8.1.  We didn't even pretend that SIGTERM'ing individual backends was
a safe thing to do in 8.1.  You ought to rethink your connection
management methodology.

Now having said that, there are a couple of bad things happening in
this stack trace:

1. sql_exec_error_callback() is supposing that it's safe for it to do
a SearchSysCache lookup.  On reflection this is a fundamentally bad
idea, because an error context callback is typically going to get
invoked in already-aborted transactions, and it's never safe to initiate
new catalog accesses in such a situation.  The error is probably masked
in most cases by the desired pg_proc row being already in cache, but
if it had gotten flushed from cache for some reason then bad things
would happen.  And then of course we have this case, where we're outside
a transaction entirely by the time control arrives at log_disconnections.

2. log_disconnections() is blithely doing an ordinary ereport(), which
will result in attempting to annotate the disconnection log message
with whatever context data might be supplied by active callback stack
entries.  I cannot imagine any situation where callback stack entries
would actually be relevant to the disconnection log message, and as we
see here it's a context where the callback functions are very possibly
going to misbehave.  What seems like a good idea is to flush the error
callback stack to empty, either in log_disconnections itself or maybe
better at the beginning of proc_exit().

I think both of these things need to be changed.  Comments?

BTW, a quick look through the sources says that
sql_function_parse_error_callback is also making the unsafe assumption
that it can do SearchSysCache, but I don't see any other error callback
that is doing anything unsafe.  Some other callbacks with similar needs,
such as sql_inline_error_callback, expect the callback setup code to
have fetched the tuple for them, and that seems like what we want for
SQL functions too.

            regards, tom lane

pgsql-bugs by date:

Previous
From: "Jon Nelson"
Date:
Subject: BUG #5384: pg_dump hard-codes use of /tmp
Next
From: Tom Lane
Date:
Subject: Re: BUG #5384: pg_dump hard-codes use of /tmp