Re: Printing backtrace of postgres processes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Printing backtrace of postgres processes
Date
Msg-id 523589.1611091371@sss.pgh.pa.us
Whole thread Raw
In response to Re: Printing backtrace of postgres processes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Printing backtrace of postgres processes  (vignesh C <vignesh21@gmail.com>)
Re: Printing backtrace of postgres processes  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jan 19, 2021 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think it's got security hazards as well.  If we restricted the
>> feature to cause a trace of only one process at a time, and required
>> that process to be logged in as the same database user as the
>> requestor (or at least someone with the privs of that user, such
>> as a superuser), that'd be less of an issue.

> I am not sure I see a security hazard here. I think the checks for
> this should have the same structure as pg_terminate_backend() and
> pg_cancel_backend(); whatever is required there should be required
> here, too, but not more, unless we have a real clear reason for such
> an inconsistency.

Yeah, agreed.  The "broadcast" option seems inconsistent with doing
things that way, but I don't have a problem with being able to send
a trace signal to the same processes you could terminate.

>> I share your estimate that there'll be small-fraction-of-a-percent
>> hazards that could still add up to dangerous instability if people
>> go wild with this.

> Right. I was more concerned about whether we could, for example, crash
> while inside the function that generates the backtrace, on some
> platforms or in some scenarios. That would be super-sad. I assume that
> the people who wrote the code tried to make sure that didn't happen
> but I don't really know what's reasonable to expect.

Recursion is scary, but it should (I think) not be possible if this
is driven off CHECK_FOR_INTERRUPTS.  There will certainly be none
of those in libbacktrace.

One point here is that it might be a good idea to suppress elog.c's
calls to functions in the error context stack.  As we saw in another
connection recently, allowing that to happen makes for a *very*
large increase in the footprint of code that you are expecting to
work at any random CHECK_FOR_INTERRUPTS call site.

BTW, it also looks like the patch is doing nothing to prevent the
backtrace from being sent to the connected client.  I'm not sure
what I think about whether it'd be okay from a security standpoint
to do that on the connection that requested the trace, but I sure
as heck don't want it to happen on connections that didn't.  Also,
whatever you think about security concerns, it seems like there'd be
pretty substantial reentrancy hazards if the interrupt occurs
anywhere in code dealing with normal client communication.

Maybe, given all of these things, we should forget using elog at
all and just emit the trace with fprintf(stderr).  That seems like
it would decrease the odds of trouble by about an order of magnitude.
It would make it more painful to collect the trace in some logging
setups, of course.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Erik Rijkers
Date:
Subject: cfbot building docs - serving results
Next
From: Thomas Munro
Date:
Subject: Re: compression libraries and CF bot