Thread: Proposal: new ereport option "errdetail_log"

Proposal: new ereport option "errdetail_log"

From
Tom Lane
Date:
I'm not sure how many people noticed this -patches discussion:
http://archives.postgresql.org/pgsql-patches/2008-03/msg00282.php
so I'm reposting on -hackers.

The conclusion I drew from that thread is that we should send the text
of all queries involved in a deadlock to the server log, regardless
of session ownership, while reverting the client-side display to what
it's been historically.  We could argue separately about whether it's
ever really safe to report other-session queries, but in any case the
server log will receive a more complete report in general.  To implement
that we'll need some new feature in elog.c.  I propose a new auxiliary
function "errdetail_log()" that works exactly like errdetail(), except
that the string it produces is reported only to the server log, never
to the client.  If both errdetail() and errdetail_log() are used then
one string goes to the client and the other to the log.  The current
uses for this are in deadlock reporting and in reporting what objects
a user owns when DROP USER fails.

The main objection I can think of to this approach is that the two
strings will be somewhat redundant, and storing both independently could
result in an out-of-memory failure that otherwise wouldn't occur.  I'm
not too worried about that in the deadlock case, because it's unlikely
that you'd have a large number of processes involved in a deadlock,
so the duplicative storage of "Process N is blocked by process M" lines
really isn't going to amount to much.  There's a bit more space at stake
in the DROP USER case, but it's still pretty small (since the whole
point in the DROP USER case is to limit how much text goes to the
client) and in any case I don't foresee people doing DROP USER in
low-memory situations.  So, while we could arrange the definition in
some more complex way to avoid storing duplicate text, I don't really
think it's worth the trouble and potential loss of flexibility.

Barring objections, I'm going to go ahead and do this now so that we can
close the books on Itagaki-san's deadlock reporting patch.  I don't want
to risk forgetting that we have a security concern to deal with there.
        regards, tom lane


Re: Proposal: new ereport option "errdetail_log"

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> I'm not sure how many people noticed this -patches discussion:
> http://archives.postgresql.org/pgsql-patches/2008-03/msg00282.php
> so I'm reposting on -hackers.
>
> The conclusion I drew from that thread is that we should send the text
> of all queries involved in a deadlock to the server log, regardless
> of session ownership, while reverting the client-side display to what
> it's been historically.  

Modulo including a HINT suggesting looking at the log I concur.

I can't actually come up with an unassailable argument for it but I would
definitely be more comfortable. I'm picturing an environment like a web server
database where the database user doesn't actually represent the security fault
lines. Batch jobs, back-end administration, and web queries are quite likely
being run as the same user and security being handled by the application.

In such a configuration you would normally have the web front-end configured
not to show errors at all to avoid leaking even the front-end queries --
that's why I say I can't see an unassailable argument. But if you *don't*
configure it to do that the worst you would expect is for it to leak the web
front-end queries, not random administrative queries which happen to be
running at the same time.

I wonder how useful it is to output process ids at all. And for that matter
whether leaking process ids alone could be considered a security risk. Perhaps
the error message should just output enough detail to uniquely identify the
log message.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production
Tuning


Re: Proposal: new ereport option "errdetail_log"

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> I wonder how useful it is to output process ids at all. And for that matter
> whether leaking process ids alone could be considered a security risk.

Seems overly paranoid, especially considering we've output that
information after a deadlock for many years and no one's complained.
        regards, tom lane


Re: Proposal: new ereport option "errdetail_log"

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> I wonder how useful it is to output process ids at all. And for that matter
>> whether leaking process ids alone could be considered a security risk.
>
> Seems overly paranoid, especially considering we've output that
> information after a deadlock for many years and no one's complained.

Well I was coming at it from the other direction -- questioning whether it's
at all useful and if it's not whether there's any marginal downside even if
it's slight.

The axis on which I still see real room for improvement here is on the
description of the locks. It's awfully hard for a user to tell from the
deadlock message exactly what operation of the query was acquiring what lock
when it deadlocked.

I'm not sure how to improve that though. It's an inherent problem that
understanding deadlocks requires understanding a certain amount of internal
implementation details.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!


Re: Proposal: new ereport option "errdetail_log"

From
Alvaro Herrera
Date:
Gregory Stark wrote:

> The axis on which I still see real room for improvement here is on the
> description of the locks. It's awfully hard for a user to tell from the
> deadlock message exactly what operation of the query was acquiring what lock
> when it deadlocked.

Are the involved queries not enough?  Why?  What would you like to
have?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Proposal: new ereport option "errdetail_log"

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gregory Stark wrote:
>> The axis on which I still see real room for improvement here is on the
>> description of the locks. It's awfully hard for a user to tell from the
>> deadlock message exactly what operation of the query was acquiring what lock
>> when it deadlocked.

> Are the involved queries not enough?  Why?  What would you like to
> have?

Greg's certainly got a point.  Consider for example tuple-level locks
taken as a result of an FK check --- which one, and which rows are
involved?  Or the case where the logged query is just "SELECT
some_huge_user_defined_function()" and you have no idea what part of the
function is triggering it.  (The CONTEXT traceback will help here if the
backend running the function is the one that errors out, but not when
it's some other backend.)

I don't have any immediate ideas for improvement either, but we
certainly shouldn't consider this a totally solved problem.
        regards, tom lane


Re: Proposal: new ereport option "errdetail_log"

From
Decibel!
Date:
On Mar 24, 2008, at 6:21 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Gregory Stark wrote:
>>> The axis on which I still see real room for improvement here is  
>>> on the
>>> description of the locks. It's awfully hard for a user to tell  
>>> from the
>>> deadlock message exactly what operation of the query was  
>>> acquiring what lock
>>> when it deadlocked.
>
>> Are the involved queries not enough?  Why?  What would you like to
>> have?
>
> Greg's certainly got a point.  Consider for example tuple-level locks
> taken as a result of an FK check --- which one, and which rows are
> involved?  Or the case where the logged query is just "SELECT
> some_huge_user_defined_function()" and you have no idea what part  
> of the
> function is triggering it.  (The CONTEXT traceback will help here  
> if the
> backend running the function is the one that errors out, but not when
> it's some other backend.)
>
> I don't have any immediate ideas for improvement either, but we
> certainly shouldn't consider this a totally solved problem.

Something I always find myself wanting when debugging locking issues  
is what's in pg_locks. Could we save that information somewhere when  
we detect a deadlock? In a real table would be nice, but I'd settle  
for just dumping it to a file somewhere. Or maybe into the logs.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Re: Proposal: new ereport option "errdetail_log"

From
Bruce Momjian
Date:
Decibel! wrote:
> > I don't have any immediate ideas for improvement either, but we
> > certainly shouldn't consider this a totally solved problem.
> 
> Something I always find myself wanting when debugging locking issues  
> is what's in pg_locks. Could we save that information somewhere when  
> we detect a deadlock? In a real table would be nice, but I'd settle  
> for just dumping it to a file somewhere. Or maybe into the logs.

That option will be in 8.4, logging deadlock information.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +