Thread: Re: [PATCHES] Implemented current_query

Re: [PATCHES] Implemented current_query

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Mon, 2007-07-05 at 19:48 +0100, Tomas Doran wrote:
> > As suggested in the TODO list (and as I need the functionality  
> > myself), I have implemented the current_query interface to  
> > debug_query_string.

It actually has been removed from the TODO list since you saw it last.

> Comments:
> 
...
> * AFAIK debug_query_string() still does the wrong thing when the user
> submits multiple queries in a single protocol message (separated by
> semi-colons). Not sure there's a way to fix that that is both easy and
> efficient, though...

The problem with the last bullet is pretty serious.  It can be
illustrated with psql:
$ psql -c 'set log_statement="all";select 1;select 2;' test

Server log shows:
STATEMENT:  set log_statement=all;select 1;select 2;

Obviously this is what current_query() would return if we commit this
patch, and it probably isn't 100% accurate.  I see dblink exposes this:
   http://www.postgresql.org/docs/8.3/static/contrib-dblink-current-query.html
   Returns the currently executing interactive command string of the   local database session, or NULL if it can't be
determined. Note   that this function is not really related to <filename>dblink</>'s   other functionality.  It is
providedsince it is sometimes useful   in generating queries to be forwarded to remote databases.
 

but making it more widely available with a possible inaccurate result is
a problem.  We can't think of anyway to fix this cleanly --- it would
require a separate parser pass to split queries by semicolons (which
psql does by default in interactive mode).  Right now the parser does
the splitting as part of its normal single-parse operation and just
creates parse trees that don't have string representations.

Perhaps we could name it received_query() to indicate it is what the
backend received and it not necessarily the _current_ query.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [PATCHES] Implemented current_query

From
Tomas Doran
Date:
On 28 Mar 2008, at 17:23, Bruce Momjian wrote:
> Neil Conway wrote:
>> On Mon, 2007-07-05 at 19:48 +0100, Tomas Doran wrote:
>>> As suggested in the TODO list (and as I need the functionality
>>> myself), I have implemented the current_query interface to
>>> debug_query_string.
>
> It actually has been removed from the TODO list since you saw it last.

I submitted a patch to make it do that a while ago :)

>> Comments:
>>
> ...
>> * AFAIK debug_query_string() still does the wrong thing when the user
>> submits multiple queries in a single protocol message (separated by
>> semi-colons). Not sure there's a way to fix that that is both easy  
>> and
>> efficient, though...
>
> The problem with the last bullet is pretty serious.  It can be
> illustrated with psql:
>
>     $ psql -c 'set log_statement="all";select 1;select 2;' test
>
> Server log shows:
>
>     STATEMENT:  set log_statement=all;select 1;select 2;
>
> Obviously this is what current_query() would return if we commit this
> patch, and it probably isn't 100% accurate.

Yeah, this was pointed out to me at the time.

Fortunately, for what I wanted to do, 'Don't do that then' was a very  
viable answer..

> I see dblink exposes this:
>
>     http://www.postgresql.org/docs/8.3/static/contrib-dblink- 
> current-query.html
>
>     Returns the currently executing interactive command string of the
>     local database session, or NULL if it can't be determined.  Note
>     that this function is not really related to <filename>dblink</>'s
>     other functionality.  It is provided since it is sometimes useful
>     in generating queries to be forwarded to remote databases.

My patch provided this functionality in core, and made dblink's  
current procedure to do the same just delegate to the one that I  
provided (for backwards compatibility reasons)


> but making it more widely available with a possible inaccurate  
> result is
> a problem.  We can't think of anyway to fix this cleanly --- it would
> require a separate parser pass to split queries by semicolons (which
> psql does by default in interactive mode).  Right now the parser does
> the splitting as part of its normal single-parse operation and just
> creates parse trees that don't have string representations.
>
> Perhaps we could name it received_query() to indicate it is what the
> backend received and it not necessarily the _current_ query.

reveived_query() sounds like a very sane name for me, and documenting  
it as such would allow you to expose the functionality without the  
possible complaints...

In a lot of environments where you actually want this, then  
constraining to 1 query per statement (outside the DB level) is very  
doable... I wouldn't like to see the functionality skipped over as  
providing this only solves 80% of cases.


In the particular application that I wrote the patch for, we needed  
to audit 'all access to encrypted credit card numbers' for PCI  
requirements..

Our solution was to put all cc number containing tables into their  
own schema / with no general permissions, and to use SECURITY DEFINER  
stored procedures to access them (and log the access).. However that  
wasn't quite good enough, so we got our DB access layer to iterate up  
the call stack (till outside our SQL abstraction), and add a comment  
to every query such that it took the form:

/* CodeFile-LineNo-UserId */ SELECT stored_procedure(arg1, arg2);

for all queries - so the caller information was encoded in the query  
info... Therefore, inside 'stored_procedure', logging the value of  
current_query() was perfect to satisfy our audit requirements, and we  
can just log the current query when we enter 'stored_procedure'.

Hope this helps to clarify that, whilst the current mechanism isn't  
in any way perfect - there are a number of use cases for including  
the functionality as-is.

Cheers
Tom


Cheers
Tom



Re: [PATCHES] Implemented current_query

From
Alvaro Herrera
Date:
Tomas Doran wrote:

> On 28 Mar 2008, at 17:23, Bruce Momjian wrote:

>> Perhaps we could name it received_query() to indicate it is what the
>> backend received and it not necessarily the _current_ query.
>
> reveived_query() sounds like a very sane name for me, and documenting it 
> as such would allow you to expose the functionality without the possible 
> complaints...

client_query perhaps?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [PATCHES] Implemented current_query

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Tomas Doran wrote:
> 
> > On 28 Mar 2008, at 17:23, Bruce Momjian wrote:
> 
> >> Perhaps we could name it received_query() to indicate it is what the
> >> backend received and it not necessarily the _current_ query.
> >
> > reveived_query() sounds like a very sane name for me, and documenting it 
> > as such would allow you to expose the functionality without the possible 
> > complaints...
> 
> client_query perhaps?

Yea, that is consistent with what we do with other functions.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [PATCHES] Implemented current_query

From
Simon Riggs
Date:
On Fri, 2008-03-28 at 14:32 -0400, Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > Tomas Doran wrote:
> > 
> > > On 28 Mar 2008, at 17:23, Bruce Momjian wrote:
> > 
> > >> Perhaps we could name it received_query() to indicate it is what the
> > >> backend received and it not necessarily the _current_ query.
> > >
> > > reveived_query() sounds like a very sane name for me, and documenting it 
> > > as such would allow you to expose the functionality without the possible 
> > > complaints...
> > 
> > client_query perhaps?
> 
> Yea, that is consistent with what we do with other functions.

How about client_request()

It's then clear that a request can be made up of many statements, which
will be executed in turn.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com 
 PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk



Re: [PATCHES] Implemented current_query

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Fri, 2008-03-28 at 14:32 -0400, Bruce Momjian wrote:
> > Alvaro Herrera wrote:
> > > Tomas Doran wrote:
> > > 
> > > > On 28 Mar 2008, at 17:23, Bruce Momjian wrote:
> > > 
> > > >> Perhaps we could name it received_query() to indicate it is what the
> > > >> backend received and it not necessarily the _current_ query.
> > > >
> > > > reveived_query() sounds like a very sane name for me, and documenting it 
> > > > as such would allow you to expose the functionality without the possible 
> > > > complaints...
> > > 
> > > client_query perhaps?
> > 
> > Yea, that is consistent with what we do with other functions.
> 
> How about client_request()
> 
> It's then clear that a request can be made up of many statements, which
> will be executed in turn.

The problem with client_request() is that it is not clear it is a query
--- it could be a disonnection or cancel request, for example.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [PATCHES] Implemented current_query

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > Tomas Doran wrote:
> > 
> > > On 28 Mar 2008, at 17:23, Bruce Momjian wrote:
> > 
> > >> Perhaps we could name it received_query() to indicate it is what the
> > >> backend received and it not necessarily the _current_ query.
> > >
> > > reveived_query() sounds like a very sane name for me, and documenting it 
> > > as such would allow you to expose the functionality without the possible 
> > > complaints...
> > 
> > client_query perhaps?
> 
> Yea, that is consistent with what we do with other functions.

Uh, I think based on other usage it should be called client_statement().
Peter has us using statement instead of query in many cases.

FYI, log_statement also prints the combined query string.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [PATCHES] Implemented current_query

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Uh, I think based on other usage it should be called client_statement().

That is *exactly* the wrong thing, because "statement" specifically
means one SQL statement.

"client_query" seems about the best compromise I've heard so far.

It's too bad we didn't have this debate before pg_stat_activity got out
into the wild, because it's now too late to rename its column
current_query.  Possibly we should stick with current_query() just
for consistency with that view ...
        regards, tom lane