Thread: Re: Auto-explain patch
OK, this one should work against CVS HEAD.
Dean.
> Subject: Re: [HACKERS] Auto-explain patch
> From: simon@2ndquadrant.com
> To: dean_rasheed@hotmail.com
> CC: pgsql-hackers@postgresql.org
> Date: Sun, 6 Jul 2008 16:42:55 +0100
>
>
> On Thu, 2008-07-03 at 16:58 +0000, Dean Rasheed wrote:
> > Here is an updated version of the patch
>
> Dean,
>
> I'm getting 4 chunks fail when trying to apply your patch onto CVS HEAD.
>
> I'm sure it's just a little bitrot. Can you update the patch please?
>
> Thanks,
>
> --
> Simon Riggs www.2ndQuadrant.com
> PostgreSQL Training, Services and Support
>
On Sun, 2008-07-06 at 17:58 +0000, Dean Rasheed wrote:

> OK, this one should work against CVS HEAD.

OK, still getting some offsets, but it applies.

The patch now outputs things in a useful way to avoid a flood of information, that's good.

The code seems fine, but it doesn't link up with the other logging code in postgres.c. That split also seems to drive the way the parameters work, so you have another set of parameters mirroring the log_min_duration_statement infrastructure. Fairly yuk, but I understand why you've done this. It does, however, raise some serious questions about the way we currently do logging.

The patch as it stands would produce EXPLAIN output for queries that will never be logged, which isn't very useful. So Dean, right now I feel this version of the patch must be rejected while we discuss how things might work differently. Not an issue with the patch so much as issues with the code being patched.

So, thinking about the issues, we can either ignore them or tackle them:

Ignore-the-problems option: rewrite this patch as an executor plugin, using Itagaki Takahiro's new infrastructure. It's an option, but I guess Dave will have something to say about that...

Fix-the-problems option: having logging in two separate places is ugly and leads to the problem of logging EXPLAINs and SQL completely separately, which isn't very useful.

If we put all the logging code in the executor, we miss out on any parsing and planning time, but we can log statements and explains for SPI queries as well.

We could move the logging of the EXPLAIN to postgres.c, but then we'd need to make QueryDescs live a little longer than desirable, or else trigger the EXPLAIN code using some guess at what the planning overhead was, a horrible kluge.

Neither way is clean. So how do we do it?

My proposal would be to introduce some new logging parameters that work similarly to, yet as an alternative to, log_min_duration_statement:

* log_min_planner_duration - logs any statement whose planning time exceeds the value in ms (-1 is off).
  e.g. LOG planner time: %s ms statement: %s

* log_min_executor_duration - logs any execution whose execution time exceeds the value in ms (-1 is off). We need to pass the query text through to the executor, depending upon whether it's a portal's source text or an incoming simple query string. (This is similar, but not quite the same as the current patch's debug_explain_min_duration.)
  e.g. LOG executor time: %s ms statement: %s

* log_explain = on | off - logs EXPLAIN if either log_min_executor_duration or log_min_planner_duration triggers. We output the explain on the same line as the SQL statement, so we don't need to play games to link up the output later. (This is similar to, but not quite the same as the current patch's debug_explain_plan.)
  e.g. LOG executor time: %s ms statement: %s

If both log_min_executor_duration and log_min_planner_duration trigger, then we get only one log line, like this:
  e.g. LOG planner time: %s ms executor time: %s ms statement: %s

We should call these parameters log_xxx, not debug_xxx, because we want them for production logging.

If either of these new parameters is > -1 then log_min_duration_statement does not trigger. (Or we simply make log_min_duration_statement control the setting of the other two parameters.)

The log messages shown above allow us to differentiate between the reasons the SQL statement has been logged, as well as providing the details we want.

If we do this then:

* we can log EXPLAINs for any execution, plus we will always have the SQL for any EXPLAIN

* we can log statements that take a long time to plan, yet a short time to execute. That can be an overhead in some cases. It also neatly sidelines the issue of where logging takes place. (We need to be able to track planner time if we think we might want to introduce options to control planner effort, a topic discussed on another recent thread, not here.)

* we can log SQL for any execution, even those via SPI. That means we'll be able to log the SQL used in functions.

We'll probably need an option to log SQL from SPI or not. Things like RI trigger code probably don't need to be logged, especially if the top-level statement is, like ALTER TABLE.

This alters current performance logging, so I expect discussion.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
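To make the proposal concrete, a session under this scheme might be configured as below. Note that all three parameter names are Simon's proposal only; none of these GUCs exists in CVS HEAD:

```sql
-- Hypothetical parameters from the proposal above (not existing GUCs):
SET log_min_planner_duration = 20;    -- log statements taking > 20 ms to plan
SET log_min_executor_duration = 100;  -- log executions taking > 100 ms
SET log_explain = on;                 -- attach EXPLAIN output to whichever line triggers
```

A statement slow in both phases would then produce a single combined "planner time / executor time / statement" log line rather than two separate entries.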
Simon, I like your proposal, and I think I can see how to code it fairly easily.

There is one thing that it doesn't allow, however, which the debug_xxx parameters do, and that is for a non-superuser to trace SQL used in functions from an interactive client session. For me, this is quite a big thing, because I find it most convenient to turn these parameters on while writing and tweaking stored procedures, and have the output go straight to my psql window, without having to connect as a superuser and trawl through log files.

So I suggest grouping these parameters in their own category (e.g. "sql_trace") and then having additional parameters to control where the output would go. So the sql_trace parameters would be:

* sql_trace_min_planner_duration
* sql_trace_min_executor_duration
* sql_trace_explain_plan

and they would work exactly as you describe, except they would be settable by normal users. Then the destination(s) for the statement and EXPLAIN logging would be controlled by:

* client_sql_trace = on | off - settable by a normal user to allow a client session to see the sql_trace output. If this parameter is on, the sql_trace will be logged as NOTICE output.

* log_sql_trace = on | off - settable only by superusers. If this parameter is on, the sql_trace will be logged as LOG output, and normal client sessions will see nothing (unless they set client_sql_trace).

If both client_sql_trace and log_sql_trace are on, then the output goes to both locations, and if both are off, then the sql_trace parameters would do nothing.

Dean

> Subject: RE: [HACKERS] Auto-explain patch
> From: simon@2ndquadrant.com
> To: dean_rasheed@hotmail.com
> CC: pgsql-hackers@postgresql.org
> Date: Mon, 7 Jul 2008 18:03:15 +0100
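Dean's grouping could be exercised from a plain (non-superuser) psql session along these lines. Every parameter shown is part of his proposal, not an existing GUC:

```sql
-- Hypothetical sql_trace parameters (Dean's proposal, not existing GUCs):
SET sql_trace_min_executor_duration = 50;  -- trace executions over 50 ms
SET sql_trace_explain_plan = on;           -- include EXPLAIN output in the trace
SET client_sql_trace = on;                 -- deliver trace output to this session as NOTICEs

SELECT my_function(42);  -- NOTICEs for slow statements inside the function
                         -- would appear directly in the psql window
```

The key design point is the split between what is traced (the sql_trace_* settings, user-settable) and where it goes (client_sql_trace vs. the superuser-only log_sql_trace).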
Dean Rasheed <dean_rasheed@hotmail.com> wrote:

> * client_sql_trace = on | off - settable by a normal user to allow a
> client session to see the sql_trace output. If this parameter is on,
> the sql_trace will be logged as NOTICE output.

In terms of security, is it OK to show normal users the SQL used in functions that are owned by other users? Users can call functions they do not own only if they have EXECUTE privilege on them -- though presently we can see function bodies from pg_proc.prosrc freely.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
On Wed, 2008-07-09 at 09:11 +0000, Dean Rasheed wrote:

> Simon, I like your proposal, and I think I can see how to code it
> fairly easily.
>
> There is one thing that it doesn't allow, however, which the debug_xxx
> parameters do, and that is for a non-superuser to trace SQL used in
> functions, from an interactive client session. For me, this is quite a
> big thing, because I find it most convenient to turn these parameters
> on while writing and tweaking stored procedures, and have the output
> go straight to my psql window, without having to connect as a superuser
> and trawl through log files.

Understood.

> So I suggest grouping these parameters in their own category
> (e.g. "sql_trace") and then having additional parameters to control
> where the output would go. So the sql_trace parameters would be:
>
> * sql_trace_min_planner_duration
> * sql_trace_min_executor_duration
> * sql_trace_explain_plan
>
> and they would work exactly as you describe, except they would be
> settable by normal users.

This is already possible, if you're crafty.

> Then the destination(s) for the statement
> and EXPLAIN logging would be controlled by:
>
> * client_sql_trace = on | off - settable by a normal user to allow a
> client session to see the sql_trace output. If this parameter is on,
> the sql_trace will be logged as NOTICE output.

Just set client_min_messages = 'LOG';

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
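Simon's point relies on an existing facility: client_min_messages is a real, user-settable GUC, and lowering it to 'log' makes LOG-level messages appear in the client session as well as the server log. For example, combined with the existing (superuser-only) log_min_duration_statement:

```sql
-- client_min_messages is a real GUC, settable by any user:
SET client_min_messages = 'log';

-- log_min_duration_statement is also real, but superuser-only,
-- which is the gap Dean's proposal is trying to close:
SET log_min_duration_statement = 0;

SELECT count(*) FROM pg_class;
-- The session now also receives the LOG-level duration message
-- that would otherwise only go to the server log.
```

So the delivery mechanism already exists; what a normal user cannot do today is enable the duration/EXPLAIN parameters in the first place.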
On 7/9/08, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:

> Dean Rasheed <dean_rasheed@hotmail.com> wrote:
> > * client_sql_trace = on | off - settable by a normal user to allow a
> > client session to see the sql_trace output. If this parameter is on,
> > the sql_trace will be logged as NOTICE output.
>
> In terms of security, is it ok to show normal users SQLs used in functions
> that are owned by other users? Users can call not-owned functions only if
> they have EXECUTE privilege on them. -- presently we can see function
> bodies from pg_proc.prosrc freely, though.

A different owner is not a problem, but SECURITY DEFINER may be.

That can be solved by turning the setting off on entry to such a function when it is called by a non-superuser, like we handle search_path.

-- marko
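The search_path precedent Marko alludes to is the per-function SET clause, which is real, existing syntax (since 8.3): a GUC override attached to the function is forced for the duration of the call and restored on exit. A hypothetical trace setting could plausibly be pinned off the same way:

```sql
-- Real syntax: per-function GUC override on a SECURITY DEFINER function.
CREATE FUNCTION secret_lookup(id int) RETURNS text AS $$
    SELECT 'classified'::text;
$$ LANGUAGE sql
   SECURITY DEFINER
   SET search_path = pg_catalog, public;  -- the search_path pattern Marko cites

-- By analogy (hypothetical; client_sql_trace does not exist), tracing
-- could be suppressed inside the function body the same way:
--   ... SECURITY DEFINER SET client_sql_trace = off;
```

That would prevent a caller's trace settings from exposing the statements, and the EXECUTE'd values, run inside a function whose internals they should not see.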
> Just set client_min_messages = 'LOG';

True, but you would still need to be a superuser to set the min_duration and explain parameters. The other advantage of client_sql_trace is that you could debug your own functions without filling up the log file. It would work better for multiple users sharing the same DB.

Dean.
Of course, you can still sort of see the SQL used in functions declared SECURITY DEFINER, using debug_print_parse, but this would be opening up a much more transparent way to do it.

Do I need to worry about this?

Dean

----------------------------------------
> Date: Wed, 9 Jul 2008 14:22:45 +0300
> From: markokr@gmail.com
> To: itagaki.takahiro@oss.ntt.co.jp
> Subject: Re: [HACKERS] Auto-explain patch
> CC: dean_rasheed@hotmail.com; simon@2ndquadrant.com; pgsql-hackers@postgresql.org
On 7/9/08, Dean Rasheed <dean_rasheed@hotmail.com> wrote:

> Of course you can still sort of see the SQL used in functions declared
> SECURITY DEFINER, using debug_print_parse, but this would be opening
> up a much more transparent way to do it.
>
> Do I need to worry about this?

If this allows one to see values, then yes. Otherwise no.

-- marko
"Marko Kreen" <markokr@gmail.com> writes:

> On 7/9/08, Dean Rasheed <dean_rasheed@hotmail.com> wrote:
> > Do I need to worry about this?
>
> If this allows one to see values, then yes. Otherwise no.

It definitely would open a hole that doesn't exist now, which is to see values that are inserted into an EXECUTE'd query.

regards, tom lane
On Wed, 2008-07-09 at 12:01 +0000, Dean Rasheed wrote:

> > Just set client_min_messages = 'LOG';
>
> True, but you would still need to be a superuser to set the
> min_duration and explain parameters.

No.

> The other advantage of client_sql_trace is that you could
> debug your own functions without filling up the log file. It would
> work better for multiple users sharing the same DB.

Not sure about that. We have debuggers and dbms_put_line already.

I think it would be better to introduce a new logging category than to do something specifically for this situation. I suggest a new level, TRACE. This allows messages which are not typically logged by the server, yet are specifically requested by the client. The admin has the option to log the messages if desired.

Server log sequence would be: DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, INFO, NOTICE, WARNING, ERROR, *TRACE*, LOG, FATAL, and PANIC. Rationale for placement: the default is LOG, so TRACE messages are not logged by default. It is sometimes set to ERROR; putting TRACE at this position allows a full system trace to be taken if required, then reset when the trace is complete.

Client log sequence would be: DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, LOG, NOTICE, *TRACE*, WARNING, ERROR, FATAL, and PANIC. Rationale for placement: many NOTICE messages are generated by various commands, so people can still turn those off and still get TRACE messages.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Wed, 2008-07-09 at 09:11 +0000, Dean Rasheed wrote:

> So I suggest grouping these parameters in their own category
> (e.g. "sql_trace") and then having additional parameters to control
> where the output would go. So the sql_trace parameters would be:
>
> * sql_trace_min_planner_duration
> * sql_trace_min_executor_duration
> * sql_trace_explain_plan
>
> and they would work exactly as you describe, except they would be
> settable by normal users. Then the destination(s) for the statement
> and EXPLAIN logging would be controlled by:
>
> * client_sql_trace = on | off - settable by a normal user to allow a
> client session to see the sql_trace output. If this parameter is on,
> the sql_trace will be logged as NOTICE output.

After sleeping on this, I think we should follow your idea.

If it's possible to do the sql_trace_* parameters as a single one, I would prefer it, since it makes it more practical to use dynamically. Not sure how... maybe with a wrapper function?

sql_trace(integer) sets just sql_trace_min_executor_duration
sql_trace(integer, boolean) sets executor and explain
sql_trace(integer, integer, boolean) sets all 3

I think you probably need to drop the sql_ off the front, because of parameter length only. No need for the other log_... parameter, though.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
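The overloaded wrapper Simon sketches could be written today against set_config(), which is a real built-in for changing GUCs from SQL. This is only a sketch: the three sql_trace_* parameters are the proposal's hypothetical GUCs, so these functions would fail until such userset parameters exist.

```sql
-- Sketch under the assumption that the proposed sql_trace_* GUCs exist.
-- set_config(name, value, is_local) is a real PostgreSQL built-in;
-- is_local = false makes the setting last for the whole session.

CREATE FUNCTION sql_trace(exec_ms integer) RETURNS void AS $$
BEGIN
    PERFORM set_config('sql_trace_min_executor_duration', exec_ms::text, false);
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION sql_trace(exec_ms integer, explain boolean) RETURNS void AS $$
BEGIN
    PERFORM sql_trace(exec_ms);
    PERFORM set_config('sql_trace_explain_plan', explain::text, false);
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION sql_trace(plan_ms integer, exec_ms integer, explain boolean)
RETURNS void AS $$
BEGIN
    PERFORM set_config('sql_trace_min_planner_duration', plan_ms::text, false);
    PERFORM sql_trace(exec_ms, explain);
END;
$$ LANGUAGE plpgsql;

-- Usage, per Simon's three forms:
--   SELECT sql_trace(100);             -- executor threshold only
--   SELECT sql_trace(100, true);       -- executor threshold plus EXPLAIN
--   SELECT sql_trace(20, 100, true);   -- all three settings
```

Since the three signatures differ in arity and types, PL/pgSQL overloading resolves them cleanly, which is what makes the single-name interface practical for interactive use.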
"Simon Riggs" <simon@2ndquadrant.com> writes:

> On Wed, 2008-07-09 at 09:11 +0000, Dean Rasheed wrote:
> >> So I suggest grouping these parameters in their own category
> >> (eg. "sql_trace") and then having additional parameters to control
> >> where the output would go. So the sql_trace parameters would be:
> >>
> >> * sql_trace_min_planner_duration
> >> * sql_trace_min_executor_duration
> >> * sql_trace_explain_plan
>
> If its possible to do the sql_trace_* parameters as a single one, I
> would prefer it, since it makes it more practical to use dynamically.
> Not sure how...maybe with a wrapper function?
>
> sql_trace(integer) sets just sql_trace_min_executor_duration
> sql_trace(integer, boolean) sets executor and explain
> sql_trace(integer, integer, boolean) sets all 3

FWIW it seems to me "trace_sql_*" would be nicer, much as we have track_* GUC parameters. Though I also wonder if there's really any distinction here between "tracing" and "logging", like log_explain_plan and so on. Perhaps we should keep the word "trace" for a more in-detail facility.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!