Thread: Re: Increasing the length of

Re: Increasing the length of

From

"David Parker"

Date:

30 November 2004, 18:57:02

I've been using "log_min_duration_statement = 0" to get durations on all
SQL statements for the purposes of performance tuning, because this logs
the duration on the same line as the statement. My reading of this TODO
is that now log_min_duration_statement = 0 would give me the statements
but no total duration?

- DAP

>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
>Sent: Tuesday, November 30, 2004 1:20 PM
>To: Simon Riggs
>Cc: Andrew Sullivan; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Increasing the length of
>
>
>Great idea.  Added to TODO:
>
>* Make log_min_duration_statement output when the duration is
>reached rather
>  than when the statement completes
>
>  This prints long queries while they are running, making
>trouble shooting
>  easier.  Also, it eliminates the need for log_statement because it
>  would now be the same as a log_min_duration_statement of zero.
>
>---------------------------------------------------------------
>------------
>
>Simon Riggs wrote:
>> On Wed, 2004-11-10 at 22:51, Andrew Sullivan wrote:
>> > On Wed, Nov 10, 2004 at 09:52:17PM +0000, Simon Riggs wrote:
>> > > On Wed, 2004-11-10 at 21:48, Richard Huxton wrote:
>> > > >
>> > > > Isn't that:
>> > > > log_min_duration_statement (integer)
>> > >
>> > > That gets written when a statement completes, not during
>execution.
>> >
>> > I've been following this thread, and I was thinking the
>same thing.
>> > I wonder how much work it'd be to have another log setting -- say
>> > log_statement_after_min_duration (integer) -- which did what Simon
>> > wants.  That'd more than satisfy my need, for sure.  Might
>the cost
>> > of that be too high, though?
>>
>> I think this is a great idea.
>>
>> ...Rather than invent a new GUC, I think this is the solution:
>>
>> log_min_duration_statement writes to log at end of execution, if
>> execution time is greater than that threshold. Let's move that piece
>> of code so it is executed as the query progresses. That way, you get
>> notified that a problem query is occurring NOW, rather than "it has
>> occurred".
>>
>> The code already has such a timer check, for statement_timeout, in
>> backend/storage/lmgr/proc.c. We could reuse this timer to go off at
>> log_min_duration_statement and then log query if still
>executing. (If
>> log_min_duration_statement >= statement_timeout, we would skip that
>> step.) We would then reset the timer so that it then goes
>off at where
>> it does now, at statement_timeout. So, same piece of code,
>used twice...
>>
>> That way you can set up 2 limits, with three bands of actions:
>>
>> Between 0 and log_min_duration_statement
>> - logs nothing
>>
>> Between log_min_duration_statement and statement_timeout
>> - statement written to log, though execution continues...
>>
>> At statement_timeout
>> - statement cancelled
>>
>> We'd just need a small piece of code to set timer correctly first,
>> then another piece to record state change and reset timer
>again. Lift
>> and drop the existing code from end-of-execution.
>>
>> This then:
>> - solves the *problem query* diagnosis issue, as originally
>raised by
>> Sean and seconded by myself and Greg
>> - makes the answer exactly what Tom proposed - look in the logs
>> - doesn't make any changes to the technical innards of UDP and
>> pgstats.c
>> - no administrative interface changes, just slightly changed
>behaviour
>> - existing users mostly wouldn't even notice we'd done it...
>>
>> Thoughts?
>>
>> Easy enough change to be included as a hot fix for 8.0: no
>new system
>> code, no new interface code, just same behaviour at different time.
>>
>>
>> --
>> Best Regards, Simon Riggs
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 9: the planner will ignore your desire to choose an
>index scan if your
>>       joining column's datatypes do not match
>>
>
>--
>  Bruce Momjian                        |  http://candle.pha.pa.us
>  pgman@candle.pha.pa.us               |  (610) 359-1001
>  +  If your life is a hard drive,     |  13 Roberts Road
>  +  Christ can be your backup.        |  Newtown Square,
>Pennsylvania 19073
>
>---------------------------(end of
>broadcast)---------------------------
>TIP 9: the planner will ignore your desire to choose an index
>scan if your
>      joining column's datatypes do not match
>

Re: Increasing the length of

From

Bruce Momjian

Date:

30 November 2004, 19:34:00

David Parker wrote:
> I've been using "log_min_duration_statement = 0" to get durations on all
> SQL statements for the purposes of performance tuning, because this logs
> the duration on the same line as the statement. My reading of this TODO
> is that now log_min_duration_statement = 0 would give me the statements
> but no total duration?

Oh, sorry, you are right.  I forgot about the duration part!  I got so
excited I forgot.

TODO item removed.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Increasing the length of

From

"Jim C. Nasby"

Date:

30 November 2004, 23:23:28

Could we come up with a compromise then? I guess maybe another setting
that says log any query when it hits more than x amount of time. (I'd
also argue you should get a NOTICE or WARNING when this exceeds the
query timeout time).

A perhapse more friendly alternative would be a way to query to get this
information in real-time, but that probably goes back into the
discussion about the length of data made available in pg_stat_activity.

On Tue, Nov 30, 2004 at 02:32:05PM -0500, Bruce Momjian wrote:
> David Parker wrote:
> > I've been using "log_min_duration_statement = 0" to get durations on all
> > SQL statements for the purposes of performance tuning, because this logs
> > the duration on the same line as the statement. My reading of this TODO
> > is that now log_min_duration_statement = 0 would give me the statements
> > but no total duration?
> 
> Oh, sorry, you are right.  I forgot about the duration part!  I got so
> excited I forgot.
> 
> TODO item removed.
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 359-1001
>   +  If your life is a hard drive,     |  13 Roberts Road
>   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 

-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Increasing the length of

From

Bruce Momjian

Date:

01 December 2004, 03:25:53

Jim C. Nasby wrote:
> Could we come up with a compromise then? I guess maybe another setting
> that says log any query when it hits more than x amount of time. (I'd
> also argue you should get a NOTICE or WARNING when this exceeds the
> query timeout time).
> 
> A perhapse more friendly alternative would be a way to query to get this
> information in real-time, but that probably goes back into the
> discussion about the length of data made available in pg_stat_activity.

Yes.  I don't see a huge win for adding another GUC variable.

---------------------------------------------------------------------------


> 
> On Tue, Nov 30, 2004 at 02:32:05PM -0500, Bruce Momjian wrote:
> > David Parker wrote:
> > > I've been using "log_min_duration_statement = 0" to get durations on all
> > > SQL statements for the purposes of performance tuning, because this logs
> > > the duration on the same line as the statement. My reading of this TODO
> > > is that now log_min_duration_statement = 0 would give me the statements
> > > but no total duration?
> > 
> > Oh, sorry, you are right.  I forgot about the duration part!  I got so
> > excited I forgot.
> > 
> > TODO item removed.
> > 
> > -- 
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 359-1001
> >   +  If your life is a hard drive,     |  13 Roberts Road
> >   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> > 
> >                http://archives.postgresql.org
> > 
> 
> -- 
> Jim C. Nasby, Database Consultant               decibel@decibel.org 
> Give your computer some brain candy! www.distributed.net Team #1828
> 
> Windows: "Where do you want to go today?"
> Linux: "Where do you want to go tomorrow?"
> FreeBSD: "Are you guys coming, or what?"
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Increasing the length of

From

Simon Riggs

Date:

01 December 2004, 08:42:28

On Tue, 2004-11-30 at 19:32, Bruce Momjian wrote:
> David Parker wrote:
> > I've been using "log_min_duration_statement = 0" to get durations on all
> > SQL statements for the purposes of performance tuning, because this logs
> > the duration on the same line as the statement. My reading of this TODO
> > is that now log_min_duration_statement = 0 would give me the statements
> > but no total duration?
> 
> Oh, sorry, you are right.  I forgot about the duration part!  I got so
> excited I forgot.
> 
> TODO item removed.

David's objection was noted, and why I hadn't coded it (yet).

There are currently two ways of getting statement and duration output,
which is confusing....

You can either
1. Individual statements
- log_statement = all
- log_duration = true
- log_line_prefix includes processid

which produces 2 log lines like
statement: xxxxxxxxx
duration: yyyyyyyyyy

2. log_min_duration
log_min_duration_statement=0
which produces 1 log line like
duration: yyyyyyy statement: xxxxxxxxxx

These two things do exactly the same thing, apart from the way the
output is presented to the user in the log line.

I'd like to change log_min_duration_statement as suggested, but this
side-effect behaviour of being a better log_statement than log_statement
kindof gets in the way. It makes me wonder why we have log_statement at
all.

We all want to do performance tracing. I'd also like to be able to
dynamically monitor what is actually happening *now* on the system.
There is no way right now to monitor for rogue queries, other than to
cancel anything that runs more than statement_timeout. Thats not good
either, even if it does keep the current behaviour.

My preference would be to do the following:
- add a script to contrib to process the log file
- always add processid to log_statement_prefix when both log_statement
and log_duration are specified, so you can always tie up the data

Anybody?

-- 
Best Regards, Simon Riggs

Re: Increasing the length of

From

Bruce Momjian

Date:

01 December 2004, 16:18:17

Simon Riggs wrote:
> On Tue, 2004-11-30 at 19:32, Bruce Momjian wrote:
> > David Parker wrote:
> > > I've been using "log_min_duration_statement = 0" to get durations on all
> > > SQL statements for the purposes of performance tuning, because this logs
> > > the duration on the same line as the statement. My reading of this TODO
> > > is that now log_min_duration_statement = 0 would give me the statements
> > > but no total duration?
> > 
> > Oh, sorry, you are right.  I forgot about the duration part!  I got so
> > excited I forgot.
> > 
> > TODO item removed.
> 
> David's objection was noted, and why I hadn't coded it (yet).
> 
> There are currently two ways of getting statement and duration output,
> which is confusing....
> 
> You can either
> 1. Individual statements
> - log_statement = all
> - log_duration = true
> - log_line_prefix includes processid
> 
> which produces 2 log lines like
> statement: xxxxxxxxx
> duration: yyyyyyyyyy
> 
> 2. log_min_duration
> log_min_duration_statement=0
> which produces 1 log line like
> duration: yyyyyyy statement: xxxxxxxxxx
> 
> These two things do exactly the same thing, apart from the way the
> output is presented to the user in the log line.
> 
> I'd like to change log_min_duration_statement as suggested, but this
> side-effect behaviour of being a better log_statement than log_statement
> kindof gets in the way. It makes me wonder why we have log_statement at
> all.

We have it so you can look in the log to see currently running queries
rather than just completed queries.

> We all want to do performance tracing. I'd also like to be able to
> dynamically monitor what is actually happening *now* on the system.
> There is no way right now to monitor for rogue queries, other than to
> cancel anything that runs more than statement_timeout. Thats not good
> either, even if it does keep the current behaviour.
> 
> My preference would be to do the following:
> - add a script to contrib to process the log file
> - always add processid to log_statement_prefix when both log_statement
> and log_duration are specified, so you can always tie up the data

I think our setup is confusing enough.  Adding "automatic" additions
seems even more confusing than what we have now.  We could throw a log
warning message somehow though.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Increasing the length of

From

Robert Treat

Date:

01 December 2004, 18:20:54

On Wednesday 01 December 2004 03:38, Simon Riggs wrote:
> On Tue, 2004-11-30 at 19:32, Bruce Momjian wrote:
> > David Parker wrote:
> > > I've been using "log_min_duration_statement = 0" to get durations on
> > > all SQL statements for the purposes of performance tuning, because this
> > > logs the duration on the same line as the statement. My reading of this
> > > TODO is that now log_min_duration_statement = 0 would give me the
> > > statements but no total duration?
> >
> > Oh, sorry, you are right.  I forgot about the duration part!  I got so
> > excited I forgot.
> >
> > TODO item removed.
>
> David's objection was noted, and why I hadn't coded it (yet).
>
> There are currently two ways of getting statement and duration output,
> which is confusing....
>
> You can either
> 1. Individual statements
> - log_statement = all
> - log_duration = true
> - log_line_prefix includes processid
>
> which produces 2 log lines like
> statement: xxxxxxxxx
> duration: yyyyyyyyyy
>
> 2. log_min_duration
> log_min_duration_statement=0
> which produces 1 log line like
> duration: yyyyyyy statement: xxxxxxxxxx
>
> These two things do exactly the same thing, apart from the way the
> output is presented to the user in the log line.
>
> I'd like to change log_min_duration_statement as suggested, but this
> side-effect behaviour of being a better log_statement than log_statement
> kindof gets in the way. It makes me wonder why we have log_statement at
> all.
>

it's what we started with and has some use cases that 
log_min_duration_statement doesnt.   Besides, if you change 
log_min_duration_statement to print at the time duration is exceeded, you'll 
need something to enable printing out of durations of completed statements. 

> We all want to do performance tracing. I'd also like to be able to
> dynamically monitor what is actually happening *now* on the system.
> There is no way right now to monitor for rogue queries, other than to
> cancel anything that runs more than statement_timeout. Thats not good
> either, even if it does keep the current behaviour.
>

pg_stat_activity shows query_start, you could poll that to look for rouge 
queries.  you can also use log_min_duration_statement to watch for rouge 
queries. 

> My preference would be to do the following:
> - add a script to contrib to process the log file

check out the pqa project on pgfoundry

> - always add processid to log_statement_prefix when both log_statement
> and log_duration are specified, so you can always tie up the data
>

we have the option to add the process id now, I don't see why we should force 
it into the line.  besides, some logging tools will do this for you, so it 
would be duplicate data. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL