Re: Patch proposal for log_duration - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Patch proposal for log_duration
Date
Msg-id 200604111642.k3BGgNq29904@candle.pha.pa.us
Whole thread Raw
In response to Re: Patch proposal for log_duration  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Responses Re: Patch proposal for log_duration  ("Guillaume Smet" <guillaume.smet@gmail.com>)
List pgsql-patches
The bottom line is that while I can see reasons for making log_duration
more than boolean, adding more complexity to the logging system for this
corner case probably isn't a good idea.  If we get more such requests,
we can reconsider it, but at this stage, I am thinking your using a
home-grown patch is your best approach.

---------------------------------------------------------------------------

Guillaume Smet wrote:
> Tom,
>
> On 3/30/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I really find it pretty bizarre to want to log a duration without
> > logging the statement that caused it.  Seems like the
> > log_min_duration_statement parameter ought to be enough.  These logging
> > options are already messy and unintuitive, and adding still more odd
> > frammishes doesn't help that ...
>
> It's not so odd as it was the default behaviour for 7.4. I agree with
> you having just the duration is not the best solution but we cannot
> log every query for performances reason.
>
> The point is we are hosting several databases for our customers and
> for these databases, we don't know the application. I'm going to take
> a real example we have here to explain why the behaviour of
> log_duration in 7.4 was perfect for us.
>
> This database is a 2 GB database receiving 13 millions queries a day.
> We usually have the same old big queries which cannot be really
> optimized in the log due to log_min_duration_statement so this setting
> doesn't give us any clue on how our server performs because these
> queries are always slow.
> Here are the cases when a global overview with all duration was useful:
> - our customer tells us their site is slower than usual. We check the
> database log reports and we can see if the database is slower (average
> duration higher) or not. If this is the case, perhaps there is a bug
> or a new feature in their application performing a lot more queries
> than before: we can tell them too as we have all the queries logged by
> their duration.
> - we have a maintenance to plan which requires the database to be down
> (namely the upgrade from 7.4 to 8.1 we did a few weeks ago) and we
> need to choose a time when the database activity is not so high as we
> will move the db to a slower server. In our case, it's not directly
> correlated to the web statistics due to an advanced cache system.
> Having every query logged, we know when the database activity is
> lower.
> - we see on the graphs the database is significantly slower after
> let's say 5pm. This can be due to a the need of a vacuum on a specific
> table because of a cronjob running. That's true that log_duration
> won't be enough but we will use log_statement='mod' during this period
> the following day to see if our hypothesis is true.
>
> So our point is:
> - log_min_duration_statement is not enough as it does not give us a
> global overview of the database activity.
> - log_statement='all' generates far too I/O for us and we cannot enable it.
> - log_duration=on in 7.4 was a good compromise allowing us to detect
> the problems on the database server while generating a not so big log
> file (1.2 GB a day) and nearly no overhead.
>
> That's why we'd like to see this behaviour back again in PostgreSQL.
>
> > Changing code without changing the associated comments isn't a good way
> > to get your ideas accepted, either, eg here: [...]
> > The patch makes that comment a lie.  This is not acceptable coding practice.
>
> Sorry for that. The attached patch is fixed.
>
> I'm not sure if we should remove the log_duration log if
> log_min_duration_statement logs the query and duration for the current
> query. PostgreSQL 7.4 logged a line for the duration due to
> log_duration=on and a line for statement + duration due to
> log_min_duration_statement. I have kept this behaviour to be
> consistent with 7.4 but it's perhaps better to not log the query twice
> (the tool we use count the query only once anyway). Any feedback on
> this is welcome. I'll update the patch if needed.
>
> Thanks for your attention and comments.
>
> Regards,
>
> --
> Guillaume

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Patch for bug #2073 (Can't drop sequence when created via SERIAL column)
Next
From: "Guillaume Smet"
Date:
Subject: Re: Patch proposal for log_duration