Re: log_duration and log_statement - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: log_duration and log_statement
Date
Msg-id 1142326222.11178.37.camel@localhost.localdomain
Whole thread Raw
In response to log_duration and log_statement  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Responses Re: log_duration and log_statement  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Re: log_duration and log_statement  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
On Mon, 2006-03-13 at 23:40 +0100, Guillaume Smet wrote:

> Here are some background information to explain our issue and request.
> We are currently planning a migration from PostgreSQL 7.4 to
> PostgreSQL 8.1. We work on a medium sized database (2GB) with a rather
> important activity (12 millions queries a day with peaks up to 1000
> queries/s).
> We are analyzing the logs with a tool we developed (namely pgFouine
> available on pgFoundry). 

Guillaume,

Thanks very much for writing pgFouine. We've been doing our best to
support generation of useful logs for performance analysis, so please
feel free to ask for anything you see a need for.

> We currently use the following configuration
> for logging:
> - log_min_duration_statement = 500 to log the slowest queries
> - log_duration to log every query duration and have a global overview
> of our database activity (used to generate this sort of graphs:
> http://people.openwide.fr/~gsmet/postgresql/graphs.html ).

Interesting results and good graphics too.

I note your graphs don't show missing values: on the bottom graph there
is no data for 7pm and 2am, yet the graph passes directly from 6 to 8pm
as if the figure for 7pm was mid-way between them, rather than zero.

> We cannot log every query as we already generate 1.2GB of logs a day
> while only logging the text of one hundredth of the queries so we log
> only the duration for the "not so slow" queries.

How do you tell the difference between a SELECT and a Write query when
you do not see the text of the query?

> I didn't notice the log_duration behaviour has changed starting from
> 8.0 (thanks to oicu for pointing me the 8.0 release notes on
> #postgresql) and what we did is not possible anymore with 8.x as
> log_duration now only logs the duration for queries logged with
> log_statement.
> 
> I think the former behaviour can be interesting in our case and
> probably for many other people out there who use log analysis tools as
> logging only slow queries is not enough to have an overview of the
> database activity.
> I was thinking about something like log_duration = 'none|logged|all'
> which will allow us to switch between:
> - none: we don't log the duration (=log_duration=off);
> - logged: we log the duration only for logged queries (depending on
> log_statement as for 8.0);
> - all: we log every duration as 7.4 did before when log_duration was on.
> 
> Any comment on this?

I think I need more persuasion before I see the value of this, but I'm
not going to immediately disregard this either.

Collecting information is interesting, but it must in some way lead to a
rational corrective action. Logging the duration of statements without
recording what they are would tell you there was a problem but make it
difficult to move towards an action rationally. Perhaps I'm
misunderstanding this.

Is the issue that the log volume is too high? We might be able to look
at ways to reduce/compress the log volume for duration/statement
logging.

Another thought might be to provide a random sample of queries. A 10%
sample would be sufficient for your case here and yet would provide a
possibility of deeper analysis also.

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Proposal for updatable views
Next
From: Simon Riggs
Date:
Subject: Re: [PATCHES] Automatic free space map filling