Thread: query io stats and finding a slow query

From: "Kamen Stanev"
Hi everybody,

Is there a way to find which query is doing large I/O operations, and/or which is using cached data and which is reading from disk? I need to see this on a production server, to localize a slow and resource-eating query. The pg_statio* tables are very handy, but they don't help me at all in finding the most disk-intensive query, just the most used table, which I already knew.

Besides that, please share your experience: how do you decide which queries to optimize, and how do you reorganize your database? Are there any tools that you use to profile your database?

Regards,
Kamen Stanev

Re: query io stats and finding a slow query

From: "Kevin Grittner"
>>> On Thu, Sep 20, 2007 at  4:36 PM, in message
<de5063670709201436y5cbff0d0k70ade289a4c68199@mail.gmail.com>, "Kamen Stanev"
<hambai@gmail.com> wrote:
>
> Is there a way to find which query is doing large io operations and/or which
> is using cached data and which is reading from disk.

A big part of your cache is normally in the OS, which makes that tough.

> please share your experience on how do you decide which
> queries to optimize and how to reorganize your database?

We base this on two things -- query metrics from our application framework
and user complaints about performance.

> Is there any tools that you use to profile your database.

Many people set log_min_duration_statement to get a look at long-running
queries.

When you identify a problem query, running it with EXPLAIN ANALYZE in front
will show you the plan with estimated versus actual counts, costs, and time.
This does actually execute the query (unlike EXPLAIN without ANALYZE).
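For instance (a hypothetical table and query, just to sketch the shape of it):

```sql
-- postgresql.conf: log every statement that runs longer than 1000 ms
-- log_min_duration_statement = 1000    -- in milliseconds; -1 disables

-- For a query that then shows up in the log, prepend EXPLAIN ANALYZE:
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE customer_id = 42;
```

The rows= figures in the output show estimated versus actual row counts per plan node; a large mismatch between the two is often the place to start looking.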

-Kevin




Re: query io stats and finding a slow query

From: "Kamen Stanev"
Thanks for the reply.

Here is what I found about my problem. When I set log_min_duration_statement, I could see that in the moments when server performance degrades, almost all queries run very slowly (10-100 times slower than usual). At first I thought there was an exclusive lock on one of the tables, but there wasn't any.

The information from the log files becomes useless when almost every query on your server is logged and you can't tell which query ran after which. So I finally wrote a script that processes the log file and graphically represents the timing of each query (something like a Gantt chart), and that way I found the reason for the slowdowns. There was a query which actually reads all the data from one of the big tables, and while it is running, and for some time after it finishes, the server slows down to death. I couldn't find it just by looking at the log, because it wasn't even the slowest query; but after I examined the chart, it was very clear what was happening.

As I understand it, while this table was being scanned, all disk I/O operations were slowed down, and maybe the data from that table crowded everything else out of the OS cache, which is why all the other queries were so slow? After I removed the big query, everything runs normally.

However, I was wondering if there are any tools for this kind of log analysis. I'm ready to share my script if anybody is interested. I think it is very useful, but maybe someone has already done something better?
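A minimal sketch of that kind of log analysis in Python (assuming a log_line_prefix that starts each line with a timestamp, e.g. '%t ', and log_min_duration_statement enabled; the exact line format varies with your settings):

```python
import re
from datetime import datetime, timedelta

# Assumed line shape (depends on log_line_prefix):
# "2007-09-25 10:15:02 LOG:  duration: 1523.439 ms  statement: SELECT ..."
LINE_RE = re.compile(
    r"^(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*?"
    r"duration: (?P<ms>\d+\.\d+) ms\s+(?:statement|execute[^:]*): (?P<sql>.*)"
)

def parse_intervals(lines):
    """Yield (start, end, sql) tuples. The timestamp is written when the
    statement finishes, so start = end - duration."""
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        end = datetime.strptime(m.group("end"), "%Y-%m-%d %H:%M:%S")
        start = end - timedelta(milliseconds=float(m.group("ms")))
        yield start, end, m.group("sql").strip()

def ascii_gantt(intervals, width=60):
    """Render each query as a '#' bar on a shared time axis, so overlapping
    slow queries line up visually."""
    intervals = list(intervals)
    t0 = min(s for s, _, _ in intervals)
    t1 = max(e for _, e, _ in intervals)
    span = (t1 - t0).total_seconds() or 1.0
    rows = []
    for s, e, sql in intervals:
        a = int((s - t0).total_seconds() / span * (width - 1))
        b = int((e - t0).total_seconds() / span * (width - 1))
        bar = " " * a + "#" * max(1, b - a + 1)
        rows.append(f"{bar:<{width}} {sql[:30]}")
    return "\n".join(rows)
```

Since the logged timestamp marks when a statement finished, subtracting the reported duration recovers its start time, which is what makes the overlap between one big scan and the other slowed-down queries visible on the chart.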

Regards,
Kamen

On 9/21/07, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> [...]




Re: query io stats and finding a slow query

From: "Bryan Murphy"
We use pgfouine (http://pgfouine.projects.postgresql.org/).

I currently have Postgres log every query that takes longer than
100 ms, roll the log files every 24 hours, and run pgfouine nightly. I
check it every couple of mornings, and this gives me a pretty good
picture of who misbehaved over the last 24 hours. I can't even count
the number of times I've come in in the morning and found that some new
query has bubbled to the top.
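For reference, a postgresql.conf fragment approximating that setup might look roughly like this (parameter names as in the 8.x releases; check the documentation for your version):

```
log_min_duration_statement = 100    # log statements taking > 100 ms
redirect_stderr = on                # send the server log to files
log_filename = 'postgresql-%a.log'  # one file per weekday
log_rotation_age = 1440             # rotate every 24 hours (minutes)
log_truncate_on_rotation = on       # overwrite last week's file
```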

It's very handy.  I don't know if it would have helped you identify
your problem, but it's saved our butts a few times.

Bryan

On 9/25/07, Kamen Stanev <hambai@gmail.com> wrote:
> [...]