Re: Pointers needed on optimizing slow SQL statements - Mailing list pgsql-performance

From Robert Haas
Subject Re: Pointers needed on optimizing slow SQL statements
Date
Msg-id 603c8f070906031821r5df29d0cmf256d44627b59fa1@mail.gmail.com
Whole thread Raw
In response to Re: Pointers needed on optimizing slow SQL statements  (Janine Sisk <janine@furfly.net>)
Responses Re: Pointers needed on optimizing slow SQL statements
List pgsql-performance
On Wed, Jun 3, 2009 at 6:04 PM, Janine Sisk <janine@furfly.net> wrote:
> Ok, I will look into gathering better statistics.  This is the first time
> I've had a significant problem with a PG database, so this is uncharted
> territory for me.
>
> If there is more info I could give that would help, please be more specific
> about what you need and I will attempt to do so.
>
> Thanks!
>
> janine

You might find it helpful to try to inline the
content_item__get_latest_revision function call.  I'm not sure whether
that's a SQL function or what, but the planner isn't always real
clever about things like that.  If you can redesign things so that all
the logic is in the actual query, you may get better results.

But, we're not always real clever about selectivity.  Sometimes you
have to fake the planner out, as discussed here.

http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php

Actually, I had to do this today on a production application.  In my
case, the planner thought that a big OR clause was not very selective,
so it figured it wouldn't have to scan very far through the outer side
before it found enough rows to satisfy the LIMIT clause.  Therefore it
materialized the inner side instead of hashing it, and when the
selectivity estimate turned out to be wrong, it took 220 seconds to
execute.  I added a fake join condition of the form a || b = a || b,
where a and b were on different sides of the join, and now it hashes
the inner side and takes < 100 ms.

Fortunately, these kinds of problems are fairly rare, but they can be
extremely frustrating to debug.  With any kind of query debugging, the
first question to ask yourself is "Are any of my selectivity estimates
way off?".  If the answer to that question is no, you should then ask
"Where is all the time going in this plan?".  If the answer to the
first question is yes, though, your time is usually better spent
fixing that problem, because once you do, the plan will most likely
change to something a lot better.

...Robert

pgsql-performance by date:

Previous
From: Greg Smith
Date:
Subject: Re: Best way to load test a postgresql server
Next
From: Brian Herlihy
Date:
Subject: Query plan issues - volatile tables