Re: CPU Intensive query - Mailing list pgsql-performance

From Steinar H. Gunderson
Subject Re: CPU Intensive query
Date
Msg-id 20070518223233.GB17690@uio.no
Whole thread Raw
In response to Re: CPU Intensive query  (Abu Mushayeed <abumushayeed@yahoo.com>)
Responses Re: CPU Intensive query  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
List pgsql-performance
On Fri, May 18, 2007 at 02:37:27PM -0700, Abu Mushayeed wrote:
>>> set enable_nestloop = off;
>> What's the rationale for this?
> To eliminate nested loop. It does a nested loop betwwen to very large
> table(millions of rows).

If the planner chooses a nested loop, it is because it believes it is the
most efficient solution. I'd turn it back on and try to figure out why the
planner was wrong. Note that a nested loop with an index scan on one or both
sides can easily be as efficient as anything.

Did you ANALYZE your tables recently? If the joins are really between
millions of rows and the planner thinks it's a couple thousands, the stats
sound rather off...

>>> HashAggregate (cost=152555.97..152567.32 rows=267 width=162)
>> 152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
>> probably misestimation involved at some point here. Does it really return 267
>> rows, or many more?
> It returns finally about 19-20 thousand rows.

So the planner is off by a factor of at least a hundred. That's a good
first-level explanation for why it's slow, at least...

If you can, please provide EXPLAIN ANALYZE output for your query (after
running ANALYZE on all your tables, if you haven't already); even though
it will take some time, it usually makes this kind of performance debugging
much easier.

/* Steinar */
--
Homepage: http://www.sesse.net/

pgsql-performance by date:

Previous
From: Abu Mushayeed
Date:
Subject: Re: CPU Intensive query
Next
From: "Steinar H. Gunderson"
Date:
Subject: Re: CPU Intensive query