Re: Query performance over a large proportion of data - Mailing list pgsql-performance

From Steve McLellan
Subject Re: Query performance over a large proportion of data
Date
Msg-id cfca83d70903102015m2c217852l2137ed78dd04f3e8@mail.gmail.com
In response to Query performance over a large proportion of data  ("Steve McLellan" <smclellan@mintel.com>)
Responses Re: Query performance over a large proportion of data  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance


Scott Marlowe <scott.marlowe@gmail.com> wrote on 03/10/2009 05:19 PM:

>
>  Nested Loop  (cost=466.34..192962.24 rows=15329 width=12) (actual
> time=13653.238..31332.113 rows=131466 loops=1)


> Both your query plans end with this nested loop join, which is taking
> up about half of your query's time.  Notice the estimate of the
> result set is off by a factor of about 10 here, which means a nested
> loop might not be such a good choice for this.  Try increasing
> default_statistics_target and re-analyzing to see if that helps.
> 1000 is the max; give that a shot right off to see if it helps.  If
> it does, drop it back down until the numbers start to go off again,
> and stop there.
>
> For a quicker test, you can set enable_nestloop = off at the psql
> command line and then run the query by hand to see if that helps.

Thanks - the nested loop is indeed causing problems; reducing seq_page_cost had the same effect of removing the nested loop for this query. We'd noticed the poor row-count estimation. Increasing the statistics target doesn't seem to have much effect, but we'll have more of a go with it.
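
For anyone following along, a minimal sketch of the session-level tests discussed above (1000 is the maximum mentioned in the thread; the table name and the seq_page_cost value below are hypothetical):

-- Quick test: disable nested loop joins for this session only,
-- then re-run the slow query by hand under EXPLAIN ANALYZE.
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT ...;   -- the original query

-- Raise the statistics target and re-analyze so the planner's
-- row estimates improve; 1000 is the maximum suggested above.
SET default_statistics_target = 1000;
ANALYZE some_table;           -- hypothetical table name

-- Alternative tried here: lowering seq_page_cost makes plans built on
-- sequential scans (hash/merge joins) look cheaper, which also removed
-- the nested loop for this query. 0.5 is a hypothetical value.
SET seq_page_cost = 0.5;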

