Re: Query performance over a large proportion of data - Mailing list pgsql-performance

From Steve McLellan
Subject Re: Query performance over a large proportion of data
Date
Msg-id cfca83d70903102015m2c217852l2137ed78dd04f3e8@mail.gmail.com
In response to Query performance over a large proportion of data  ("Steve McLellan" <smclellan@mintel.com>)
Responses Re: Query performance over a large proportion of data  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance


Scott Marlowe <scott.marlowe@gmail.com> wrote on 03/10/2009 05:19 PM:

>
>  Nested Loop  (cost=466.34..192962.24 rows=15329 width=12) (actual
> time=13653.238..31332.113 rows=131466 loops=1)


> Both your query plans end with this nested loop join, which is taking
> up about half of your query's time.  Notice the estimate of the
> result set is off by a factor of about 10 here, which means a nested
> loop might not be such a good choice for this.  Try increasing
> default_statistics_target and re-analyzing to see if that helps.
> 1000 is the max; give that a shot right off to see if it helps.  If
> it does, drop it back down until the numbers start to go off again,
> and stop there.
>
> For a quicker test, you can set enable_nestloop = off at the psql
> command line and then run the query by hand to see if that helps.

Thanks - the nested loop is indeed causing problems; reducing seq_page_cost had the same effect of removing the nested loop for this query. We'd noticed the poor row-count estimation. Increasing the statistics target doesn't seem to have much effect, but we'll have more of a go with it.
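
For anyone following along, a minimal sketch of the session-level tests discussed above (1000 is the maximum mentioned in the thread; the table name and the seq_page_cost value below are hypothetical):

-- Quick test: disable nested loop joins for this session only,
-- then re-run the slow query by hand under EXPLAIN ANALYZE.
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT ...;   -- the original query

-- Raise the statistics target and re-analyze so the planner's
-- row estimates improve; 1000 is the maximum suggested above.
SET default_statistics_target = 1000;
ANALYZE some_table;           -- hypothetical table name

-- Alternative tried here: lowering seq_page_cost makes plans built on
-- sequential scans (hash/merge joins) look cheaper, which also removed
-- the nested loop for this query. 0.5 is a hypothetical value.
SET seq_page_cost = 0.5;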

