Re: merge>hash>loop - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: merge>hash>loop
Date:
Msg-id: 11035.1145033929@sss.pgh.pa.us
In response to: Re: merge>hash>loop (Ian Westmacott <ianw@intellivid.com>)
Responses: Re: merge>hash>loop (Markus Schaber <schabi@logix-tt.com>)
List: pgsql-performance

Ian Westmacott <ianw@intellivid.com> writes:
> That's what I feared, thanks.  But then if I simplify things a bit,
> such that the row counts are quite good, and PG still chooses a
> worse plan, can I conclude anything about my configuration settings
> (like random_page_cost)?

Well, the other thing that's going on here is that we know we are
overestimating the cost of nestloop-with-inner-indexscan plans.
The current estimation for that is basically "outer scan cost plus N
times inner scan cost" where N is the estimated number of outer tuples;
in other words, the repeated indexscan probes are each assumed to happen
from a cold start.  In reality, caching of the upper levels of the index
means that the later index probes are much cheaper than this model
thinks.  We've known about this for some time but no one's yet proposed
a more reasonable cost model.
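
To make that concrete, here is a rough sketch of the shape of that
estimate.  This is not the actual code in
src/backend/optimizer/path/costsize.c; the names and structure are
illustrative only:

typedef struct PathCosts
{
    double      startup_cost;   /* cost before first tuple is returned */
    double      total_cost;     /* cost to fetch all tuples */
} PathCosts;

/*
 * Current model: every inner indexscan probe is charged as though it
 * starts from a cold cache, so N outer tuples cost N full inner scans.
 */
static double
naive_nestloop_cost(PathCosts outer, PathCosts inner, double outer_rows)
{
    return outer.total_cost + outer_rows * inner.total_cost;
}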

In my mind this is tied into another issue, which is that the planner
always costs on the basis of each query starting from zero.  In a real
environment it's much cheaper to use heavily-used indexes than this cost
model suggests, because they'll already be swapped in due to use by
previous queries.  But we haven't got any infrastructure to keep track
of what's been heavily used, let alone a cost model that could make use
of the info.
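
Purely as an illustration of the sort of model we don't have yet
(reusing the PathCosts struct from the sketch above), one could imagine
charging the first probe at full cold-cache cost and later probes at
some fraction of it; the warm_fraction knob here is invented out of
thin air:

/*
 * Hypothetical alternative: assume the upper index levels stay cached
 * across probes, so only the first probe pays the cold-start price.
 * Nothing like this exists in the planner today.
 */
static double
cached_nestloop_cost(PathCosts outer, PathCosts inner,
                     double outer_rows, double warm_fraction)
{
    if (outer_rows <= 1.0)
        return outer.total_cost + outer_rows * inner.total_cost;
    return outer.total_cost + inner.total_cost +
        (outer_rows - 1.0) * warm_fraction * inner.total_cost;
}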

I think part of the reason that people commonly reduce random_page_cost
to values much lower than physical reality would suggest is that it
provides a crude way of partially compensating for this basic problem.
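
You can see the effect with a back-of-the-envelope calculation.  If an
index probe is charged roughly random_page_cost * pages_touched plus
some CPU cost (a deliberate oversimplification of what costsize.c
really does, with made-up numbers), then lowering random_page_cost
shrinks every assumed-cold probe across the board:

#include <stdio.h>

int
main(void)
{
    double      pages_touched = 4.0;    /* index levels plus heap page, say */
    double      cpu_cost = 0.25;
    double      rpc[] = {4.0, 2.0, 1.1};    /* candidate random_page_cost values */

    for (int i = 0; i < 3; i++)
        printf("random_page_cost=%.1f -> probe cost %.2f\n",
               rpc[i], rpc[i] * pages_touched + cpu_cost);
    return 0;
}

which is why dropping random_page_cost from the default 4.0 toward 1-2
often flips the planner into the nestloop plans that really are cheaper
on a warm cache.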

            regards, tom lane
