
From: Gregory Stark
Subject: Re: Index Scan cost expression
Date:
Msg-id: 87d4e8vkn9.fsf@oxford.xeocode.com
In response to: Index Scan cost expression (Amit Gupta <amit.pc.gupta@gmail.com>)
List: pgsql-hackers
Amit Gupta <amit.pc.gupta@gmail.com> writes:

> While trying to figure out an appropriate cost expression function for
> Thick indexes, I learned that we are using the Mackert and Lohman formula
> (described in their paper "Index Scans Using a Finite LRU Buffer: A
> Validated I/O Model", ACM Transactions on Database Systems).
...
> Please note that the above result only computes _heap_ page reads.
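
(For reference, the approximation as documented in the comment above
index_pages_fetched() in costsize.c is:

    PF =
        min(2TNs/(2T+Ns), T)             when T <= b
        2TNs/(2T+Ns)                     when T > b and Ns <= 2Tb/(2T-b)
        b + (Ns - 2Tb/(2T-b))*(T-b)/T    when T > b and Ns > 2Tb/(2T-b)

where T = # pages in the table, N = # tuples in the table, s = selectivity
= fraction of tuples scanned, and b = # buffer pages available.)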

Moreover, it only models a single index scan. It assumes nothing is cached
prior to the scan, which is very much not true if we're repeatedly
scanning similar ranges of keys.

Omitting the index pages is a very coarse attempt to model caching across
multiple plan invocations, since upper-level index pages will nearly always
be cached and even lower-level index pages have a good chance of being
cached.

The problem is that modeling cross-plan-invocation caching is a hard
problem for which we have few ideas.

> Upon taking a cursory look at the cost functions of other operators, I
> realized that available memory (effective_cache_size) is not
> considered for estimating the costs of hash/sort/NLjoin/etc. Why is
> that the case?

Well, they're all different, but I suspect everything you're observing has
the same root cause: cache doesn't affect any of these nodes unless we
start with something in the cache from previous queries, and we don't
model that. We assume each query, and indeed each plan node, is run on a
cold cache.

The problem is that how much to discount the cost of the inner node
depends not only on the type of node it is, but also on the parameters it
will be called with. A nested loop whose outer table supplies a few
closely spaced values needs a very different discount from one whose outer
table supplies values sparsely spread throughout the inner table.

This is complicated by the fact that the repetitions don't necessarily come
from the parent of the plan in question. You could have, for example, a
subquery several nodes down from the scan that causes repetitions.

I think the only way to tackle it is to come up with some parametric
formula for how much to discount repetitions, and to carry the parameters
of that formula on every plan node. Then, when a node needs to add the
cost of n repetitions of a lower node, it applies that formula using the
parameters advertised by the sub-node.
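
To make that concrete, here's a minimal sketch of the kind of interface I
have in mind. None of these names exist in the tree; RepeatCost and
cost_repeated() are purely illustrative:

    /*
     * Hypothetical sketch only: parameters a sub-node could advertise so
     * that a parent can cost n repetitions without knowing the node type.
     */
    typedef struct RepeatCost
    {
        double      first_cost;     /* cost of first invocation, cold cache */
        double      repeat_cost;    /* discounted cost of each repeat */
    } RepeatCost;

    /*
     * Cost of running a sub-node nloops times, applying whatever discount
     * the sub-node advertised for repetitions.
     */
    static double
    cost_repeated(const RepeatCost *sub, double nloops)
    {
        if (nloops <= 1.0)
            return sub->first_cost;
        return sub->first_cost + (nloops - 1.0) * sub->repeat_cost;
    }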

The tricky part of that is coming up with such a formula and figuring out
parameters that model each plan node with it. Consider that how much to
discount repetitions will depend heavily on the distribution of parameters
the plan is executed with.

I'm also not clear on what kinds of formulas work for this. It has to be
something that composes well. That is, even if a plan node doesn't want to
discount itself at all for repetitions, it has to include whatever
discount its subplans have asked for. For example, a large sequential scan
which expects to overflow effective_cache_size might not want to be
discounted at all; but if it has a subquery which scans a small table, it
will want to discount that subquery 100% for any repetitions, since the
small table will be cached after the first scan.
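
In terms of the sketch above (still purely illustrative, and ignoring how
many times the subquery runs within one scan), the composition would look
something like:

    /*
     * The big scan advertises no discount for itself, but still composes
     * in the 100% repeat discount its small subquery asked for.
     */
    static RepeatCost
    cost_seqscan_with_subquery(double scan_cost, const RepeatCost *subquery)
    {
        RepeatCost  rc;

        /* the big scan itself gets no caching benefit... */
        rc.first_cost = scan_cost + subquery->first_cost;
        /* ...but the subquery's repeats cost nothing once cached */
        rc.repeat_cost = scan_cost + subquery->repeat_cost;
        return rc;
    }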

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

