Home > mailing lists

Re: RFC: planner statistics in 7.2 - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: RFC: planner statistics in 7.2
Date	April 23, 2001 11:04:59
Msg-id	12651.988035033@sss.pgh.pa.us Whole thread Raw
In response to	Re: RFC: planner statistics in 7.2 (Philip Warner <pjw@rhyme.com.au>)
Responses	Re: RFC: planner statistics in 7.2
List	pgsql-hackers

Tree view

Philip Warner <pjw@rhyme.com.au> writes:
>> All that we're discussing here is one specific parameter in the cost
>> estimation for an indexscan, viz, the extent to which the table ordering
>> agrees with the index ordering. 

> This does not necessarily follow. A table ordering need not follow the sort
> order of an index for the index to have a low indexscan cost. All that is
> required is that most of the rows referred to by an index node must reside
> in a page or pages that will be read by one IO. eg. a table that has a
> sequence based ID, with, say 20% of rows updated, will work nicely with an
> indexscan on the ID, even though it has never been clustered. 

Right, what matters is the extent of correlation between table ordering
and index ordering, not how it got to be that way.

> What I'm suggesting is that if you look at a random sample of index nodes,
> you should be able to get a statistically valid estimate of the 'clumping'
> of the data pointed to by the index. 

And I'm saying that you don't actually have to look at the index in
order to compute the very same estimate.  The only property of the index
that matters is its sort order; if you assume you know the right sort
order (and in practice there's usually only one interesting possibility
for a column) then you can compute the correlation just by looking at
the table.

Andreas correctly points out that this approach doesn't extend very well
to multi-column or functional indexes, but I'm willing to punt on those
for the time being ...
        regards, tom lane

pgsql-hackers by date:

From: Adam Rose
Date: 23 April 2001, 11:04:58
Subject: row name length

From: Peter Eisentraut
Date: 23 April 2001, 11:25:57
Subject: Re: Re: How to determine if a user exists..

Re: RFC: planner statistics in 7.2 - Mailing list pgsql-hackers

Previous

Next