AW: AW: RFC: planner statistics in 7.2 - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: AW: RFC: planner statistics in 7.2
Date
Msg-id 11C1E6749A55D411A9670001FA6879633682A2@sdexcsrv1.f000.d0188.sd.spardat.at
Whole thread Raw
List pgsql-hackers
> >> But you don't really need to look at the index (if it even exists
> >> at the time you do the ANALYZE).  The extent to which the data is
> >> ordered in the table is a property of the table, not the index.
> 
> > Think compound, ascending, descending and functional index.
> > The (let's call it) cluster statistic for estimating indexscan cost can only 
> > be deduced from the index itself (for all but the simplest one column btree).
> 
> If you want to write code that handles those cases, go right ahead ;-).
> I think it's sufficient to look at the first column of a multicolumn
> index for cluster-order estimation

I often see first index columns that are even unique when the appl is installed
for a small company (like a company id column (e.g. "mandt" in SAP)).

> --- remember all these numbers are pretty crude anyway.

Ok, you want to supply a value, that shows how well sorted single
columns are in regard to < >.  Imho this value should be stored in pg_attribute.
Later someone can add a statistic to pg_index that shows how well clustered the index is.
In lack of a pg_index statistic the optimizer uses the pg_attribute value of the first 
index column. I think that would be a good plan.

> We have no such thing as a "descending index";
> and I'm not going to worry about clustering estimation for functional
> indexes.

Ok, an approach that reads ctid pointers from the index in index order
would not need to worry about how the index is actually filled. It would need
a method to sample (or read all) ctid pointers from the index in index order.

Andreas


pgsql-hackers by date:

Previous
From: Karel Zak
Date:
Subject: Re: SET SESSION AUTHORIZATION (was Re: Real/effective user)
Next
From: Chad La Joie
Date:
Subject: Re: Replication through WAL