> >> But you don't really need to look at the index (if it even exists
> >> at the time you do the ANALYZE). The extent to which the data is
> >> ordered in the table is a property of the table, not the index.
>
> > Think compound, ascending, descending and functional index.
> > The (let's call it) cluster statistic for estimating indexscan cost can only
> > be deduced from the index itself (for all but the simplest one column btree).
>
> If you want to write code that handles those cases, go right ahead ;-).
> I think it's sufficient to look at the first column of a multicolumn
> index for cluster-order estimation
I often see first index columns that are even unique when the appl is installed
for a small company (like a company id column (e.g. "mandt" in SAP)).
> --- remember all these numbers are pretty crude anyway.
Ok, you want to supply a value, that shows how well sorted single
columns are in regard to < >. Imho this value should be stored in pg_attribute.
Later someone can add a statistic to pg_index that shows how well clustered the index is.
In lack of a pg_index statistic the optimizer uses the pg_attribute value of the first
index column. I think that would be a good plan.
> We have no such thing as a "descending index";
> and I'm not going to worry about clustering estimation for functional
> indexes.
Ok, an approach that reads ctid pointers from the index in index order
would not need to worry about how the index is actually filled. It would need
a method to sample (or read all) ctid pointers from the index in index order.
Andreas