Use "average field correlation per hard disk page" instead of global one? - Mailing list pgsql-hackers

From Alexey Nalbat
Subject Use "average field correlation per hard disk page" instead of global one?
Date
Msg-id 200403031858.51680.alexey@price.ru
Whole thread Raw
Responses Re: Use "average field correlation per hard disk page" instead of global one?
List pgsql-hackers
Hello.

I have a table of 2'500'000 tuples and 100'000 pages, and an index
on non-unique field, to each key value corresponds approximately
50'000 tuples.

Due to the updating algorithm the physical order of tuples in the
table happens to be such that all equal keys are placed together,
but not ordered globally. Correlation computed by "VACUUM ANALYZE"
is 0.15.

When computing indexscan cost for query with clause "key = ?"
the planner makes it closer to "Mackert and Lohman formula" value
than to "selectivity * pages". As a result it chooses seqscan
rather than indexscan while in fact indexscan is 20 times faster.

The question is, which is the best way to correct this behavior?

Maybe "VACUUM ANALYZE" could calculate some average of "field
correlation per page" and even use this value somewhere inside
(not outside) "Mackert and Lohman formula"?

Are there any better ideas?



pgsql-hackers by date:

Previous
From: "Simon Riggs"
Date:
Subject: Re: PITR Functional Design v2 for 7.5
Next
From: "Tom Hebbron"
Date:
Subject: Re: User defined types -- Social Security number...