David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 3 Apr 2025 at 16:24, Manikandan Swaminathan
> <maniswami23@gmail.com> wrote:
>> why doesn’t making a multivariate statistic make a difference?
> Extended statistics won't help you here. "dependencies" just estimates
> functional dependencies between the columns mentioned in the ON
> clause. What we'd need to store to do better in your example query is
> positional information of where certain values are within indexes
> according to an ordered scan of the index. I don't quite know how we'd
> represent that exactly, but if we knew that a row matching col_a >
> 4996 wasn't until somewhere near the end of idx_col_a_btree index,
> then we'd likely not want to use that index for this query.
A simple-minded approach could be to just be pessimistic, and
increase our estimate of how many rows would need to be scanned as a
consequence of noticing that the columns have significant correlation.
The shape of that penalty function would be mostly guesswork though,
I fear. (Even with a clear idea of what to do, making this happen
seems a little complex --- just a SMOP, but I'm not very sure how to
wire it up.)
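
To make the "just be pessimistic" idea concrete, here is a rough sketch in Python. It is not planner code. Every name in it is hypothetical, the cross-column correlation is assumed to be handed to us as a number in [-1, 1], and the linear blend is exactly the sort of guesswork mentioned above. It takes the usual uniform-distribution estimate of how many rows an ordered scan reads before it finds enough matches, and pushes that estimate toward the worst case as the correlation grows.

```python
def uniform_rows_scanned(total_rows, selectivity, limit):
    """Rows an ordered scan reads before finding `limit` matches,
    assuming matches are spread evenly through the scan order."""
    if selectivity <= 0:
        return float(total_rows)
    return min(float(total_rows), limit / selectivity)


def worst_case_rows_scanned(total_rows, selectivity, limit):
    """Same quantity if every matching row sits at the very end of the scan."""
    matching = total_rows * selectivity
    return min(float(total_rows), total_rows - matching + limit)


def pessimistic_rows_scanned(total_rows, selectivity, limit, correlation):
    """Blend the uniform and worst-case estimates.

    `correlation` is a hypothetical cross-column correlation in [-1, 1];
    the linear weighting is an arbitrary choice, not a derived formula.
    """
    weight = min(1.0, abs(correlation))
    uniform = uniform_rows_scanned(total_rows, selectivity, limit)
    worst = worst_case_rows_scanned(total_rows, selectivity, limit)
    return uniform + weight * (worst - uniform)
```

With made-up numbers (10000 rows, 5 of them matching, LIMIT 1), the uniform estimate is about 2000 rows scanned and the worst case is 9996. A correlation near 1 would move the estimate most of the way to the worst case, which is the direction needed to steer the planner away from the ordered index scan.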

			regards, tom lane