Hi,
> The histogram values seem completely meaningless in this context ---
> for containment purposes, they are just ten or so randomly chosen
> values. I don't believe that the estimator works better with them.
> Certainly, whether the column is unique or not is totally irrelevant
> to whether they are representative.
Right, but if the column has a high statistics target, I think the
samples found in the histogram could put the estimator on the right
track: i.e. in my case 80% of the values have '1041' as their root leaf,
and most of the values in the histogram reflect this.
You're right that column uniqueness isn't relevant to the histogram,
but if the column is unique there won't be any MCVs, and the patch
becomes useless.
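To make the idea concrete, here is a rough Python sketch of what I mean
by using the histogram samples. It is not the patch itself: the function
names are hypothetical and the bounds are made up for illustration. It
just measures what fraction of the histogram values fall under a given
ancestor path.

```python
def is_descendant(path, ancestor):
    """Return True if the ltree-style `path` equals `ancestor` or lies beneath it."""
    path_labels = path.split(".")
    ancestor_labels = ancestor.split(".")
    # Compare whole labels, so "10410" is not treated as a child of "1041".
    return path_labels[:len(ancestor_labels)] == ancestor_labels


def containment_fraction(histogram_bounds, ancestor):
    """Fraction of histogram values contained in `ancestor` (hypothetical helper)."""
    if not histogram_bounds:
        return 0.0
    hits = sum(is_descendant(bound, ancestor) for bound in histogram_bounds)
    return hits / len(histogram_bounds)


# Made-up histogram bounds, NOT real data from my table.
bounds = [
    "1041", "1041.1", "1041.1.7", "1041.2", "1041.3.4",
    "1041.5", "1041.8", "1041.9.2", "2000.1", "3000",
]

fraction = containment_fraction(bounds, "1041")
print(fraction)
```

With a larger statistics target there are more histogram values, so a
fraction like this should get closer to the real distribution.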
> What would seem saner to me is to add a datatype-specific analyze
> function that collects some statistics that are actually relevant
> to containment, and then make use of those in the estimator.
Perhaps you're right, but unfortunately it's not something I can do
myself, because of my lack of knowledge of both pg and ltree internals :(
Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com