Hi Tomas,
On 21.01.2023 00:50, Tomas Vondra wrote:
> Hi Egor,
>
> While reviewing a patch improving join estimates for ranges [1] I
> realized we don't show stats for ranges in pg_stats, and I recalled we
> had this patch.
>
> I rebased the v2, and I decided to took a stab at showing separate
> histograms for lower/upper histogram bounds. I believe it makes it way
> more readable, which is what pg_stats is about IMHO.
Thanks for looking into this.
I have to admit it looks much better this way, so +1.
> This simply adds two functions, accepting/producing anyarray - one for
> lower bounds, one for upper bounds. I don't think it can be done with a
> plain subquery (or at least I don't know how).
Anyarray is an alien to SQL, so functions are well justified here. What
makes me a bit uneasy is two almost identical functions. Should we
consider other options like a function with an additional parameter or a
function returning an array of bounds arrays (which is somewhat
wasteful, but probably it doesn't matter much here)?
> Finally, it renames the empty_range_frac to start with range_, per the
> earlier discussion. I wonder if the new column names for lower/upper
> bounds (range_lower_bounds_histograms/range_upper_bounds_histograms) are
> too long ...
It seems so. The ending -s should be left out since it's a single
histogram now. And I think that
range_lower_histogram/range_upper_histogram are descriptive enough.
I'm adding one more patch to shorten the column names, refresh the docs,
and make 'make check' happy (unfortunately, we have to edit
src/regress/expected/rules.out every time pg_stats definition changes).
>
> regards
>
> [1] https://commitfest.postgresql.org/41/3821/
>