Re: pg_stats and range statistics - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: pg_stats and range statistics
Date
Msg-id CAPpHfduZ3hApO-GFT+Rtn75w3gR8qhcw+-xGENjHAANFMA5xKQ@mail.gmail.com
Whole thread Raw
In response to Re: pg_stats and range statistics  (jian he <jian.universality@gmail.com>)
Responses Re: pg_stats and range statistics  (jian he <jian.universality@gmail.com>)
Re: pg_stats and range statistics  (jian he <jian.universality@gmail.com>)
Re: pg_stats and range statistics  (Egor Rogov <e.rogov@postgrespro.ru>)
List pgsql-hackers
Hi!

On Wed, Sep 6, 2023 at 6:18 PM jian he <jian.universality@gmail.com> wrote:
> +        <literal>lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
> should be
> +        <literal>ranges_lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
>
> +        <literal>upper(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
> should be
> +        <literal>ranges_upper(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
>
> https://www.postgresql.org/docs/current/catalog-pg-type.html
> there is no association between numrange and their base type numeric.
> so for template: anyarray ranges_lower(anyarray). I don't think we can
> input numrange array and return a numeric array.
>
> https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
> >> When the return value of a function is declared as a polymorphic type, there must be at least one argument
positionthat is also >> polymorphic, and the actual data type(s) supplied for the polymorphic arguments determine the
actualresult type for that call. 
>
>
> regression=# select
> ranges_lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4),
> numrange(5.5,6.6)]);
>  ranges_lower
> ---------------
>  {1.1,3.3,5.5}
> (1 row)
> regression=# \gdesc
>     Column    |    Type
> --------------+------------
>  ranges_lower | numrange[]
> (1 row)
>
> I don't think you can cast literal ' {1.1,3.3,5.5}' to numrange[].

Thank you for noticing this.  Indeed, our polymorphic type system
doesn't support this case.  In order to support this, we need
something like "anyrangearray" pseudo-type.  However, it seems
overkill to introduce a new pseudo-type just to update pg_stats.

Additionally, I found that the current patch can't handle infinite
range bounds and discards information about inclusiveness of range
bounds.  The infinite bounds could be represented as NULL (while I'm
not sure how good this representation is).  Regarding inclusiveness, I
don't see the possibility to represent them in a reasonable way within
an array of base types.  I also don't feel good about discarding the
accuracy in the pg_stats view.

In conclusion of all of the above, I decided to revise the patch and
show the bounds histogram as it's stored in pg_statistic.  I revised
the docs correspondingly.

Also for some reason, the patch added description of new columns to
the documentation of pg_user_mapping table.  I've fixed that by moving
them to the documentation of pg_stats view.

Also, I've extracted the new comment in pg_statistic.h into a separate patch.

I'm going to push this if there are no objections.

------
Regards,
Alexander Korotkov

Attachment

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: remove deprecated @@@ operator ?
Next
From: Heikki Linnakangas
Date:
Subject: Re: Relation bulk write facility