Re: pg_stats and range statistics - Mailing list pgsql-hackers

From Egor Rogov
Subject Re: pg_stats and range statistics
Date
Msg-id 80e37c96-bdb4-dd00-b2da-5a01366f685b@postgrespro.ru
In response to Re: pg_stats and range statistics  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: pg_stats and range statistics  ("Gregory Stark (as CFM)" <stark.cfm@gmail.com>)
List pgsql-hackers
On 24.03.2023 01:46, Tomas Vondra wrote:

>
> So if you could clean it up a bit, and do something about the two open
> items I mentioned (a bunch of tests on different array,


I've added some tests to regress/sql/rangetypes.sql, based on the same 
dataset that is used to test lower() and upper().


> and behavior
> consistent with lower/upper),


Done. This required switching from construct_array(), which doesn't 
support NULLs, to construct_md_array(), which does. A nice side effect 
is that we now also support multidimensional arrays.
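
To illustrate why NULL support matters here: lower() returns NULL for an 
empty range or an unbounded lower end, so an array of lower bounds needs 
a per-element null flag alongside the values. The following is only a 
standalone sketch of that idea, not code from the patch; IntRange and 
lower_bounds() are hypothetical names, and the values[]/nulls[] pair 
merely mimics the elems/nulls arguments that construct_md_array() accepts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified integer range; NOT PostgreSQL's RangeType. */
typedef struct
{
    int  lower;
    int  upper;
    bool empty;      /* empty range: has no bounds at all */
    bool lower_inf;  /* lower bound is unbounded */
    bool upper_inf;  /* upper bound is unbounded */
} IntRange;

/*
 * Fill values[i] and nulls[i] with the lower bound of ranges[i].
 * An element is marked NULL when the range is empty or its lower bound
 * is unbounded, which is how lower() behaves for a single range.
 * Returns the number of non-NULL elements.
 */
static size_t
lower_bounds(const IntRange *ranges, size_t n, int *values, bool *nulls)
{
    size_t nonnull = 0;

    for (size_t i = 0; i < n; i++)
    {
        nulls[i] = ranges[i].empty || ranges[i].lower_inf;
        values[i] = nulls[i] ? 0 : ranges[i].lower;  /* 0 is a placeholder */
        if (!nulls[i])
            nonnull++;
    }
    return nonnull;
}
```

An array builder that takes only values, with no null flags, has no way 
to represent the second and third kind of element.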

I've moved a common part of ranges_lower_bounds() and 
ranges_upper_bounds() to ranges_bounds_common(), following Justin's advice.


There is one thing I'm not sure what to do about. This check:

      if (typentry->typtype != TYPTYPE_RANGE)
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("expected array of ranges")));

doesn't work, because the range_get_typcache() call errors out first 
("type %u is not a range type"). That message doesn't look friendly 
enough for a user-facing SQL function. Should we duplicate 
range_get_typcache's logic and replace the error message?
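
The ordering issue can be modelled outside the server. This is a 
standalone sketch under stated assumptions, not the patch itself: 
TypeEntry, model_range_get_typcache() and ranges_bounds_check() are 
hypothetical stand-ins, and errors are returned as strings instead of 
being raised with ereport(). It only shows that the friendlier message 
is reachable if the typtype test runs before the lookup that rejects 
non-range types.

```c
#include <stdio.h>
#include <stddef.h>

/* Stand-ins for the typtype codes used in pg_type. */
#define TYPTYPE_BASE  'b'
#define TYPTYPE_RANGE 'r'

/* Hypothetical miniature type entry; NOT PostgreSQL's TypeCacheEntry. */
typedef struct
{
    unsigned oid;
    char     typtype;
} TypeEntry;

/*
 * Models the lookup that rejects non-range types with the internal
 * message. Returns 0 on success, -1 on error with msg filled in.
 */
static int
model_range_get_typcache(const TypeEntry *t, char *msg, size_t len)
{
    if (t->typtype != TYPTYPE_RANGE)
    {
        snprintf(msg, len, "type %u is not a range type", t->oid);
        return -1;
    }
    msg[0] = '\0';
    return 0;
}

/*
 * Performs the user-facing check BEFORE the lookup, so the friendlier
 * message wins for non-range element types.
 */
static int
ranges_bounds_check(const TypeEntry *t, char *msg, size_t len)
{
    if (t->typtype != TYPTYPE_RANGE)
    {
        snprintf(msg, len, "expected array of ranges");
        return -1;
    }
    return model_range_get_typcache(t, msg, len);
}
```

With a base type such as int4 (OID 23), calling the lookup directly 
yields "type 23 is not a range type", while ranges_bounds_check() yields 
"expected array of ranges".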


>   that'd be great.
>
>> Do we stick with the ranges_upper(anyarray) and ranges_lower(anyarray)
>> functions? This approach is okay with me. Tomas, have you made up your
>> mind?
>>
> I think the function approach is fine, but in my January 22 message I
> was wondering why we're not actually naming them simply lower/upper.


I'd expect a lower(anyarray) function to return the lowest element in 
the array. That name doesn't hint that the function takes an array of 
ranges. So the ranges_ prefix seems justified to me.


>
>> Do we want to document these functions? They are very
>> pg_statistic-specific and won't be useful for end users imo.
>>
> I don't see why not to document them. Sure, we're using them in a fairly
> specific context, but I don't see why not to let people use them too
> (which would be hard without docs).


Okay. I've corrected the examples a bit.

The patch is attached.


Thanks,
Egor
