Re: general purpose array_sort - Mailing list pgsql-hackers

From jian he
Subject Re: general purpose array_sort
Date
Msg-id CACJufxGjhNmXvJubV4BWgskjYDBfd5dv-cjQhuy0drDtK_uHdg@mail.gmail.com
Whole thread Raw
In response to Re: general purpose array_sort  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
On Mon, Nov 4, 2024 at 1:46 PM Michael Paquier <michael@paquier.xyz> wrote:
>
>
> +    typentry = lookup_type_cache(elmtyp, TYPECACHE_LT_OPR);
> +    if (!OidIsValid(typentry->lt_opr))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_UNDEFINED_FUNCTION),
> +                 errmsg("could not identify ordering operator for type %s",
> +                    format_type_be(elmtyp))));
>
> The patch introduces two error paths based on the fact that ordering
> operators could not be found depending on a data type that lacks the
> ordering operator and the array ordering operator part.  It is right
> to issue an error if these are lacking, like the various stats paths.
> Should we have some regression tests with specific data types for
> these errors, though?  The stats paths don't care much about these
> error cases, but it does not mean that we should not care about them.
> In short, let's have negative test coverage if we can.
>
select distinct oprleft::regtype from pg_operator where oprname = '='
and oprleft = oprright
except all
select distinct oprleft::regtype from pg_operator where oprname = '<'
and oprleft = oprright;
returns

hstore
cid
aclitem
xid
line

simple tests case using xid data type would be

SELECT array_sort('{{1,2,3}}'::xid[]);


> +typedef struct ArraySortCachedInfo
> +{
> +    TypeCacheEntry *typentry;
> +    TypeCacheEntry *array_typentry;
> +} ArraySortCachedInfo;
>
> Let's put that at the top of the file, with a comment about how it
> links to array_sort() for the caching with fn_extra.  Let's also
> document the meaning of the fields.
>
> FWIW, I am confused by this implementation, where you have to allocate
> the two TypeCacheEntry because of the fact that you have to deal with
> the 1-dimension case and the multi-dimension case.  In the context of
> a single function call, why do you need both typentry and
> array_typentry, actually?  Wouldn't it be enough to use one typentry
> that points to the typcache, meaning that you don't really need to use
> the extra business with fn_mcxt, no?  If you require both (because I
> may be wrong), perhaps you should have a regression test that's able
> to break when removing array_typentry, changing the code to only rely
> on typentry.   Note: I have just removed array_typentry in a quick
> test, current coverage was happy about it.  Feel free to prove me
> wrong.
>

drop table if exists t;
CREATE TABLE t (a int[]);
insert into t values ('{1,3}'),('{1,2,3}'),('{11}');
insert into t values ('{{1,12}}'), ('{{4,3}}');
SELECT array_sort(a) from t;

In the above case,
tuplesort_begin_datum needs the int type information and int[] type information.
otherwise the cached TypeCacheEntry is being used to sort mult-dimension array,
which will make the result false.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Jelte Fennema-Nio
Date:
Subject: Re: Clear padding in PgStat_HashKey keys