On Mon, Jan 6, 2025 at 10:51 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 6 Jan 2025, at 15:54, John Naylor <johncnaylorls@gmail.com> wrote:
> > argument. Like some other patches in this series, this does have the
> > side effect of removing the ability to skip quinique(), so that should
> > be benchmarked (I can do that this week unless you beat me to it).
>
> With the same setup as in the first message of this thread we can do:
>
> postgres=# SELECT _int_contains(arr,ARRAY[1]) FROM arrays_to_sort;
>
> before patch patch
> Time: 567.928 ms
> after patch
> Time: 890.297 ms
> timing of this function is dominated by PREPAREARR(a);
>
> What bothers me is that PREPAREARR(a); is returning new array in case of empty input. That's why I considered little
refactoringof resize_intArrayType(): reorder cases so that if (num == ARRNELEMS(a)) was first.
Hmm, I'm confused. First, none of the arrays are empty that I can see
-- am I missing something?
Then, the first message setup is
CREATE TABLE arrays_to_sort AS
SELECT array_shuffle(a) arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
generate_series(1, 10);
...so most of the time is in sorting the big array, and I don't see a
regression, the opposite in fact:
SELECT _int_contains(arr,ARRAY[1]) FROM arrays_to_sort;
master:
1492.552 ms
v9:
873.697 ms
The case I was concerned about was if the big array was already sorted
and unique. Then, it's conceivable that unnecessarily running qunique
would make things slower, but I don't see that either:
--ordered
CREATE TABLE arrays_sorted AS
SELECT a arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
generate_series(1, 10);
SELECT _int_contains(arr,ARRAY[1]) FROM arrays_sorted;
master:
31.388
v9:
28.247
--
John Naylor
Amazon Web Services