Kaimeh <kkaimeh@gmail.com> writes:
> In postgresql 16 and 17 using array_agg with filter where gives an error,
> while in postgres 15 exact same query works.
> This is minimal sample for reproducing:
> create table test (id int, data jsonb);
> insert into test (id, data) values
> (1, '{"a": null}'),
> (2, '{"a": "2"}'),
> (3, '{"a": "2"}'),
> (4, '{"a": ""}');
> select array_agg(distinct (data->>'a')::int) filter (where data->>'a' is
> not null and data->>'a' != '') from test;
Ugh. EXPLAIN tells the tale:
Aggregate (cost=113.57..113.58 rows=1 width=32)
Output: array_agg(DISTINCT (((data ->> 'a'::text))::integer)) FILTER (WHERE (((data ->> 'a'::text) IS NOT NULL) AND
((data->> 'a'::text) <> ''::text)))
-> Sort (cost=88.17..91.35 rows=1270 width=32)
Output: data, (((data ->> 'a'::text))::integer)
Sort Key: (((test.data ->> 'a'::text))::integer)
-> Seq Scan on public.test (cost=0.00..22.70 rows=1270 width=32)
Output: data, ((data ->> 'a'::text))::integer
We have pushed the array_agg argument down in order to sort by it,
neglecting the fact that there's a filter clause that should prevent
evaluation failures.
Bisecting fingers this commit:
1349d2790bf48a4de072931c722f39337e72055e is the first bad commit
commit 1349d2790bf48a4de072931c722f39337e72055e
Author: David Rowley <drowley@postgresql.org>
Date: Tue Aug 2 23:11:45 2022 +1200
Improve performance of ORDER BY / DISTINCT aggregates
Fortunately, that commit didn't actually rip out the old code path.
The simplest fix I can think of is to disable the presorted-agg
optimization if (1) there's a FILTER clause and (2) the proposed
sort key is anything more complex than a Var. There might be
some wiggle room in (2) -- for instance, RelabelType(Var) should
be safe -- but we don't have a lot of intelligence about which
expression types are guaranteed error-free.
regards, tom lane