Re: Error when using array_agg with filter where clause in pg16 and pg17 - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Error when using array_agg with filter where clause in pg16 and pg17
Date
Msg-id 4040920.1744126348@sss.pgh.pa.us
Whole thread Raw
In response to Error when using array_agg with filter where clause in pg16 and pg17  (Kaimeh <kkaimeh@gmail.com>)
Responses Re: Error when using array_agg with filter where clause in pg16 and pg17
List pgsql-bugs
Kaimeh <kkaimeh@gmail.com> writes:
> In postgresql 16 and 17 using array_agg with filter where gives an error,
> while in postgres 15 exact same query works.

> This is minimal sample for reproducing:

> create table test (id int, data jsonb);
> insert into test (id, data) values
> (1, '{"a": null}'),
> (2, '{"a": "2"}'),
> (3, '{"a": "2"}'),
> (4, '{"a": ""}');
> select array_agg(distinct (data->>'a')::int) filter (where data->>'a' is
> not null and data->>'a' != '') from test;

Ugh.  EXPLAIN tells the tale:

 Aggregate  (cost=113.57..113.58 rows=1 width=32)
   Output: array_agg(DISTINCT (((data ->> 'a'::text))::integer)) FILTER (WHERE (((data ->> 'a'::text) IS NOT NULL) AND
((data->> 'a'::text) <> ''::text))) 
   ->  Sort  (cost=88.17..91.35 rows=1270 width=32)
         Output: data, (((data ->> 'a'::text))::integer)
         Sort Key: (((test.data ->> 'a'::text))::integer)
         ->  Seq Scan on public.test  (cost=0.00..22.70 rows=1270 width=32)
               Output: data, ((data ->> 'a'::text))::integer

We have pushed the array_agg argument down in order to sort by it,
neglecting the fact that there's a filter clause that should prevent
evaluation failures.

Bisecting fingers this commit:

1349d2790bf48a4de072931c722f39337e72055e is the first bad commit
commit 1349d2790bf48a4de072931c722f39337e72055e
Author: David Rowley <drowley@postgresql.org>
Date:   Tue Aug 2 23:11:45 2022 +1200

    Improve performance of ORDER BY / DISTINCT aggregates

Fortunately, that commit didn't actually rip out the old code path.
The simplest fix I can think of is to disable the presorted-agg
optimization if (1) there's a FILTER clause and (2) the proposed
sort key is anything more complex than a Var.  There might be
some wiggle room in (2) -- for instance, RelabelType(Var) should
be safe -- but we don't have a lot of intelligence about which
expression types are guaranteed error-free.

            regards, tom lane



pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: BUG #18884: The CURRENT_TIMESTAMP value being returned in UTC-2.
Next
From: David Rowley
Date:
Subject: Re: Error when using array_agg with filter where clause in pg16 and pg17