On Tue, 31 Dec 2024 at 06:14, PG Bug reporting form
<noreply@postgresql.org> wrote:
> string_agg function(text, text) design issue
> When I require distinct, I cannot continue using no for sorting; instead, I
> need to use no::text, otherwise I will encounter the exception: `in an
> aggregate with DISTINCT, ORDER BY expressions must appear in argument
> list.`
I don't really know if this is a design issue. If we wanted to allow
ORDER BY of a column that isn't in the DISTINCT list, the executor
would need to perform 2 sorts, after the first, we'd do the DISTINCT
and then we'd need to sort again for the ORDER BY before aggregating
the results. Since the executor only performs, at most, a single sort,
we impose the restriction that there are no columns in the ORDER BY
that aren't in the DISTINCT clause. This allows us to rearrange the
DISTINCT clause according to the ORDER BY clause and make use of a
single sort. For example, if you do string_agg(a,b ORDER BY b,a) then
the DISTINCT clause is rewritten to b,a rather than a,b. That's
semantically equivalent.
This is a limitation that could technically be lifted, so there's
nothing wrong with the design here. It just needs someone sufficiently
motivated to expand the functionality of the executor. However, since
there are workarounds (see below), it's possibly unlikely anyone will
be motivated to do so.
> ```sql
> select
> name,
> string_agg(no::text, ',' order by no::text asc) as nos
> from
> users
> group by name;
> ```
>
> This results in incorrect sorting and does not meet expectations.
>
> Could you optimize `string_agg` or suggest an alternative function to handle
> such a scenario?
You could use a subquery to get rid of the duplicates and then
aggregate without DISTINCT, e.g:
select name,string_agg(no::text, ',' order by no)
from (
select name,no from users group by name,no
) u group by name;
David