Re: Document aggregate functions better w.r.t. ORDER BY - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Document aggregate functions better w.r.t. ORDER BY
Date
Msg-id CAKFQuwZNrihaCkG_D_7s4hMeAVs4nr9QyTreizF5Dx4NWD=ybw@mail.gmail.com
Whole thread Raw
In response to Re: Document aggregate functions better w.r.t. ORDER BY  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Document aggregate functions better w.r.t. ORDER BY
Re: Document aggregate functions better w.r.t. ORDER BY
List pgsql-hackers
On Wed, Oct 25, 2023 at 4:22 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Oct 25, 2023 at 04:14:11PM -0700, David G. Johnston wrote:
> Yeah, we punt on the entire concept in the data type section:
>
> "Managing these errors and how they propagate through calculations is the
> subject of an entire branch of mathematics and computer science and will not be
> discussed here," ...
>
> Also, I'm now led to believe that the relevant IEEE 754 floating point addition
> is indeed commutative.  Given that, I am inclined to simply not add the order
> by clause at all to those four functions. (actually, you already got rid of the
> avg()s but the sum()s are still present, so just those two).

Ah, yes, sum() removed.  Updated patch attached.


The paragraph leading into the last added example needs to be tweaked:

If DISTINCT is specified within an aggregate, the data is sorted in ascending order while extracting unique values.  You can add an ORDER BY clause, limited to expressions matching the regular arguments of the aggregate, to sort the output in descending order.

(show existing - DISTINCT only - example here)

<programlisting>
WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
SELECT string_agg(DISTINCT v::text, ';' ORDER BY v::text DESC) FROM vals;
 string_agg
-----------
  4;3;2;1
</programlisting>

(existing note)

Question: Do you know whether we for certain always sort ascending here to compute the unique values or whether if, say, there is an index on the column in descending order (or ascending and traversed backwards) that the data within the aggregate could, with an order by, be returned in descending order?  If it is ascending, is that part of the SQL Standard (since it doesn't even allow an order by to give the user the ability the control the output ordering) or does the SQL Standard expect that even a random order would be fine since there are algorithms that can be used that do not involve sorting the input?

It seems redundant to first say "regular arguments" then negate it in order to say "DISTINCT list".  Using the positive form with "DISTINCT list" should get the point across sufficiently and succinctly.  It also avoids me feeling like there should be an example of what happens when you do "sort on an expression that is not included in the DISTINCT list".

Interestingly:

WITH vals (v,l) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM vals;

ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
LINE 2: SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM...

But both expressions in the argument list (el and semicolon) do appear in the ORDER BY...

David J.

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_stat_statements and "IN" conditions
Next
From: Michael Paquier
Date:
Subject: Re: Should we represent temp files as unsigned long int instead of signed long int type?