Re: Document aggregate functions better w.r.t. ORDER BY - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Document aggregate functions better w.r.t. ORDER BY
Date
Msg-id ZTrgnM4GGNM9SniX@momjian.us
Whole thread Raw
In response to Re: Document aggregate functions better w.r.t. ORDER BY  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Document aggregate functions better w.r.t. ORDER BY
List pgsql-hackers
On Wed, Oct 25, 2023 at 10:34:10PM -0700, David G. Johnston wrote:
> I would reword the existing note to be something like:
> 
> The SQL Standard defines specific aggregates and their properties, including
> which of DISTINCT and/or ORDER BY is allowed.  Due to the extensible nature of
> PostgreSQL it accepts either or both clauses for any aggregate.

Uh, is this something in my patch or somewhere else?  I don't think
PostgreSQL extensible is an example of syntax flexibility.

> From the most recent patch:
> 
>     <para>
> -    If <literal>DISTINCT</literal> is specified in addition to an
> -    <replaceable>order_by_clause</replaceable>, then all the <literal>ORDER BY
> </literal>
> -    expressions must match regular arguments of the aggregate; that is,
> -    you cannot sort on an expression that is not included in the
> -    <literal>DISTINCT</literal> list.
> +    If <literal>DISTINCT</literal> is specified with an
> +    <replaceable>order_by_clause</replaceable>, <literal>ORDER
> +    BY</literal> expressions can only reference columns in the
> +    <literal>DISTINCT</literal> list.  For example:
> +<programlisting>
> +WITH vals (v1, v2) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
> +SELECT array_agg(DISTINCT v2 ORDER BY v2 DESC) FROM vals;
> +  array_agg
> +-------------
> + {Z,T,R,D,A}
> +</programlisting>
> 
> The change to a two-column vals was mostly to try and find corner-cases that
> might need to be addressed.  If we don't intend to show the error case of
> DISTINCT v1 ORDER BY v2 then we should go back to the original example and just
> add ORDER BY v DESC.  I'm fine with not using string_agg here.
> 
> +    For example:
> +<programlisting>
> +WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
> +SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
> + array_agg
> +-----------
> + {4,3,2,1}
> +</programlisting>

Okay, good, switched in the attached patch.

> We get enough complaints regarding "apparent ordering" that I would like to
> add:
> 
> As a reminder, while some DISTINCT processing algorithms produce sorted output
> as a side-effect, only by specifying ORDER BY is the output order guaranteed.

Well, we need to create a new email thread for this and look at all the
areas is applies to since this is a much larger issue.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Next
From: Peter Smith
Date:
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node