Re: Document aggregate functions better w.r.t. ORDER BY - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Document aggregate functions better w.r.t. ORDER BY
Date
Msg-id CAKFQuwZCZ5P09wBJGSDEQiGZooa21bJjFz1FEjvCs1_hHaB-Ow@mail.gmail.com
In response to Re: Document aggregate functions better w.r.t. ORDER BY  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Document aggregate functions better w.r.t. ORDER BY
List pgsql-hackers
On Tue, Oct 24, 2023 at 1:39 PM Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Dec 13, 2022 at 07:38:15PM -0700, David G. Johnston wrote:
> > All,
> >
> > The recent discussion surrounding aggregates and ORDER BY moved me to look over
> > our existing documentation, especially now that we've reworked the function
> > tables, to see what improvements can be had by simply documenting those
> > functions where ORDER BY may change the user-visible output.  I skipped range
> > aggregates for the moment but handled the others on the aggregates page (not
> > window functions).  This includes the float types for sum and avg.
> >
> > I added a note just before the table linking back to the syntax chapter and
> > describing the newly added rules and syntax choice in the table.
> >
> > The nuances of floating point math suggest to me that specifying order by for
> > those is in some kind of gray area and so I've marked it optional...any
> > suggestions for wording (or an xref) to explain those nuances or should it just
> > be shown non-optional like the others?  Or not shown at all?
> >
> > The novelty of my examples is up for bikeshedding.  I didn't want anything too
> > long so a subquery didn't make sense, and I was trying to avoid duplication as
> > well as multiple lines - hence creating a CTE that can be copied onto all of
> > the example queries to produce the noted result.
> >
> > I added a DISTINCT example to array_agg because it is the first aggregate on
> > the page and so hopefully will be seen during a cursory reading.  Plus,
> > array_agg is the go-to function for doing this kind of experimentation.
>
> I like this idea, though the examples seemed too detailed so I skipped
> them.  Here is the trimmed-down patch I would like to apply.


I'd prefer to keep the note pointing out that the aggregates documented with ORDER BY are the ones whose output can vary with input order.

I've been sympathetic to the user comments that we don't have enough examples.  Using just array_agg for that purpose, showing both DISTINCT and ORDER BY, seems like a fair compromise (it removes two examples from my original proposal).  The examples in the section we tell readers to go see aren't of great quality.  If you strongly dislike having the function table contain the examples, we should at least improve the page we are sending them to.  (As an aside, I've personally always found the syntax block there, with its 5 syntaxes, intimidating and hard to read.)
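
For concreteness, something along these lines is the kind of array_agg example I have in mind.  This is only a sketch, not the text from the patch; the CTE name "vals" and column "v" are made up for illustration.

```sql
-- Hypothetical sample data; "vals" and "v" are illustrative names only.
WITH vals (v) AS (VALUES (1), (3), (2), (3))
SELECT
    -- ORDER BY inside the aggregate fixes the element order.
    array_agg(v ORDER BY v)               AS ordered,        -- {1,2,3,3}
    -- DISTINCT removes duplicates; with DISTINCT, the ORDER BY
    -- expressions must also appear in the argument list.
    array_agg(DISTINCT v ORDER BY v DESC) AS distinct_desc   -- {3,2,1}
FROM vals;
```

Without the ORDER BY the element order depends on the order in which rows reach the aggregate, which is not something a query should rely on.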

I'd at least suggest you reconsider the commentary and examples surrounding jsonb_object_agg.
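
As a sketch of why ordering matters there (my own illustration with made-up data, assuming the usual jsonb rule that the last value seen for a duplicate key is the one kept):

```sql
-- Hypothetical data with a duplicated key; "kv", "k" and "v" are illustrative names.
WITH kv (k, v) AS (VALUES ('a', 1), ('a', 2))
SELECT
    -- The value aggregated last for key 'a' is the one that survives,
    -- so the sort direction changes the visible result.
    jsonb_object_agg(k, v ORDER BY v)      AS last_is_largest,   -- {"a": 2}
    jsonb_object_agg(k, v ORDER BY v DESC) AS last_is_smallest   -- {"a": 1}
FROM kv;
```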

The same goes for the floating-point behavior that explains why we've chosen to document avg/sum, which typically don't care about order, as having an optional ORDER BY.
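
A minimal sketch of the floating-point effect, using made-up values chosen so that the small term is lost when it is added to the large one first (the CTE and column names are hypothetical):

```sql
-- Columns "a" and "b" encode two different summation orders for the same three values.
WITH f (a, b, x) AS (
    VALUES (1, 1,  1e20::float8),
           (2, 3,  1.0::float8),
           (3, 2, -1e20::float8)
)
SELECT
    sum(x ORDER BY a) AS big_small_big,  -- (1e20 + 1) - 1e20 = 0: the 1 is absorbed
    sum(x ORDER BY b) AS big_big_small   -- (1e20 - 1e20) + 1 = 1
FROM f;
```

With exact types such as integer or numeric both columns would be identical, which is why ORDER BY only matters for the float variants.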

David J.
