Re: Aggregate node doesn't include cost for sorting - Mailing list pgsql-hackers

From David Rowley
Subject Re: Aggregate node doesn't include cost for sorting
Date
Msg-id CAApHDvqb4Wm0FKuJqGksEoirKHwC5nUvFP3TQr3VU93jdDWNkw@mail.gmail.com
Whole thread Raw
In response to Re: Aggregate node doesn't include cost for sorting  (David Geier <geidav.pg@gmail.com>)
List pgsql-hackers
On Fri, 9 Dec 2022 at 01:12, David Geier <geidav.pg@gmail.com> wrote:
> Both plans were captured on 14.5, which is indeed prior to 1349d279.
>
> I disabled sequential scan to show that there's an alternative plan
> which is superior to the chosen plan: Index Only Scan is more expensive
> and takes longer than the Seq Scan, but the subsequent Aggregate runs
> much faster as it doesn't have to sort, making the plan overall superior.

Aha, 14.5. What's going on there is that it's still doing the sort.
The aggregate code in that version does not skip the sort because of
the presorted input. A likely explanation for the performance increase
is due to the presorted check in our qsort implementation. The
successful presort check is O(N), whereas an actual sort is O(N *
logN).

It's true that if we had been doing proper costing on these ORDER BY /
DISTINCT aggregates that we could have noticed that the input path's
pathkeys indicate that no sort is required and costed accordingly, but
if we'd gone to the trouble of factoring that into the costs, then it
would also have made sense to make nodeAgg.c not sort on presorted
input.  We got the latter in 1349d279. It's just we didn't do anything
about the costings in that commit.

Anyway, in the next version of Postgres, the planner is highly likely
to choose the 2nd plan in your original email. It'll also be even
faster than you've shown due to the aggregate code not having to store
and read tuples in the tuplesort object. Also, no O(N) presort check
either.  The performance should be much closer to what it would be if
you disabled seqscan and dropped the DISTINCT out of your aggregate.

David



pgsql-hackers by date:

Previous
From: Antonin Houska
Date:
Subject: sendFileWithContent() does not advance the source pointer
Next
From: Robert Haas
Date:
Subject: Re: Error-safe user functions