pgsql: Remove pessimistic cost penalization from Incremental Sort - Mailing list pgsql-committers

From David Rowley
Subject pgsql: Remove pessimistic cost penalization from Incremental Sort
Date
Msg-id E1p60NX-003oPg-P0@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Remove pessimistic cost penalization from Incremental Sort

When incremental sorts were added in v13 a 1.5x pessimism factor was added
to the cost modal.  Seemingly this was done because the cost modal only
has an estimate of the total number of input rows and the number of
presorted groups.  It assumes that the input rows will be evenly
distributed throughout the presorted groups.  The 1.5x pessimism factor
was added to slightly reduce the likelihood of incremental sorts being
used in the hope to avoid performance regressions where an incremental
sort plan was picked and turned out slower due to a large skew in the
number of rows in the presorted groups.

An additional quirk with the path generation code meant that we could
consider both a sort and an incremental sort on paths with presorted keys.
This meant that with the pessimism factor, it was possible that we opted
to perform a sort rather than an incremental sort when the given path had
presorted keys.

Here we remove the 1.5x pessimism factor to allow incremental sorts to
have a fairer chance at being chosen against a full sort.

Previously we would generally create a sort path on the cheapest input
path (if that wasn't sorted already) and incremental sort paths on any
path which had presorted keys.  This meant that if the cheapest input path
wasn't completely sorted but happened to have presorted keys, we would
create a full sort path *and* an incremental sort path on that input path.
Here we change this logic so that if there are presorted keys, we only
create an incremental sort path, and create sort paths only when a full
sort is required.

Both the removal of the cost pessimism factor and the changes made to the
path generation make it more likely that incremental sorts will now be
chosen.  That, of course, as with teaching the planner any new tricks,
means an increased likelihood that the planner will perform an incremental
sort when it's not the best method.  Our standard escape hatch for these
cases is an enable_* GUC.  enable_incremental_sort already exists for
this.

This came out of a report by Pavel Luzanov where he mentioned that the
master branch was choosing to perform a Seq Scan -> Sort -> Group
Aggregate for his query with an ORDER BY aggregate function.  The v15 plan
for his query performed an Index Scan -> Group Aggregate, of course, the
aggregate performed the final sort internally in nodeAgg.c for the
aggregate's ORDER BY.  The ideal plan would have been to use the index,
which provided partially sorted input then use an incremental sort to
provide the aggregate with the sorted input.  This was not being chosen
due to the pessimism in the incremental sort cost modal, so here we remove
that and rationalize the path generation so that sort and incremental sort
plans don't have to needlessly compete.  We assume that it's senseless
to ever use a full sort on a given input path where an incremental sort
can be performed.

Reported-by: Pavel Luzanov
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/9f61ddbf-2989-1536-b31e-6459370a6baa%40postgrespro.ru

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/4a29eabd1d91c5484426bc5836e0a7143b064f5a

Modified Files
--------------
contrib/postgres_fdw/expected/postgres_fdw.out |   2 +
contrib/postgres_fdw/sql/postgres_fdw.sql      |   2 +
src/backend/optimizer/path/allpaths.c          |  94 ++---
src/backend/optimizer/path/costsize.c          |  38 +-
src/backend/optimizer/plan/planner.c           | 480 ++++++++-----------------
src/test/regress/expected/incremental_sort.out |  13 -
src/test/regress/expected/partition_join.out   |   5 +-
src/test/regress/sql/incremental_sort.sql      |   5 -
8 files changed, 220 insertions(+), 419 deletions(-)


pgsql-committers by date:

Previous
From: David Rowley
Date:
Subject: pgsql: Re-adjust drop-index-concurrently-1 isolation test
Next
From: Thomas Munro
Date:
Subject: pgsql: Fix typo in reference to __FreeBSD__.