Hi,
This patch is a WIP fix for the issue described in [1], where the
planner picks a more expensive plan with partition-wise joins enabled,
and disabling this option produces a cheaper plan. That's a bit strange
because with the option disabled we consider *fewer* plans, so we should
not be able to generate a cheaper plan.
The problem lies in generate_orderedappend_paths(), which builds two
types of append paths - with minimal startup cost, and with minimal
total cost. That however does not work for queries with LIMIT, which
also need to consider at fractional cost, but the path interesting from
this perspective may be eliminated by other paths.
Consider three paths (this comes from the reproducer shared in [1]):
A: nestloop_path startup 0.585000 total 35708.292500
B: nestloop_path startup 0.292500 total 150004297.292500
C: mergejoin_path startup 9748.112737 total 14102.092737
With some reasonable LIMIT value (e.g. 5% of the data), we really want
to pick path A. But that path is dominated both in startup cost (by B)
and total cost (path C). Hence generate_orderedappend_paths() will
ignore it, and we'll generate a more expensive LIMIT plan.
In [2] Tom proposed to modify generate_orderedappend_paths() to also
consider the fractional cheapest_path, just like we do for startup and
total costs. The attached patch does exactly that, and it does seem to
do the trick.
There are some loose ends, though:
1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
clear whether to default to cheapest_startup or cheapest_total. We might
also consider an incremental sort, in which case the startup cost
matters (for Sort it does not). So we may need an enhanced version of
get_cheapest_fractional_path_for_pathkeys that generates such paths.
2) Same for the cheapest_total - maybe there's a partially sorted path,
and using it with incremental sort on top would be better than using
cheapest_total_path + sort.
3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry
about require_parallel_safe, just like the other functions nearby.
I'd argue that this patch does not need to add the Incremental Sort
capabilities in (1) and (2) - it's just another place where we decided
not to consider this sort variant yet.
I'm not sure how much this depends on partition-wise join, and why
disabling it generates the right plan. The reproducer uses that, but
AFAICS generate_orderedappend_paths() works like this since 2010 (commit
11cad29c915). I'd bet the issue exists since then and it's possible to
get similar cases even without partition-wise join.
I can reproduce it on PostgreSQL 12, though (which however supports
partition-wise join).
Not sure whether this should be backpatched. We didn't get any reports
until now, so it doesn't seem like a pressing issue. OTOH most users
won't actually notice this, they'll just get worse plans without
realizing there's a better option.
regards
[1]
https://www.postgresql.org/message-id/011937a3-7427-b99f-13f1-c07a127cf94c%40enterprisedb.com
[2] https://www.postgresql.org/message-id/4006636.1618577893%40sss.pgh.pa.us
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company