Re: Add a greedy join search algorithm to handle large join problems - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add a greedy join search algorithm to handle large join problems
Msg-id 6db6d2ec-7529-4add-9a95-178fc318311d@vondra.me
In response to Add a greedy join search algorithm to handle large join problems (Chengpeng Yan <chengpeng_yan@outlook.com>)
Responses Re: Add a greedy join search algorithm to handle large join problems
List pgsql-hackers
On 12/2/25 04:48, Chengpeng Yan wrote:
> Hi hackers,
> 
> This patch implements GOO (Greedy Operator Ordering), a greedy
> join-order search method for large join problems, based on Fegaras (DEXA
> ’98) [1]. The algorithm repeatedly selects, among all legal joins, the
> join pair with the lowest estimated total cost, merges them, and
> continues until a single join remains. Patch attached.
> 
> To get an initial sense of performance, I reused the star join /
> snowflake examples and the testing script from the thread in [2]. The
> star-join GUC in that SQL workload was replaced with
> `enable_goo_join_search`, so the same tests can run under DP (standard
> dynamic programming) / GEQO (Genetic Query Optimizer) / GOO. For these
> tests, geqo_threshold was set to 15 for DP, and to 5 for both GEQO and
> GOO. Other planner settings, including join_collapse_limit, remained at
> their defaults.
> 
> On my local machine, a single-client pgbench run produces the following
> throughput (tps):
> 
>                     |    DP    |   GEQO   |    GOO
> --------------------+----------+----------+-----------
> starjoin    (inner) |  1762.52 |  192.13  |  6168.89
> starjoin    (outer) |  1683.92 |  173.90  |  5626.56
> snowflake   (inner) |  1829.04 |  133.40  |  3929.57
> snowflake   (outer) |  1397.93 |   99.65  |  3040.52
> 

Seems interesting, and also much more ambitious than what I intended to
do in the starjoin thread (which is meant to be just a simple heuristic
on top of the regular join order planning).

I think a much broader evaluation will be needed, comparing not just the
planning time but also the quality of the final plan. For the starjoin
tests plan quality does not really matter, as the plans are all equal in
this regard.
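
To make the greedy loop from the quoted description concrete, here is a
rough Python sketch. It is not the patch's code: the function name
goo_join_order, the two input dictionaries and the toy cost model (cost =
sum of intermediate result sizes) are assumptions made up for
illustration, whereas the patch reportedly picks pairs by the planner's
estimated total cost.

```python
from itertools import combinations


def goo_join_order(rows, selectivity):
    """Greedy join ordering sketch (hypothetical helper, toy cost model).

    rows:        dict mapping relation name -> estimated row count
    selectivity: dict mapping frozenset({a, b}) -> selectivity of the join
                 clause between a and b (only joinable pairs are listed)
    Returns a tuple (plan, relations, rows, cost); plan is a nested tuple.
    """
    # Each subtree is (plan, set of base relations, est. rows, est. cost).
    trees = [(name, frozenset([name]), float(n), 0.0)
             for name, n in rows.items()]

    def join(left, right):
        lplan, lrels, lrows, lcost = left
        rplan, rrels, rrows, rcost = right
        sel, connected = 1.0, False
        for a in lrels:
            for b in rrels:
                s = selectivity.get(frozenset((a, b)))
                if s is not None:
                    sel *= s
                    connected = True
        if not connected:
            return None  # no join clause: not a legal join in this sketch
        out_rows = lrows * rrows * sel
        # Assumed cost model: input costs plus the size of the new result.
        return ((lplan, rplan), lrels | rrels, out_rows,
                lcost + rcost + out_rows)

    while len(trees) > 1:
        best = None
        # Consider every legal pair and keep the cheapest one.
        for i, j in combinations(range(len(trees)), 2):
            cand = join(trees[i], trees[j])
            if cand is not None and (best is None or cand[3] < best[0][3]):
                best = (cand, i, j)
        if best is None:
            raise ValueError("join graph is disconnected")
        cand, i, j = best
        # Merge the chosen pair into a single subtree and repeat.
        trees = [t for k, t in enumerate(trees) if k not in (i, j)] + [cand]
    return trees[0]


# Tiny star-shaped example with made-up numbers.
plan, rels, est_rows, est_cost = goo_join_order(
    {"fact": 1024, "d1": 8, "d2": 64},
    {frozenset(("fact", "d1")): 0.125, frozenset(("fact", "d2")): 0.0625},
)
print(plan, est_rows, est_cost)
```

Each step in this sketch looks at all pairs of remaining subtrees, so the
whole search does on the order of n^3 join evaluations for n relations.
The flip side is that an early greedy choice is never revisited, which is
why the plan quality needs to be measured and not just the planning time.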


regards

-- 
Tomas Vondra



