> On 13 Feb 2026, at 19:14, lakshmi <lakshmigcdac@gmail.com> wrote:
>
> Hi all,
>
> I tested the latest GOO patch (v4) on a fresh build from the current PostgreSQL master. The patch applied cleanly, the server built without issues, and the regression tests passed except for the expected EXPLAIN output differences due to the new join-ordering behavior.
>
> As a quick sanity check, I compared DP, GEQO, and GOO on a small multi-join query:
>
> DP planning:   ~0.66 ms
> GEQO planning: ~2.28 ms
> GOO planning:  ~0.38 ms
>
> Execution times were similar across all three (~1.5–1.7 ms) with no correctness issues. Even on a small join, GEQO shows higher planning overhead, while GOO plans faster with comparable execution cost.
>
> I then evaluated scaling using synthetic 15-table and 20-table joins with EXPLAIN (ANALYZE, TIMING OFF). Times below are planning | execution:
>
> 15 tables
> DP:   ~22.9 ms | ~23.4 ms
> GEQO: ~46.7 ms | ~20.5 ms
> GOO:  ~1.8 ms  | ~22.4 ms
>
> 20 tables
> DP:   ~48.1 ms | ~30.5 ms
> GEQO: ~51.0 ms | ~26.7 ms
> GOO:  ~3.2 ms  | ~29.0 ms
>
> Planning time increases notably for DP and remains relatively high for GEQO, while GOO stays very low even at 20 joins, indicating substantially reduced planning overhead. Execution costs remain broadly comparable, with no obvious regressions from GOO in this synthetic workload.
>
> Although this uses a controlled synthetic join graph rather than JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of significantly cheaper planning than DP/GEQO while preserving similar plan quality.
>
> I plan to continue testing with more realistic workloads and will share further results if anything notable appears.
>
> Thanks for the interesting work.
>
> Regards,
> Lakshmi
Hi,
Thank you very much for testing v4 and sharing the results. I really appreciate the effort and the detailed feedback.
I also agree with Tomas’s point that we need better benchmark context to evaluate plan quality, not only planning time.
I’ve prepared a v5 refresh on top of v4, still split into two patches (v5-0001 and v5-0002). I also ran `make check-world` on current master with v5 applied, and it passes on my side.
Compared with v4:
[PATCH v5 1/2]
- keeps the base GOO join-search path focused on a single greedy signal (cost);
- fixes issues found during recent testing (mainly around candidate probing/cleanup and failure paths);
- improves stability/determinism in candidate selection (including tie handling);
- updates regression outputs accordingly.
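For readers following the thread, the base path's greedy idea can be sketched roughly as follows. This is an illustrative Python sketch only, not the patch's actual C code; `goo_order` and `join_cost` are hypothetical names. It repeatedly merges the pair of sub-plans whose join is estimated cheapest, with a deterministic tie-break so results are reproducible:

```python
def goo_order(rels, join_cost):
    """Greedy operator ordering sketch (illustrative, not the patch's API).

    rels: list of frozensets of base-relation names.
    join_cost(a, b): estimated cost of joining sub-plans a and b.
    Returns the sequence of (left, right) merges performed."""
    plans = list(rels)
    order = []
    while len(plans) > 1:
        best = None
        for i in range(len(plans)):
            for j in range(i + 1, len(plans)):
                c = join_cost(plans[i], plans[j])
                # Deterministic tie handling: prefer lower cost, then the
                # lexicographically smallest pair of relation sets.
                key = (c, tuple(sorted(plans[i])), tuple(sorted(plans[j])))
                if best is None or key < best[0]:
                    best = (key, i, j)
        _, i, j = best
        order.append((plans[i], plans[j]))
        merged = plans[i] | plans[j]
        plans = [p for k, p in enumerate(plans) if k not in (i, j)] + [merged]
    return order
```

With n sub-plans this does O(n) contraction steps of O(n^2) candidate probes each, which is why greedy planning stays cheap where DP blows up.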
[PATCH v5 2/2]
- extends `goo_greedy_strategy` and adds the `selectivity` heuristic suggested by Tomas;
- improves combined mode so multiple greedy signals are evaluated in a common framework, and the final plan is selected by lowest estimated `total_cost`;
- keeps strategy-layer changes isolated from the base path for easier comparison and review.
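The combined mode amounts to running each greedy signal independently and keeping whichever full plan costs least. A minimal sketch, again with illustrative names rather than the patch's actual identifiers:

```python
def combined_mode(rels, strategies, total_cost):
    """Combined-mode sketch (illustrative, not the patch's API).

    strategies: {name: callable(rels) -> plan}, one per greedy signal
                (e.g. "cost", "selectivity").
    total_cost(plan) -> float: estimated total cost of a finished plan.
    Returns (winning strategy name, winning plan)."""
    best_name, best_plan, best_cost = None, None, float("inf")
    for name in sorted(strategies):  # deterministic iteration order
        plan = strategies[name](rels)
        c = total_cost(plan)
        if c < best_cost:
            best_name, best_plan, best_cost = name, plan, c
    return best_name, best_plan
```

The point of the common framework is that each signal only has to produce a complete plan; the final arbitration is always the same lowest-`total_cost` comparison.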
My current next steps are:
1. Continue evaluating plan quality on more datasets/workloads. I’ve already collected several candidate tests: some are JOB-based variants, and others are synthetic workloads. Next, I plan to consolidate these into a unified test set (with reproducible setup/details), publish it, and run broader comparative evaluation.
2. Prototype a hybrid handoff approach: use greedy contraction first to reduce the join graph, then let DP optimize the reduced problem. The goal is a smoother transition around the threshold, avoiding abrupt plan-shape changes from a hard optimizer switch.
3. Explore more join-ordering improvements incrementally, including ideas from “Simplicity Done Right for Join Ordering” and related work.
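The hybrid handoff in step 2 can be sketched as a two-phase search. This is a simplified illustration under strong assumptions (every pair of sub-plans is joinable, cross products allowed, all names hypothetical), not a proposed implementation:

```python
from itertools import combinations

def hybrid_join_order(plans, join_cost, threshold=4):
    """Hybrid sketch: greedy contraction down to `threshold` sub-plans,
    then exhaustive bitmask DP over the reduced problem (illustrative)."""
    plans = list(plans)
    # Phase 1: greedy contraction, always merging the cheapest pair.
    while len(plans) > threshold:
        i, j = min(
            ((i, j) for i in range(len(plans)) for j in range(i + 1, len(plans))),
            key=lambda ij: join_cost(plans[ij[0]], plans[ij[1]]),
        )
        merged = plans[i] | plans[j]
        plans = [p for k, p in enumerate(plans) if k not in (i, j)] + [merged]
    # Phase 2: exhaustive DP over the (small) reduced set of sub-plans.
    n = len(plans)
    best = {1 << i: (0.0, plans[i]) for i in range(n)}  # mask -> (cost, rels)
    for size in range(2, n + 1):
        for idxs in combinations(range(n), size):
            mask = sum(1 << i for i in idxs)
            for sub in range(1, mask):  # enumerate proper sub-splits
                if sub & mask == sub and sub in best and (mask ^ sub) in best:
                    cl, left = best[sub]
                    cr, right = best[mask ^ sub]
                    c = cl + cr + join_cost(left, right)
                    if mask not in best or c < best[mask][0]:
                        best[mask] = (c, left | right)
    return best[(1 << n) - 1]
```

Because DP only ever sees at most `threshold` inputs, its cost is bounded regardless of the original join count, which is what would smooth the transition around the switch-over point.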
Thanks again for the careful testing and detailed feedback.