Re: Add a greedy join search algorithm to handle large join problems - Mailing list pgsql-hackers

From Chengpeng Yan
Subject Re: Add a greedy join search algorithm to handle large join problems
Msg-id 706B00F7-9B07-4130-8A34-1F473B5B6C54@Outlook.com
In response to Re: Add a greedy join search algorithm to handle large join problems  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
Hi,

Thanks for taking a look.

> On Dec 2, 2025, at 13:36, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Is pgbench the right workload to test this, I mean what are we trying
> to compare here the planning time taken by DP vs GEQO vs GOO or the
> quality of the plans generated by different join ordering algorithms
> or both?  All pgbench queries are single table scans and there is no
> involvement of the join search, so I am not sure how we can justify
> these gains?

Just to clarify: as noted in the cover mail, the numbers are not from
default pgbench queries, but from the star-join / snowflake workloads in
thread [1], using the benchmark included in the v5-0001 patch. These
workloads contain multi-table joins and do trigger join search; you can
reproduce them by configuring the GUCs as described in the cover mail.

The benchmark tables contain no data, so execution time is negligible;
the results mainly reflect planning time of the different join-ordering
methods, which is intentional for this microbenchmark.
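For illustration only (this is not the benchmark shipped in v5-0001), here is
a minimal sketch of how a planning-only star-join measurement on empty tables
could be set up. The table names (f, d1..dN) and the helper functions are
hypothetical. The only settings used are the stock join_collapse_limit,
from_collapse_limit and geqo GUCs; the patch's own GUCs are described in the
cover mail and are not reproduced here.

```python
from typing import List, Tuple


def star_join_sql(n_dims: int) -> Tuple[List[str], str]:
    """Build DDL and an EXPLAIN query for a star join (hypothetical schema).

    One fact table "f" references n_dims dimension tables "d1".."dN".
    The tables are never populated, so only planning cost matters.
    """
    if n_dims < 1:
        raise ValueError("need at least one dimension table")

    dims = range(1, n_dims + 1)
    ddl = [f"CREATE TABLE d{i} (id int PRIMARY KEY);" for i in dims]
    fk_cols = ", ".join(f"d{i}_id int" for i in dims)
    ddl.append(f"CREATE TABLE f (id int PRIMARY KEY, {fk_cols});")

    joins = "\n".join(f"  JOIN d{i} ON f.d{i}_id = d{i}.id" for i in dims)
    # EXPLAIN without ANALYZE plans the query but does not run it;
    # SUMMARY ON makes it print the "Planning Time" line.
    query = f"EXPLAIN (SUMMARY ON)\nSELECT *\nFROM f\n{joins};"
    return ddl, query


def session_settings(n_relations: int, use_geqo: bool) -> List[str]:
    """Stock PostgreSQL settings so all relations form one join problem."""
    return [
        f"SET join_collapse_limit = {n_relations};",
        f"SET from_collapse_limit = {n_relations};",
        f"SET geqo = {'on' if use_geqo else 'off'};",
    ]


if __name__ == "__main__":
    ddl, query = star_join_sql(15)
    # 15 dimensions + 1 fact table = 16 relations in the join search.
    for stmt in ddl + session_settings(16, use_geqo=False) + [query]:
        print(stmt)
```

The printed script can be fed to psql; comparing the reported "Planning Time"
across settings then isolates the join-search cost, since the empty tables
make execution irrelevant.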

A broader evaluation on TPC-H / TPC-DS / JOB, covering both planning
time and plan quality, is still on my TODO list. That should give a more
representative picture of GOO than this synthetic setup.

References:
[1] Star/snowflake join thread and benchmarks:
https://www.postgresql.org/message-id/a22ec6e0-92ae-43e7-85c1-587df2a65f51%40vondra.me

--
Best regards,
Chengpeng Yan



