Re: Add a greedy join search algorithm to handle large join problems - Mailing list pgsql-hackers

From Chengpeng Yan
Subject Re: Add a greedy join search algorithm to handle large join problems
Date
Msg-id A06F7CFA-C351-4D2C-B55D-827C48F46850@Outlook.com
In response to Re: Add a greedy join search algorithm to handle large join problems  (Chengpeng Yan <chengpeng_yan@Outlook.com>)
List pgsql-hackers

Recently, I have been testing the TPC-H SF=1 dataset using four simple
greedy join-ordering strategies: join cardinality (estimated output
rows), selectivity, estimated result size in bytes, and cheapest total
path cost. These can be roughly seen as either output-oriented
heuristics (rows / selectivity / result size), which try to optimize the
shape of intermediate results, or a cost-oriented heuristic, which
prefers the locally cheapest join step.
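To make the setup concrete, here is a small illustrative sketch (not the patch's C implementation) of a GOO-style greedy loop with a pluggable ranking key: repeatedly merge the pair of relations that ranks best under the chosen key until one relation remains. The relation names, cardinalities, and selectivities are hypothetical, and missing join predicates are treated as cross products with selectivity 1.0.

```python
from itertools import combinations

def goo(relations, selectivity, key):
    """relations: dict name -> estimated rows.
    selectivity: dict frozenset({a, b}) -> join selectivity.
    key: ranking function over a candidate join; smaller is better."""
    rels = dict(relations)
    order = []
    while len(rels) > 1:
        # Greedy step: rank every joinable pair and commit to the best one.
        best = min(combinations(rels, 2),
                   key=lambda pair: key(rels[pair[0]], rels[pair[1]],
                                        selectivity.get(frozenset(pair), 1.0)))
        a, b = best
        sel = selectivity.get(frozenset(best), 1.0)
        # The merged relation's cardinality feeds the next greedy step.
        rows = rels[a] * rels[b] * sel
        order.append((a, b))
        del rels[a], rels[b]
        rels[f"({a} JOIN {b})"] = rows
    return order

# Two of the output-oriented ranking keys discussed above:
rows_key = lambda la, ra, sel: la * ra * sel  # estimated output rows
sel_key = lambda la, ra, sel: sel             # join selectivity

order = goo({"a": 1000, "b": 50, "c": 20},
            {frozenset({"a", "b"}): 0.01},
            rows_key)
```

A cost-oriented key would instead rank candidates by the cheapest total path cost of the join step, which is harder to mimic outside the planner; the sketch only shows where such a key would plug in.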

The main goal of these experiments is to check whether the current
greedy rules show obvious structural weaknesses, and to use the observed
behavior as input for thinking about how a greedy rule might evolve.
While there is unlikely to be a perfect greedy strategy, I am hoping to
identify approaches that behave reasonably well across many cases and
avoid clear pathological behavior.

In the attached files, v3-0001 is identical to the previously submitted
v2-0001 patch and contains the core implementation of the GOO algorithm.
The v3-0002 patch adds testing-only code to evaluate different greedy
rules, including a GUC (goo_greedy_strategy) used only for switching
strategies during experiments.

All tests were performed on the TPC-H SF=1 dataset. After loading the
data, I ran the following commands before executing the benchmarks:

```
VACUUM FREEZE ANALYZE;
CHECKPOINT;

ALTER SYSTEM SET join_collapse_limit = 100;
ALTER SYSTEM SET max_parallel_workers_per_gather = 0;
ALTER SYSTEM SET statement_timeout = 600000;
ALTER SYSTEM SET shared_buffers = '4GB';
ALTER SYSTEM SET effective_cache_size = '8GB';
ALTER SYSTEM SET work_mem = '1GB';
SELECT pg_reload_conf();
```

The detailed benchmark results are summarized in tpch.pdf. Execution
times are reported as ratios, using the DP-based optimizer’s execution
time as the baseline (1.0).

The compressed archive tpch_tests_result.zip contains summary.csv, which
is the raw data used to generate tpch.pdf and was produced by the
run_job.sh script. It also includes files (xxx_plan.txt), which were
generated by the run_analysis.sh script and record the EXPLAIN ANALYZE
outputs for the same query under different join-ordering algorithms, to
make plan differences easier to compare.

Based on the TPC-H results, my high-level observations are:
* The three output-oriented greedy rules (rows, selectivity, result size)
show noticeable regressions compared to DP overall, with a relatively
large number of outliers.
* Using total path cost as the greedy key produces results that are
generally closer to DP, but still shows some clear outliers.

To understand why these regressions occur, I mainly looked at Q20 and
Q7, which show particularly large and consistent regressions and expose
different failure modes.

In Q20, there is a join between partsupp and an aggregated lineitem
subquery. For this join, the planner’s rowcount estimate is wrong by
orders of magnitude (tens of rows estimated versus hundreds of thousands
actually produced). As a result, output-oriented greedy rules strongly
prefer this join very early, because it appears to shrink its inputs
dramatically. In reality, it processes large inputs and produces a large
intermediate, and this early misordering can significantly amplify
downstream join costs. This makes Q20 a clear outlier for output-based
greedy rules when estimates are severely wrong.
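The Q20 failure mode can be reduced to a few lines. The candidate joins and their cardinalities below are hypothetical stand-ins (only the relation names and the "tens estimated versus hundreds of thousands actual" gap come from the observation above): a rows-oriented key ranks candidates purely by the estimated output, so a join underestimated by orders of magnitude wins by a wide margin.

```python
joins = {
    # name: (estimated output rows, actual output rows) -- illustrative only
    "partsupp JOIN agg_lineitem": (40, 400_000),  # badly underestimated
    "partsupp JOIN part": (80_000, 80_000),
    "supplier JOIN nation": (400, 400),
}

# Rows-oriented greedy key: schedule the join with the smallest
# *estimated* output first; the misestimated join looks most attractive.
pick = min(joins, key=lambda j: joins[j][0])
est, actual = joins[pick]
```

The greedy decision is irreversible, so once the misestimated join is placed first, the oversized intermediate it produces is carried through every subsequent join.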

Q7 exposes a different issue. The cost-based greedy rule tends to choose
a locally cheap join early, but that join creates an intermediate which
later joins become much more expensive to process. In this case, an
early commitment under relatively weak constraints leads to a
many-to-many intermediate that is only filtered after fact-table joins
are applied. This illustrates how a purely cost-driven greedy rule can
make locally reasonable decisions that turn out to be globally harmful.

Taken together, these outliers suggest that all four single-metric
greedy rules tested so far have structural limitations. Output-oriented
rules appear fragile when join rowcount estimates are badly wrong, while
cost-oriented greedy decisions can still lead to locally reasonable but
globally poor plans.

One question this naturally raises is whether making irreversible greedy
choices based only on a local ranking signal is sufficient, or whether
some mechanism is needed to make the approach more robust and to limit
the impact of such outliers.

As a next step, based on the current results, I plan to ignore
selectivity (which performs poorly in many cases), treat rows as largely
redundant with result_size, and move on to testing on the JOB benchmark.
I also plan to compare the behavior of DP, GEQO, and GOO on JOB, and to
use those results to better understand which signals are most useful for
guiding greedy decisions.

I would be very interested in hearing people’s thoughts on these
observations and on possible directions to explore next.


--
Best regards,
Chengpeng Yan
