Re: Add a greedy join search algorithm to handle large join problems - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add a greedy join search algorithm to handle large join problems
Date
Msg-id 6c6e1764-96d4-456b-82de-56e37e074976@vondra.me
Whole thread Raw
In response to Re: Add a greedy join search algorithm to handle large join problems  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
Hi,

Here's a more complete set of results from a TPC-DS run. See the
run-queries-2.sh script for more details. There are also .sql files with
DDL to create the database, etc. It does not include the parts to
generate the data etc. (you'll need to the generator from TPC site).

The attached CSV has results for scales 1 and 10, with 0 and 4 parallel
workers. It runs three configurations:

- master (geqo=off, threshold=12)
- master-geqo (goo=off, threshold=2)
- master-goo (goo=on, threshold=2)

There's a couple more fields, e.g. whether it's cold/cached run, etc.

A very simple summary of the results is the total duration of the run,
for all 99 queries combined:

  scale  workers     caching     master     master-geqo    master-goo
  ===================================================================
      1        0        cold        816             399          1124
                        warm        784             369          1097
               4        cold        797             384          1085
                        warm        774             366          1069
  -------------------------------------------------------------------
     10        0        cold       2760            2653          2340
                        warm       2580            2470          2177
               4        cold       2563            2423          1969
                        warm       2439            2325          1859

This is interesting, and also a bit funny.

The funny part is that geqo seems to do better than master - on scale 1
it's pretty clear, on scale 10 the difference is much smaller.

The interesting part is that "goo" is doing worse than master (or geqo)
on scale 1, and better on scale 10. I wonder how would it do on larger
scales, but I don't have such results.

There's a PDF with per-query results too.

This may be a little bit misleading because the statement timeout was
set to 300s, and there's a couple queries that did not complete before
this timeout. Maybe it'd be better to not include these queries. I
haven't tried, though.

It might be interesting to look at some of the queries that got worse,
and check why. Maybe that'd help you with picking the heuristics?

FWIW I still think no heuristics can be perfect, so getting slower plans
for some queries should not be treated as "hard failure". The other
thing is the quality of plans depends on GUCs like random_page_cost, and
I kept them at default values.

Anyway, I hope this is helpful input.


regards

-- 
Tomas Vondra

Attachment

pgsql-hackers by date:

Previous
From: "Greg Burd"
Date:
Subject: Re: greenfly lwlock corruption in REL_14_STABLE and REL_15_STABLE
Next
From: Pavel Stehule
Date:
Subject: Re: Add a greedy join search algorithm to handle large join problems