Re: Why could GEQO produce plans with lower costs than thestandard_join_search? - Mailing list pgsql-hackers

From Donald Dong
Subject Re: Why could GEQO produce plans with lower costs than thestandard_join_search?
Date
Msg-id B026DB12-961E-4529-9F08-B9C68826D9E5@csumb.edu
Whole thread Raw
In response to Re: Why could GEQO produce plans with lower costs than the standard_join_search?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why could GEQO produce plans with lower costs than the standard_join_search?
List pgsql-hackers
On May 23, 2019, at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> Donald Dong <xdong@csumb.edu> writes:
>> On May 22, 2019, at 11:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> You're still asking us to answer hypothetical questions unsupported
>>> by evidence.  In what case does that really happen?
> 
>> I attached the query plan and debug_print_rel output for GEQO and
>> standard_join_search.
> 
>>             planstate->total_cost    cheapest_total_path
>> GEQO        54190.13                54239.03
>> STD            54179.02                54273.73
> 
>> Here I observe GEQO  produces a lower
>> cheapest_total_path->total_cost, but its planstate->total_cost is higher
>> than what standard_join_search produces.
> 
> Well,
> 
> (1) the plan selected by GEQO is in fact more expensive than
> the one found by the standard search.  Not by much --- as Andrew
> observes, this difference is less than what the planner considers
> "fuzzily the same" --- but nonetheless 54190.13 > 54179.02.
> 
> (2) the paths you show do not correspond to the finally selected
> plans --- they aren't even the same shape.  (The Gathers are in
> different places, to start with.)  I'm not sure where you were
> capturing the path data, but it looks like you missed top-level
> parallel-aggregation planning, and that managed to find some
> plans that were marginally cheaper than the ones you captured.
> Keep in mind that GEQO only considers join planning, not
> grouping/aggregation.
> 
> Andrew's point about fuzzy cost comparison is also a good one,
> though we needn't invoke it to explain these particular numbers.

Oh, that's very good to know! I captured the path at the end of the
join_search_hook. If I understood correctly, top-level
parallel-aggregation will be applied later, so GEQO is not taking it
into consideration during the join searching?

By looking at the captured costs, I thought GEQO found a better join
order than the standard_join_search. However, the final plan using
the join order produced by GEQO turns out to be more expansive. Would
that imply if GEQO sees a join order which is identical to the one
produced by standard_join_search, it will discard it since the
cheapest_total_path has a higher cost, though the final plan may be
cheaper?

Here is another query (JOB/27a.sql) which has more significant cost
differences:

        planstate->total_cost    cheapest_total_path
GEQO    343016.77            343016.75
STD        342179.13            344137.33

Regards,
Donald Dong



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: initdb recommendations
Next
From: Euler Taveira
Date:
Subject: Fix link for v12