Re: BUG #15618: Duplicating a join clause breaks index use - Mailing list pgsql-bugs

From Alex
Subject Re: BUG #15618: Duplicating a join clause breaks index use
Date
Msg-id 602024371.2009661.1549122810083@mail.yahoo.com
Whole thread Raw
In response to Re: BUG #15618: Duplicating a join clause breaks index use  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Indeed, increasing those two parameter values results in a fast query.

With default values, the planning time is ~2ms but execution time is ~4600ms.
With increased values, the planning time increases by 0.5ms (2.5ms) but execution time drops to 0.7ms (aprox x6000 faster).

As the impact on the execution time is very high, shouldn't another mechanism be used instead of parameters join_collapse_limit or from_collapse_limit, such as some maximum planning time?
With a maximum planning time, faster machines will benefit from better query planning.

Another suggestion is to run a benchmark on the machine to automatically determine some good values, and attach hardware info to the results, to recompute in case hardware changes (although hdd performance may degrade with fragmentation and ssd with space occupied).
On Saturday, February 2, 2019, 5:08:15 PM GMT+2, Tom Lane <tgl@sss.pgh.pa.us> wrote:


Alex <cdalxndr@yahoo.com> writes:

> select
> ...
> from
>     product p
>     left outer join product_property pp on p.id=pp.product_id
>     left outer join product_property_value ppv on pp.value_id=ppv.id
>     left outer join description_element de on pp.description_source=de.id
>     left outer join description_value dev on de.description_value_id=dev.id
>     left outer join description_element dep on de.parent=dep.id
>     left outer join description_value depv on dep.description_value_id=depv.id
>     left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
>     left outer join shop_category sc on p.shop_category=sc.id
>     left outer join category c on sc.category=c.id
> where
>     c.id=8511


I suspect the real issue here has little to do with the "duplicate" nature
of the extra join, and a lot to do with the fact that your query exceeds
join_collapse_limit.  That causes the planner to break it up in a
semi-arbitrary way to limit planning time.  Adding one more join moves
where the break happens, and evidently is preventing the planner from
discovering a good join order.

You want join_collapse_limit to be at least 10 to handle this query well,
since 10 tables are mentioned.  The default value, 8, is kinda tuned for
older and slower machines than most of us use today.  I wouldn't make it
enormous, because it does prevent exponential planning time growth for
many tables, but you could likely get away with 15 or so.  You should
likely keep from_collapse_limit in sync with that too.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #15618: Duplicating a join clause breaks index use
Next
From: Evgeniy Efimkin
Date:
Subject: Re: BUG #15617: walsender hang if sync replica disconnected fromnetwork