Re: from_collapse_limit vs. geqo_threshold - Mailing list pgsql-hackers

From Robert Haas
Subject Re: from_collapse_limit vs. geqo_threshold
Date
Msg-id 603c8f070905210513m29b1a916la3d0e53e8bf02f8d@mail.gmail.com
Whole thread Raw
In response to Re: from_collapse_limit vs. geqo_threshold  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: from_collapse_limit vs. geqo_threshold  (Greg Stark <stark@enterprisedb.com>)
Re: from_collapse_limit vs. geqo_threshold  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, May 21, 2009 at 7:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> Robert,
>>> It appears that this statement has been in our documentation since Tom
>>> Lane added FROM_COLLAPSE_LIMIT (back then, it was capitalized) on
>>> January 25, 2003 (9bf97ff426de9), but I can't find any justification
>>> for it anywhere.  I think we either need to justify this advice, or
>>> remove it.
>
>> ... trying to remember why I wrote that ... what would happen if
>> FROM_COLLAPSE_LIMIT was *more* than GEQO_THRESHOLD?
>
> I think I wrote it, not you.  The point of the advice is to keep
> subquery collapsation (hm, what's the right noun form?  Need caffeine)
> from turning a non-GEQO query into a GEQO one, and thus subjecting
> you to unpredictable plans.  Maybe the resulting plans would be better
> on average, or maybe they wouldn't, but in any case they'd be
> unpredictable.

That's more or less what I figured, but my real world experience is
that pulling up subqueries and using GEQO leads to plans that are
random but tolerable, whereas not pulling up subqueries leads to plans
that are almost uniformly bad.  Actually, it works OK if really would
have needed to materialize the entire subquery, but otherwise it
stinks.  My real unvarnished opinion on this topic is that
from_collapse_limit is a loaded foot-gun waiting to go off.  We might
as well have an option where if the number of tables in the query
exceeds a certain threshold, we'll just sequential-scan the table
rather than considering the use of indices.  That option would
actually be better, because everyone who read the documentation would
be absolutely certain that they wanted to turn that option OFF,
whereas the behavior of from_collapse_limit is sufficiently complex
that it isn't obvious that it's a terrible idea.

...Robert


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: from_collapse_limit vs. geqo_threshold
Next
From: Peter Eisentraut
Date:
Subject: Re: Compiler warning