Re: Problems with hash join over nested loop - Mailing list pgsql-performance

From Tom Lane
Subject Re: Problems with hash join over nested loop
Date
Msg-id 5575.1383070823@sss.pgh.pa.us
Whole thread Raw
In response to Re: Problems with hash join over nested loop  (Jim Nasby <jnasby@enova.com>)
Responses Re: Problems with hash join over nested loop
List pgsql-performance
Jim Nasby <jnasby@enova.com> writes:
> On 10/29/13 11:45 AM, Tom Lane wrote:
>> Jim Nasby <jnasby@enova.com> writes:
>>> I'm also wondering if it's time to raise those limits.

>> Yeah, possibly.  The current default values were set on machines much
>> smaller/slower than most current hardware.
>>
>> I think also that the collapse limits were invented mainly to keep people
>> out of GEQO's clutches, but we've made some significant fixes in GEQO
>> since then.  Maybe the real answer is to make the default collapse limits
>> much higher, and lower geqo_threshold to whatever we think the threshold
>> of pain is for applying the regular planner.

> In my test case geqo does seem to do a good job. I'll see if I can get some data on how number of relations affects
planningtime... I don't get much of a warm fuzzy about lowering geqo... 

Yeah, it's probably not that simple.  A trawl through the archives
reminded me that we've discussed this quite a bit in the past already.
The collapse limits are important for the regular planner not only to
limit runtime but also to limit planner memory consumption; moreover,
GEQO doesn't behave all that well either with very large join problems.
These facts killed a proposal back in 2009 to remove the collapse limits
altogether.  There was also some discussion in 2011, see thread here:
http://www.postgresql.org/message-id/BANLkTin4ncKMg+bEixi1WB1RJPrZ5dVRgQ@mail.gmail.com
but the general feeling seemed to be that we needed more planner
infrastructure work first.  In particular it seems like the best way
forward might require limiting subproblem size using something more
sophisticated than just "number of relations".

            regards, tom lane


pgsql-performance by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Problems with hash join over nested loop
Next
From: Jim Nasby
Date:
Subject: Re: Problems with hash join over nested loop