Re: Problems with hash join over nested loop - Mailing list pgsql-performance

From Jim Nasby
Subject Re: Problems with hash join over nested loop
Date
Msg-id 527015BC.3020405@enova.com
Whole thread Raw
In response to Re: Problems with hash join over nested loop  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Problems with hash join over nested loop
List pgsql-performance
On 10/29/13 1:20 PM, Tom Lane wrote:
> Jim Nasby <jnasby@enova.com> writes:
>> On 10/29/13 11:45 AM, Tom Lane wrote:
>>> Jim Nasby <jnasby@enova.com> writes:
>>>> I'm also wondering if it's time to raise those limits.
>
>>> Yeah, possibly.  The current default values were set on machines much
>>> smaller/slower than most current hardware.
>>>
>>> I think also that the collapse limits were invented mainly to keep people
>>> out of GEQO's clutches, but we've made some significant fixes in GEQO
>>> since then.  Maybe the real answer is to make the default collapse limits
>>> much higher, and lower geqo_threshold to whatever we think the threshold
>>> of pain is for applying the regular planner.
>
>> In my test case geqo does seem to do a good job. I'll see if I can get some data on how number of relations affects
planningtime... I don't get much of a warm fuzzy about lowering geqo... 
>
> Yeah, it's probably not that simple.  A trawl through the archives
> reminded me that we've discussed this quite a bit in the past already.
> The collapse limits are important for the regular planner not only to
> limit runtime but also to limit planner memory consumption; moreover,
> GEQO doesn't behave all that well either with very large join problems.
> These facts killed a proposal back in 2009 to remove the collapse limits
> altogether.  There was also some discussion in 2011, see thread here:
> http://www.postgresql.org/message-id/BANLkTin4ncKMg+bEixi1WB1RJPrZ5dVRgQ@mail.gmail.com
> but the general feeling seemed to be that we needed more planner
> infrastructure work first.  In particular it seems like the best way
> forward might require limiting subproblem size using something more
> sophisticated than just "number of relations".

Yeah, I saw one mention of 1GB... that's a bit disconcerting.

Is there a way to measure memory consumption during planning, short of something like strace? (I've got no dev tools
availableon our servers.) 
--
Jim Nasby, Lead Data Architect   (512) 569-9461


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Problems with hash join over nested loop
Next
From: Tom Lane
Date:
Subject: Re: Problems with hash join over nested loop