Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery)) - Mailing list pgsql-hackers

From	Yeb Havinga
Subject	Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))
Date	July 27, 2011 11:41:02
Msg-id	4E302357.90704@gmail.com
In response to	Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery)) (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))
List	pgsql-hackers

On 2011-07-27 16:16, Robert Haas wrote:
> On Tue, Jul 26, 2011 at 5:37 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Yeb Havinga<yebhavinga@gmail.com>  writes:
>>> A few days ago I read Tomas Vondra's blog post about dss tpc-h queries
>>> on PostgreSQL at
>>> http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
>>> which he showed how to manually pull up a dss subquery to get a large
>>> speed up. Initially I thought: cool, this is probably now handled by
>>> Hitoshi's patch, but it turns out the subquery type in the dss query is
>>> different.
>> Actually, I believe this example is the exact opposite of the
>> transformation Hitoshi proposes.  Tomas was manually replacing an
>> aggregated subquery by a reference to a grouped table, which can be
>> a win if the subquery would be executed enough times to amortize
>> calculation of the grouped table over all the groups (some of which
>> might never be demanded by the outer query).  Hitoshi was talking about
>> avoiding calculations of grouped-table elements that we don't need,
>> which would be a win in different cases.  Or at least that was the
>> thrust of his original proposal; I'm not sure where the patch went since
>> then.
>>
>> This leads me to think that we need to represent both cases as the same
>> sort of query and make a cost-based decision as to which way to go.
>> Thinking of it as a pull-up or push-down transformation is the wrong
>> approach because those sorts of transformations are done too early to
>> be able to use cost comparisons.
> I think you're right.  OTOH, our estimates of what will pop out of an
> aggregate are so poor that denying the user to control the plan on the
> basis of how they write the query might be a net negative.  :-(
>

Tom and Robert, thank you both for your replies. I think I have some
blind spots, and maybe false assumptions, about the overall work in the
optimizer, as it is not clear to me what 'the same sort of query' would
look like. I was under the impression that cost is used to select the
best paths only within a single simple query, and I fail to see how a
total combined plan with a pulled-up subquery could be compared on cost
with a total plan where the subquery is still a separate subplan, since
the range tables / simple queries being compared are different.

regards,
Yeb
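
For readers following the thread, the two query shapes being contrasted
can be sketched roughly as below. This is only an illustration under
stated assumptions: the lineitem table, its columns, and the sample rows
are hypothetical, and the sketch runs on SQLite through Python's sqlite3
module rather than on PostgreSQL. It shows that the correlated aggregate
subquery and the join against a grouped table return the same rows; it
says nothing about how either planner costs them.

```python
import sqlite3

# Hypothetical in-memory table; names and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lineitem (
        id       INTEGER PRIMARY KEY,
        partkey  INTEGER,
        quantity REAL
    );
    INSERT INTO lineitem (partkey, quantity) VALUES
        (1, 10), (1, 20), (1, 60),
        (2, 5),  (2, 5),
        (3, 1),  (3, 9);
""")

# Form 1: aggregate sublink, evaluated per outer row (a separate subplan).
CORRELATED = """
    SELECT l.id
    FROM lineitem l
    WHERE l.quantity < (SELECT avg(i.quantity)
                        FROM lineitem i
                        WHERE i.partkey = l.partkey)
    ORDER BY l.id
"""

# Form 2: the manual "pull up": aggregate every group once, then join.
GROUPED = """
    SELECT l.id
    FROM lineitem l
    JOIN (SELECT partkey, avg(quantity) AS avg_qty
          FROM lineitem
          GROUP BY partkey) g ON g.partkey = l.partkey
    WHERE l.quantity < g.avg_qty
    ORDER BY l.id
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
grouped_rows = conn.execute(GROUPED).fetchall()
print(correlated_rows == grouped_rows)
```

Form 2 computes the average for every partkey whether or not the outer
query needs it, while form 1 recomputes it for each outer row. Which one
is cheaper depends on how many groups the outer query actually touches,
which is the trade-off Tom describes above.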
