Re: agg/order-by question - Mailing list pgsql-hackers

From Sailesh Krishnamurthy
Subject Re: agg/order-by question
Date
Msg-id bxyoezzsdsn.fsf@datafix.cs.berkeley.edu
Whole thread Raw
In response to Re: agg/order-by question  (Bruno Wolff III <bruno@wolff.to>)
List pgsql-hackers
>>>>> "Bruno" == Bruno Wolff, <Bruno> writes:
   Bruno> You might try this in 7.4. I am pretty sure a change was   Bruno> made a couple of weeks ago to let group by
workwith either   Bruno> sort order. Also hash aggragates have been available for   Bruno> quite a while in 7.4. This
isa better plan when there are   Bruno> only a small number of distinct values.
 

Gotcha ! Thanks. 

TelegraphCQ is still on the 7.3.2 code base .. after doing one hellish
merge in March, I'm not too eager to do another, although merging more
often is likely going to be less painful.

I knew about the hash-aggregates - we had set spilling of
hash-aggregates to disk for large number of distinct values (with a
crude form of recursive partitioning) as a course project for our
undergraduate database class at Berkeley. When I get some time, I want
to clean up my solution code and contribute it as a patch. I don't
think that will be before the end of summer though.

BTW, some systems prefer sorted grouped-aggregates to hashed
grouped-aggregates - even for small distinct values. How it works is
to just update the running aggregates in place in the sort tournament
tree. The only requirement is to be able to compute aggregates of
aggregates, so that partial aggs for the same distinct values across
different runs can be merged. The advantage is that you get sorted
grouped aggregation for the same cost of unsorted hash-grouped
agg. The disadvantage is that you lose the modularity of the sort.


-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh




pgsql-hackers by date:

Previous
From: ivan
Date:
Subject: new src :>
Next
From: Tom Lane
Date:
Subject: Re: [GENERAL] PG crash on simple query, story continues