Re: Large Scale Aggregation (HashAgg Enhancement) - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Large Scale Aggregation (HashAgg Enhancement)
Date
Msg-id 1137525457.3180.272.camel@localhost.localdomain
Whole thread Raw
In response to Re: Large Scale Aggregation (HashAgg Enhancement)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Large Scale Aggregation (HashAgg Enhancement)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, 2006-01-16 at 12:36 -0500, Tom Lane wrote:

> The tricky part is to preserve the existing guarantee that tuples are
> merged into their aggregate in arrival order.  (This does not matter for
> the standard aggregates but it definitely does for custom aggregates,
> and there will be unhappy villagers appearing on our doorsteps if we
> break it.)  I think this can work correctly under the above sketch but
> it needs to be verified.  It might require different handling of the
> TODO files than what hashjoin does.

You almost had me there... but there isn't any "arrival order". The sort
that precedes an aggregation only sorts on the GROUP BY columns, not on
additional columns - so by the SQL standard there is not a guaranteed
ordering of the data into a aggregate. That is exactly what windowed
aggregates are for. (There isn't any way of specifying an ORDER BY yet
either).

The only way of doing this is by doing a derived tableselect a, sum(b) from (select a,b order by a,b);
but AFAICS this is not part of the standard??

It is highly likely that rows are clumped together, but there just isn't
any guarantee that is the case. Any update of any row would change the
arrival order. Should we support something that has worked by luck?

I've been looking into windowed aggregates; these will provide this
functionality should people require it. I don't see how we'd be able to
do windowed aggregates and hashAgg at the same time, so this seems less
relevant. 

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: Jaime Casanova
Date:
Subject: Re: Error working with Temporary Sequences in plpgsql in 8.1 (8.0 works fine)
Next
From: Tom Lane
Date:
Subject: Re: Large Scale Aggregation (HashAgg Enhancement)