Re: Large Scale Aggregation (HashAgg Enhancement) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Large Scale Aggregation (HashAgg Enhancement)
Date 2006-01-17
Msg-id 28124.1137509530@sss.pgh.pa.us
In response to Re: Large Scale Aggregation (HashAgg Enhancement)  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> On Mon, 2006-01-16 at 20:02 -0500, Tom Lane wrote:
>> But our idea of the number of batches needed can change during that
>> process, resulting in some inner tuples being initially assigned to the
>> wrong temp file.  This would also be true for hashagg.

> So we correct that before we start reading the outer table.

Why?  That would require a useless additional pass over the data.  With
the current design, we can process and discard at least *some* of the
data in a temp file when we read it, but a reorganization pass would
mean that it *all* goes back out to disk a second time.
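
To make that concrete, here is a minimal sketch in plain C (purely
illustrative, not code from the PostgreSQL tree; the hash values, batch
counts, and batch_no() assignment rule are all made up) of reloading a
batch file after nbatch has grown: tuples that still belong to the
current batch are consumed immediately, and only the misassigned ones
are written back out, exactly once:

    /*
     * Purely illustrative, not PostgreSQL code.  Suppose nbatch grew
     * from 4 to 8 while batch 1's inner tuples were being reloaded.
     * Each tuple's batch number is recomputed from its saved hash
     * value: tuples still in batch 1 are processed now and never touch
     * disk again; the rest are spilled once to their new batch file.
     * A prior "reorganization pass" would instead rewrite every tuple.
     */
    #include <stdio.h>
    #include <stdint.h>

    #define NEW_NBATCH 8

    /* made-up batch assignment: low-order bits of the hash value */
    static int batch_no(uint32_t hashval, int nbatch)
    {
        return (int) (hashval % (uint32_t) nbatch);
    }

    int main(void)
    {
        /* hash values that were spilled to batch 1 when nbatch was 4 */
        uint32_t spilled[] = {1, 5, 9, 13, 17, 21, 25, 29};
        int      curbatch = 1;
        int      processed = 0, respilled = 0;

        for (size_t i = 0; i < sizeof(spilled) / sizeof(spilled[0]); i++)
        {
            if (batch_no(spilled[i], NEW_NBATCH) == curbatch)
                processed++;    /* consume immediately, in memory */
            else
                respilled++;    /* goes back to disk exactly once */
        }
        printf("processed now: %d, re-spilled once: %d\n",
               processed, respilled);
        return 0;
    }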

Also, you assume that we can accurately tell how many tuples will fit in
memory in advance of actually processing them --- a presumption clearly
false in the hashagg case, and not that easy to do even for hashjoin.
(You can tell the overall size of a temp file, sure, but how do you know
how it will split when the batch size changes?  A perfectly even split
is unlikely.)
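
To illustrate the uneven-split problem, here is a toy C program (again
purely hypothetical: hash_u32() is a stand-in hash function and the key
distribution is invented) that counts how one batch of a skewed dataset
divides when the batch count doubles from 4 to 8.  All duplicates of a
given key carry the same hash value and move as a block, so a batch
dominated by a few distinct keys cannot split evenly:

    /*
     * Purely illustrative, not PostgreSQL code.  We simulate 100000
     * tuples over only 10 distinct keys, take the ones that land in
     * batch 1 of 4, and count where they go when nbatch doubles to 8.
     * The split here comes out 2:1, not 1:1.
     */
    #include <stdio.h>
    #include <stdint.h>

    static uint32_t hash_u32(uint32_t x)
    {
        return x * 2654435761u;     /* Knuth multiplicative hash */
    }

    int main(void)
    {
        int stay = 0, move = 0;

        for (int i = 0; i < 100000; i++)
        {
            uint32_t h = hash_u32((uint32_t) (i % 10)); /* skewed keys */

            if (h % 4 == 1)     /* tuple was spilled to batch 1 of 4 */
            {
                if (h % 8 == 1)
                    stay++;     /* still batch 1 under nbatch = 8 */
                else
                    move++;     /* now belongs to batch 5 */
            }
        }
        printf("stay in batch 1: %d, move to batch 5: %d\n", stay, move);
        return 0;
    }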

> OK, I see what you mean. Sounds like we should have a new definition for
> Aggregates, "Sort Insensitive" that allows them to work when the input
> ordering does not affect the result, since that case can be optimised
> much better when using HashAgg.

Please don't propose pushing this problem onto the user until it's
demonstrated that there's no other way.  I don't want to become the
next Oracle, with forty zillion knobs that it takes a highly trained
DBA to deal with.

> But all of them sound ugly.

I was thinking along the lines of having multiple temp files per hash
bucket.  If you have a tuple that needs to migrate from bucket M to
bucket N, you know that it arrived before every tuple that was assigned
to bucket N originally, so put such tuples into a separate temp file
and process them before the main bucket-N temp file.  This might get a
little tricky to manage after multiple hash resizings, but in principle
it seems doable.
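
As a rough sketch of that scheme (hypothetical C using plain stdio temp
files; a real implementation would use the executor's own tuple-spilling
machinery), each bucket gets an "early" temp file alongside its main
one, and the early file is drained first when the bucket is processed:

    /*
     * Purely illustrative, not PostgreSQL code: two temp files per
     * bucket.  A tuple that migrates into bucket N from an earlier
     * bucket arrived before everything assigned to N originally, so it
     * goes into N's "early" file, which is read back before the main
     * file when bucket N is processed.
     */
    #include <stdio.h>

    typedef struct BucketFiles
    {
        FILE *early;    /* tuples migrated here from earlier buckets */
        FILE *main_f;   /* tuples assigned here on the first pass */
    } BucketFiles;

    static void spill(BucketFiles *b, int migrated, long tup)
    {
        fwrite(&tup, sizeof(tup), 1, migrated ? b->early : b->main_f);
    }

    static void process_bucket(BucketFiles *b)
    {
        long tup;

        rewind(b->early);           /* early tuples first... */
        while (fread(&tup, sizeof(tup), 1, b->early) == 1)
            printf("early tuple %ld\n", tup);

        rewind(b->main_f);          /* ...then the original ones */
        while (fread(&tup, sizeof(tup), 1, b->main_f) == 1)
            printf("main tuple %ld\n", tup);
    }

    int main(void)
    {
        BucketFiles b = {tmpfile(), tmpfile()};

        if (b.early == NULL || b.main_f == NULL)
            return 1;

        spill(&b, 0, 100);          /* assigned to this bucket up front */
        spill(&b, 1, 42);           /* migrated in during a resize */
        spill(&b, 0, 101);

        process_bucket(&b);         /* prints 42, then 100 and 101 */
        return 0;
    }
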
        regards, tom lane

