On Tue, Mar 27, 2012 at 3:22 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> According to what I've learned in the last couple of months, array_agg
> is not ready for fallback ways like dumping to tuplestore. Even
> merge-state is not able for them. The problem is that the executor
> doesn't know how to serialize/deserialize the internal type trans
> value. So in one implementation, the existence of merge function is a
> flag to switch back to sort grouping not hash aggregate and array_agg
> is one of such aggregate functions. That said, if you invent a new
> flag to note the aggregate is not dump-ready, it'd be worth inventing
> state merge function to aggregate infrastructure anyway.
>
> So I can imagine a way without state-merge function nor dumping to
> tuplestore would be to sort hash table content the rest of inputs so
> that we can switch to sort grouping. Since we have hash table, we can
> definitely sort them in memory, and we can put to disk everything that
> comes later than the fallback and read it after the scan finishes. Now
> we have sorted state values and sorted input, we can continue the rest
> of work.
It's a little bit tricky to make this work - you have to get all of
the values out of the hash-table you've built and stick them into a
Tuplesort object - but I think it can be made to work, and it seems
more elegant than anything else proposed so far.
I also agree with you and with Greg Stark that it would be good to
invent a state-merge function. Although it wouldn't apply to every
case, it would make some very common cases a lot more efficient, both
in run time and in memory.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company