Re: RFC: array_agg() per SQL:200n - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: RFC: array_agg() per SQL:200n |
Date | |
Msg-id | 200803250258.m2P2wF909601@momjian.us Whole thread Raw |
In response to | RFC: array_agg() per SQL:200n (Neil Conway <neilc@samurai.com>) |
List | pgsql-hackers |
Add to TODO: * Add SQL-standard array_agg() and unnest() array functions http://archives.postgresql.org/pgsql-hackers/2008-01/msg01017.php --------------------------------------------------------------------------- Neil Conway wrote: > I recently noticed that SQL:200n[1] defines a new aggregate function, > array_agg(). The relevant portions of the spec are: > > p. 66: "If ARRAY_AGG is specified, then an array value with one element > formed from the <value expression> evaluated for each row that > qualifies." > > p. 556: <array aggregate function> ::= > ARRAY_AGG <left paren> > <value expression> [ ORDER BY <sort specification list> ] > <right paren> > > p. 564 discusses the required behavior. The result of array_agg() is an > array with one element per input value, sorted according to the optional > ORDER BY clause. NULL input values are included in the array, and the > result for an empty group is NULL, not an empty array. Note that per > page 66, I'd expect array values in the input to array_agg() not to be > flattened. > > I'd like to implement array_agg() for 8.4. In the past, we've talked > about moving the array_accum() example aggregate into the backend[2]. > Now that there's SQL-standard syntax, that's another reason to do it -- > I think this is clearly useful functionality. > > The previous discussion got tied up in how to expose the aggregate's > transition value to the type system. The problem is that the aggregate > wants to use a transition value that is not a SQL-level type, to allow > efficient array append operations. Various solutions were mooted about, > typically involving a pass-by-val pseudotype used to hold a pointer to > the C struct holding the transition state. > > AFAIK the conclusion reached by the previous thread was that to be type > safe, you'd need one distinct pseudotype per aggregate function, along > with some way to let the planner distinguish this class of pseudotypes > from other types (in order to apply the heuristic that functions like > these are likely to consume more memory). You could identify this class > by an additional column in pg_type, but I think we'd need a lot of > machinery to do this properly (e.g. to allow these types to be created > via SQL). I wonder if this isn't over-engineering: the simple approach > originally followed by Stephen Frost was to declare the transition value > as, say, int8, and just disallow the transition and final functions from > being called outside an aggregate context. AFAIK this would be safe, > although of course it is ugly. > > To parse the ORDER BY clause, we'd need to special-case array_agg() in > the grammar, which is a bit unfortunate. Implementation-wise, because > there is no way to lazily evaluate an array expression, I don't think > there's much to be gained by using the tuplesort infrastructure -- we'll > need to materialize the entire array into memory when the final function > is called anyway. Therefore, a simpler approach might be to just > accumulate inputs in the transition function as usual, and then qsort() > them in the final function. We could also have the planner arrange for > the sort to be skipped if it knows that the input to the aggregate will > be delivered in a compatible ordering. > > Comments welcome. > > -Neil > > [1] http://www.wiscorp.com/SQLStandards.html ; apparently SQL:200n is > likely to become SQL:2008 without further changes. > > [2] http://archives.postgresql.org/pgsql-hackers/2006-10/msg00362.php > http://archives.postgresql.org/pgsql-patches/2006-10/msg00059.php > http://archives.postgresql.org/pgsql-hackers/2006-10/msg00683.php > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
pgsql-hackers by date: