Thread: order function in aggregate
Hi, I'm relatively new to object oriented databases and postgres (~ 4 months), so I'm not sure what is wrong with my custom aggregate function. I'm using Postgres 8.3.3 (results are same on Windows and Ubuntu versions). Here is my data required to explain my problem: ------------------------------------------ CREATE TABLE aggr_test (id serial NOT NULL,sub text,grp text,CONSTRAINT aggr_test_pkey PRIMARY KEY (id) ) WITH (OIDS=FALSE); INSERT INTO aggr_test (id, sub, grp) VALUES (1, 'one', 'pom pom'); INSERT INTO aggr_test (id, sub, grp) VALUES (2, 'two', 'pom pom'); INSERT INTO aggr_test (id, sub, grp) VALUES (3, 'three', 'pom pom'); INSERT INTO aggr_test (id, sub, grp) VALUES (4, 'two', 'la la'); INSERT INTO aggr_test (id, sub, grp) VALUES (5, 'three', 'la la'); INSERT INTO aggr_test (id, sub, grp) VALUES (6, 'one', 'la la'); CREATE OR REPLACE FUNCTION concat(text, text)RETURNS text AS $BODY$DECLAREt text; BEGINIF $1 ISNULL OR $2 ISNULL THEN t = COALESCE($1,$2);ELSIF character_length($1) > 0 THEN t = $1 ||', '|| $2;ELSE t= $2;END IF;RETURN t; END;$BODY$LANGUAGE 'plpgsql' IMMUTABLECOST 100; CREATE AGGREGATE commacat("text") (SFUNC=concat,STYPE=text,SORTOP="<" ); ------------------------------------------ Now, I would like to group a summary of the strings in "sub", but they must be ordered alphabetically. Here I attempt the SQL: select commacat(sub), grp from aggr_test group by grp; However, on my system it will output the "commacat" column as: "one, two, three";"pom pom" "two, three, one";"la la" (notice that the items in the first column are ordered differently, since the input values were entered in non-alphabetically) This is where I get confused, since in the aggregate function, I have specified `SORTOP="<"`, which according to the documentation (http://www.postgresql.org/docs/8.3/interactive/sql-createaggregate.html) "must be equivalent to": SELECT sub FROM aggr_test ORDER BY sub USING <; -- LIMIT 1; or SELECT sub FROM aggr_test ORDER BY sub USING >; -- LIMIT 1; (I've modified the example to show the desired effect of ASC and DESC sorting, respectively) So my question is: why is `SORTOP="<"` in my "commacat" aggregate function not working? Any suggestions? Is this a bug? Note: I am aware that I can achieve my goal using a subquery: SELECT commacat(sub), grp FROM (SELECT * FROM aggr_test ORDER BY grp, sub) AS foo GROUP BY grp; however, I'm looking to see if this ordering can done naturally within the aggregate function. Thanks in advance! +mt
Michael Toews wrote: > Now, I would like to group a summary of the strings in "sub", but they > must be ordered alphabetically. Here I attempt the SQL: > > select commacat(sub), grp from aggr_test group by grp; > > However, on my system it will output the "commacat" column as: > > "one, two, three";"pom pom" > "two, three, one";"la la" > > (notice that the items in the first column are ordered differently, > since the input values were entered in non-alphabetically) > > This is where I get confused, since in the aggregate function, I have > specified `SORTOP="<"`, which according to the documentation > (http://www.postgresql.org/docs/8.3/interactive/sql-createaggregate.html) > "must be equivalent to": > > SELECT sub FROM aggr_test ORDER BY sub USING <; -- LIMIT 1; That's for things like min()/max() where you only want the first/last value of some set. You want all of them. You could accumulate the values in an array and then sort that with the final-func that create aggregate supports. -- Richard Huxton Archonet Ltd
Richard Huxton <dev@archonet.com> writes: > Michael Toews wrote: >> This is where I get confused, since in the aggregate function, I have >> specified `SORTOP="<"`, which according to the documentation >> (http://www.postgresql.org/docs/8.3/interactive/sql-createaggregate.html) >> "must be equivalent to": >> >> SELECT sub FROM aggr_test ORDER BY sub USING <; -- LIMIT 1; > That's for things like min()/max() where you only want the first/last > value of some set. You want all of them. Right: sortop is an assertion that the aggregate behaves like min() or max(), not a request for the system to do something strange to the aggregate's input. > You could accumulate the values in an array and then sort that with the > final-func that create aggregate supports. The traditional way to get presorted input to an aggregate is select myagg(x) from (select whatever as x from ... order by ...) ss; You can avoid that with an internal sort in the aggregate, as Richard suggests, but it might not be worth the trouble --- and in any case it'd be hard to sort internally on anything except the exact values being aggregated. With the subselect approach you can order on anything at all. regards, tom lane
Richard Huxton wrote: > Michael Toews wrote: > > You could accumulate the values in an array and then sort that with > the final-func that create aggregate supports. Thanks for the help. Here was my final functions to aggregate things into a comma serpented text element (if anyone is interested): CREATE OR REPLACE FUNCTION commacat_fn(anyarray) RETURNS text AS $BODY$select array_to_string(sort($1),', ');$BODY$ LANGUAGE 'sql' IMMUTABLE STRICT COST 100; ALTER FUNCTION commacat_fn(anyarray) OWNER TO postgres; CREATE AGGREGATE commacat(anyelement) ( SFUNC=array_append, STYPE=anyarray, FINALFUNC=commacat_fn, INITCOND='{}' ); --------------- Lastly a random quick example: select attrelid, commacat(attname) as attnames from pg_attribute group by attrelid order by attrelid; Certainly there are far better examples that can be used to distill information in a table to a comma-separated list. In some specific applications, a "sort(myarraytype[])" function will need to be created if the data type in the aggregate column does not have a sort function (fortunately, most data types already have this function defined). Regards, +mt