Thread: Sorting aggregate column contents

Sorting aggregate column contents

From

Everton Luís Berz

Date:

02 May 2006, 17:58:24

Is it possible to sort the content of an aggregate text column?

Query:
select s.name, ag_concat(c.name) from state s
inner join city c on (c.idstate = s.idstate)
group by s.name
order by s.name;

Result: name  | ag_concat
-------+--------------------------- RS    | Porto Alegre, Gramado SP    | Osasco
(2 rows)

Expected result: name  | ag_concat
-------+--------------------------- RS    | Gramado, Porto Alegre SP    | Osasco
(2 rows)

I tried "order by s.name, c.name" but it causes a error:
ERROR:  column "c.name" must appear in the GROUP BY clause or be used in 
an aggregate


My function and aggregate code:
CREATE FUNCTION f_concat (text, text) RETURNS text AS $$
DECLARE  t text;
BEGIN  IF character_length($1) > 0 THEN    t = $1 || ', ' || $2;  ELSE    t = $2;  END IF;  RETURN t;
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE ag_concat (    sfunc = f_concat,    basetype = text,    stype = text,    initcond = ''
);


-- 
Everton

Re: Sorting aggregate column contents

From

Volkan YAZICI

Date:

02 May 2006, 18:16:18

On May 02 06:00, Everton Luís Berz wrote:
> Is it possible to sort the content of an aggregate text column?
> 
> Query:
> select s.name, ag_concat(c.name) from state s
> inner join city c on (c.idstate = s.idstate)
> group by s.name
> order by s.name;

IMHO, you can receive results ordered by using a subselect:

SELECT T.s_name, ag_concat(T.c_name) FROM (SELECT s.name, c.name         FROM state AS s         INNER JOIN city AS c
ON(c.idstate = s.idstate)         ORDER BY s.name, c.name) AS T (s_name, c_name) GROUP BY T.s_name;
 


Regards.

Re: Sorting aggregate column contents

From

Bruno Wolff III

Date:

02 May 2006, 21:15:36

On Wed, May 03, 2006 at 00:13:40 +0300, Volkan YAZICI <yazicivo@ttnet.net.tr> wrote:
> On May 02 06:00, Everton Luís Berz wrote:
> > Is it possible to sort the content of an aggregate text column?
> > 
> > Query:
> > select s.name, ag_concat(c.name) from state s
> > inner join city c on (c.idstate = s.idstate)
> > group by s.name
> > order by s.name;
> 
> IMHO, you can receive results ordered by using a subselect:
> 
> SELECT T.s_name, ag_concat(T.c_name)
>   FROM (SELECT s.name, c.name
>           FROM state AS s
>           INNER JOIN city AS c ON (c.idstate = s.idstate)
>           ORDER BY s.name, c.name) AS T (s_name, c_name)
>   GROUP BY T.s_name;

Note that this is nonstandard, but is an intentional (but I am not sure if
it's documented) feature of Postgres. There can be some similar situations
where you need to use OFFSET 0 to prevent optimizations that will break the
ordering.

Re: Sorting aggregate column contents

From

Everton Luís Berz

Date:

02 May 2006, 21:24:42

It works fine. But I wouldn't like using subselect's, then if somebody elseknows about an operator or something like
thatto put on the 
 
aggregator, please tell me.


Volkan YAZICI escreveu:
> On May 02 06:00, Everton Luís Berz wrote:
>   
>> Is it possible to sort the content of an aggregate text column?
>>
>> Query:
>> select s.name, ag_concat(c.name) from state s
>> inner join city c on (c.idstate = s.idstate)
>> group by s.name
>> order by s.name;
>>     
>
> IMHO, you can receive results ordered by using a subselect:
>
> SELECT T.s_name, ag_concat(T.c_name)
>   FROM (SELECT s.name, c.name
>           FROM state AS s
>           INNER JOIN city AS c ON (c.idstate = s.idstate)
>           ORDER BY s.name, c.name) AS T (s_name, c_name)
>   GROUP BY T.s_name;
>
>
> Regards.
>
>

Re: Sorting aggregate column contents

From

"Ben K."

Date:

03 May 2006, 01:18:12

> It works fine. But I wouldn't like using subselect's, then if somebody else
> knows about an operator or something like that to put on the aggregator, 
> please tell me.


I think the nature of the f_concat makes it difficult to sort, since it 
simply adds the next value, so if the source table gives value in the 
order of 'a','c','d','b' there's no way to handle them within f_concat 
unless you modify and rearrange the previous result string from within 
f_concat.


So the source table (city) should be sorted. I don't know if this is a 
standard way, but this one seems to do that.


======================================================
select s.name, ag_concat(c.name) from state s inner join (select * from 
city order by name desc) as c on c.idstate=s.idstate group by s.name order by 1;

OR

select s.name, ag_concat(c.name) from state s, (select * from city order 
by name desc) as c where c.idstate = s.idstate group by s.name order by 1;
======================================================


I'm just reordering the source table on the fly. Curiously, if you don't 
have 'desc' you'll get a reverse ordered list. (z,...,a)

I think your needs may also be met without any aggregator as well (there 
may be marginal cases which I haven't thought of, but I assume they can be 
handled if needed)

======================================================
select s.name, array_to_string(array(select name from city where 
idstate = s.idstate order by name),',') from state s;
======================================================
  name |   array_to_string
------+---------------------  RP   | Gramado,Port Alegre  SP   | Osasco


* I see normalization issue here but guess it's not important.



Regards,

Ben K.
Developer
http://benix.tamu.edu

Re: Sorting aggregate column contents

From

Everton Luís Berz

Date:

03 May 2006, 12:04:18

Thanks a lot the explanation.

I tested all cases and I noticed that reordering the source table (city) 
not works on all cases, so I think Postgresql perform different internal 
sort to optimize some query's.
I noticed this in other query I performed:
    select s.ano,           s.semestre,           dhc.iddisciplinahorariocurriculo,           count(*),
ag_concatenar_com_quebra_de_linha(td.turno)AS turno      from disciplinahorariocurriculo dhc
 
inner join horariocurriculo hc on (hc.idhorariocurriculo = 
dhc.idhorariocurriculo)
inner join semestre s on (s.idsemestre = hc.idsemestre)
inner join (select tdinterno.iddisciplinahorariocurriculo, t.turno from 
turnodisciplina tdinterno        inner join turno t on (t.idturno = tdinterno.idturno)          order by
tdinterno.iddisciplinahorariocurriculo,t.turno) as 
 
td on (td.iddisciplinahorariocurriculo = dhc.iddisciplinahorariocurriculo)
--     where dhc.iddisciplinahorariocurriculo = 8282  group by 1, 2, 3    having count(*) > 1  order by 1, 2, 3;
 ano  | semestre | iddisciplinahorariocurriculo | count |   turno
... 2004 |        2 |                         8282 |     3 | 23, 63, 43
            ^ ^ ^
 
...

If I remove the comment in the 'where' line there is the right result: ano  | semestre | iddisciplinahorariocurriculo |
count|   turno
 
------+----------+------------------------------+-------+------------ 2004 |        2 |                         8282 |
  3 | 23, 43, 63                                                              ^ ^ ^
 
(1 row)


I didn't know the array_to_string way, I think I will use it. It's safe 
and easy.

Regards,
--
Everton


Ben K. escreveu:
>> It works fine. But I wouldn't like using subselect's, then if somebody 
>> else
>> knows about an operator or something like that to put on the 
>> aggregator, please tell me.
> 
> 
> 
> I think the nature of the f_concat makes it difficult to sort, since it 
> simply adds the next value, so if the source table gives value in the 
> order of 'a','c','d','b' there's no way to handle them within f_concat 
> unless you modify and rearrange the previous result string from within 
> f_concat.
> 
> 
> So the source table (city) should be sorted. I don't know if this is a 
> standard way, but this one seems to do that.
> 
> 
> ======================================================
> select s.name, ag_concat(c.name) from state s inner join (select * from 
> city order by name desc) as c on c.idstate=s.idstate group by s.name 
> order by 1;
> 
> OR
> 
> select s.name, ag_concat(c.name) from state s, (select * from city order 
> by name desc) as c where c.idstate = s.idstate group by s.name order by 1;
> ======================================================
> 
> 
> I'm just reordering the source table on the fly. Curiously, if you don't 
> have 'desc' you'll get a reverse ordered list. (z,...,a)
> 
> I think your needs may also be met without any aggregator as well (there 
> may be marginal cases which I haven't thought of, but I assume they can 
> be handled if needed)
> 
> ======================================================
> select s.name, array_to_string(array(select name from city where idstate 
> = s.idstate order by name),',') from state s;
> ======================================================
> 
>   name |   array_to_string
> ------+---------------------
>   RP   | Gramado,Port Alegre
>   SP   | Osasco
> 
> 
> * I see normalization issue here but guess it's not important.
> 
> 
> 
> Regards,
> 
> Ben K.
> Developer
> http://benix.tamu.edu
>

Re: Sorting aggregate column contents

From

"Everton Luís Berz"

Date:

03 May 2006, 14:21:53

People from brazilian postgresql list sent me another way to sort the
column contents. The way is the aggregate accumulate values and after
run an array sort function. It worked fine. I think the subquery in
function f_select_array does not decrease performance. Follow the
code:

--from http://archives.postgresql.org/pgsql-general/2005-12/msg01093.php
CREATE FUNCTION f_select_array(anyarray) RETURNS SETOF anyelement AS $$
BEGIN  FOR i IN array_lower($1, 1) .. array_upper($1, 1) LOOP      RETURN NEXT $1[i];  END LOOP;  RETURN;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE FUNCTION f_sort_array(anyarray) RETURNS anyarray AS $$
SELECT array(SELECT * FROM f_select_array($1) ORDER BY 1)
$$ LANGUAGE sql IMMUTABLE STRICT;


-- from http://www.postgresql.org/docs/8.1/interactive/xaggr.html
CREATE AGGREGATE ag_accum (  sfunc = array_append,  basetype = anyelement,  stype = anyarray,  initcond = '{}'
);


select e.sigla, array_to_string(f_sort_array(ag_accum(m.nome)), ', ')
AS municipio from estado e
inner join municipio m on (m.idestado = e.idestado)
group by e.sigla
order by e.sigla;
/* result */
sigla |       municipio
-------+-----------------------
RS    | Gramado, Porto Alegre
SP    | Osasco
(2 rows)

--
Everton

On 5/3/06, Everton Luís Berz <everton.berz@gmail.com> wrote:
> Thanks a lot the explanation.
>
> I tested all cases and I noticed that reordering the source table (city)
> not works on all cases, so I think Postgresql perform different internal
> sort to optimize some query's.
> I noticed this in other query I performed:
>
>      select s.ano,
>             s.semestre,
>             dhc.iddisciplinahorariocurriculo,
>             count(*),
>             ag_concatenar_com_quebra_de_linha(td.turno) AS turno
>        from disciplinahorariocurriculo dhc
> inner join horariocurriculo hc on (hc.idhorariocurriculo =
> dhc.idhorariocurriculo)
> inner join semestre s on (s.idsemestre = hc.idsemestre)
> inner join (select tdinterno.iddisciplinahorariocurriculo, t.turno from
> turnodisciplina tdinterno
>          inner join turno t on (t.idturno = tdinterno.idturno)
>            order by tdinterno.iddisciplinahorariocurriculo, t.turno) as
> td on (td.iddisciplinahorariocurriculo = dhc.iddisciplinahorariocurriculo)
> --     where dhc.iddisciplinahorariocurriculo = 8282
>    group by 1, 2, 3
>      having count(*) > 1
>    order by 1, 2, 3;
>
>   ano  | semestre | iddisciplinahorariocurriculo | count |   turno
> ...
>   2004 |        2 |                         8282 |     3 | 23, 63, 43
>                                                                ^ ^ ^
> ...
>
> If I remove the comment in the 'where' line there is the right result:
>   ano  | semestre | iddisciplinahorariocurriculo | count |   turno
> ------+----------+------------------------------+-------+------------
>   2004 |        2 |                         8282 |     3 | 23, 43, 63
>                                                                ^ ^ ^
> (1 row)
>
>
> I didn't know the array_to_string way, I think I will use it. It's safe
> and easy.
>
> Regards,
> --
> Everton
>
>
> Ben K. escreveu:
> >> It works fine. But I wouldn't like using subselect's, then if somebody
> >> else
> >> knows about an operator or something like that to put on the
> >> aggregator, please tell me.
> >
> >
> >
> > I think the nature of the f_concat makes it difficult to sort, since it
> > simply adds the next value, so if the source table gives value in the
> > order of 'a','c','d','b' there's no way to handle them within f_concat
> > unless you modify and rearrange the previous result string from within
> > f_concat.
> >
> >
> > So the source table (city) should be sorted. I don't know if this is a
> > standard way, but this one seems to do that.
> >
> >
> > ======================================================
> > select s.name, ag_concat(c.name) from state s inner join (select * from
> > city order by name desc) as c on c.idstate=s.idstate group by s.name
> > order by 1;
> >
> > OR
> >
> > select s.name, ag_concat(c.name) from state s, (select * from city order
> > by name desc) as c where c.idstate = s.idstate group by s.name order by 1;
> > ======================================================
> >
> >
> > I'm just reordering the source table on the fly. Curiously, if you don't
> > have 'desc' you'll get a reverse ordered list. (z,...,a)
> >
> > I think your needs may also be met without any aggregator as well (there
> > may be marginal cases which I haven't thought of, but I assume they can
> > be handled if needed)
> >
> > ======================================================
> > select s.name, array_to_string(array(select name from city where idstate
> > = s.idstate order by name),',') from state s;
> > ======================================================
> >
> >   name |   array_to_string
> > ------+---------------------
> >   RP   | Gramado,Port Alegre
> >   SP   | Osasco
> >
> >
> > * I see normalization issue here but guess it's not important.
> >
> >
> >
> > Regards,
> >
> > Ben K.
> > Developer
> > http://benix.tamu.edu
> >
>


--
Everton