Thread: Better way to write aggregates?
Hi, I more or less often come about the problem of aggregating a child table counting it's different states. The cleanest solution I've come up with so far is: BEGIN; CREATE TABLE parent ( id int not null, name text not null, UNIQUE(id) ); CREATE TABLE child ( name text not null, state int not null, parent int not null references parent(id) ); CREATE VIEW parent_childs AS SELECT c.parent, count(c.state) as childtotal, count(c.state) - count(nullif(c.state,1)) as childstate1, count(c.state) - count(nullif(c.state,2)) as childstate2, count(c.state) - count(nullif(c.state,3)) as childstate3 FROM child c GROUP BY parent; CREATE VIEW parent_view AS SELECT p.*, pc.* FROM parent p LEFT JOIN parent_childs pc ON (p.id = pc.parent); COMMIT; Is this the fastest way to build these aggregates (not considering tricks with triggers, etc)? The count(state) - count(nullif(...)) looks a bit clumsy. I also experimented with a pgsql function to sum these up, but considered it as not-so-nice and it also always forces a sequential scan on the data. Thanks for any advice, Jan
Jan, I write queries like this CREATE VIEW parent_childs AS SELECT c.parent, count(c.state) as childtotal, sum(case when c.state = 1 then 1 else 0 end) as childstate1, sum(case when c.state = 2 then 1 else 0 end) as childstate2, sum(case when c.state = 3 then 1 else 0 end) as childstate3 FROM child c GROUP BY parent; ---------- Original Message ----------- From: Jan Dittmer <jdi@l4x.org> To: pgsql-performance@postgresql.org Sent: Fri, 21 Apr 2006 10:37:10 +0200 Subject: [PERFORM] Better way to write aggregates? > Hi, > > I more or less often come about the problem of aggregating a > child table counting it's different states. The cleanest solution > I've come up with so far is: > > BEGIN; > CREATE TABLE parent ( > id int not null, > name text not null, > UNIQUE(id) > ); > > CREATE TABLE child ( > name text not null, > state int not null, > parent int not null references parent(id) > ); > > CREATE VIEW parent_childs AS > SELECT > c.parent, > count(c.state) as childtotal, > count(c.state) - count(nullif(c.state,1)) as childstate1, > count(c.state) - count(nullif(c.state,2)) as childstate2, > count(c.state) - count(nullif(c.state,3)) as childstate3 > FROM child c > GROUP BY parent; > > CREATE VIEW parent_view AS > SELECT p.*, > pc.* > FROM parent p > LEFT JOIN parent_childs pc ON (p.id = pc.parent); > COMMIT; > > Is this the fastest way to build these aggregates (not considering > tricks with triggers, etc)? The count(state) - count(nullif(...)) looks > a bit clumsy. > I also experimented with a pgsql function to sum these up, but considered > it as not-so-nice and it also always forces a sequential scan on the > data. > > Thanks for any advice, > > Jan > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend ------- End of Original Message -------
Jim Buttafuoco wrote: > Jan, > > I write queries like this > > CREATE VIEW parent_childs AS > SELECT > c.parent, > count(c.state) as childtotal, > sum(case when c.state = 1 then 1 else 0 end) as childstate1, > sum(case when c.state = 2 then 1 else 0 end) as childstate2, > sum(case when c.state = 3 then 1 else 0 end) as childstate3 > FROM child c > GROUP BY parent; It would help if booleans could be casted to integer 1/0 :-) But performance wise it should be about the same? I think I'll run some tests later today with real data. Would an index on NULLIF(state,1) help count(NULLIF(state,1)) ? Can one build an index on (case when c.state = 3 then 1 else 0 end)? Thanks, Jan
I don't think an index will help you with this query. ---------- Original Message ----------- From: Jan Dittmer <jdi@l4x.org> To: jim@contactbda.com Cc: pgsql-performance@postgresql.org Sent: Fri, 21 Apr 2006 14:35:33 +0200 Subject: Re: [PERFORM] Better way to write aggregates? > Jim Buttafuoco wrote: > > Jan, > > > > I write queries like this > > > > CREATE VIEW parent_childs AS > > SELECT > > c.parent, > > count(c.state) as childtotal, > > sum(case when c.state = 1 then 1 else 0 end) as childstate1, > > sum(case when c.state = 2 then 1 else 0 end) as childstate2, > > sum(case when c.state = 3 then 1 else 0 end) as childstate3 > > FROM child c > > GROUP BY parent; > > It would help if booleans could be casted to integer 1/0 :-) But > performance wise it should be about the same? I think I'll > run some tests later today with real data. > Would an index on NULLIF(state,1) help count(NULLIF(state,1)) ? > Can one build an index on (case when c.state = 3 then 1 else 0 end)? > > Thanks, > > Jan ------- End of Original Message -------
Jan Dittmer <jdi@l4x.org> writes: > It would help if booleans could be casted to integer 1/0 :-) As of 8.1 there is such a cast in the system: regression=# select 't'::bool::int; int4 ------ 1 (1 row) In earlier releases you can make your own. As for the original question, though: have you looked at the crosstab functions in contrib/tablefunc? regards, tom lane