Thread: multiple group by on same table
Hi,

I'm going to need to GROUP BY the same table multiple times. That is, something like:

select (some aggregate functions here) from tableA group by f1, f2

select (some other aggregate functions here) from tableA group by f3, f4

etc.

The table is pretty large; can someone suggest the best way of doing it? Is running N queries at the same time (that is, using N connections with N threads in the client code) the only way to speed things up (so that the "concurrent scan" thing can help)? Or is it more likely that it won't help much, given that we have fairly good storage? Just trying to get some ideas before starting testing...

(The table will be 5M rows, and some of the GROUP BY selects could return 300-400K groups.)

Leonardo
On 05/04/2011 01:51 PM, Leonardo Francalanci wrote:
> I'm going to need to GROUP BY the same table multiple times. [...]
> Is running N queries at the same time (that is, using N connections
> with N threads in the client code) the only way to speed things up?

The fastest way of doing it would probably be to build materialized views for each reporting query and call it your data warehouse.

In any case, each result set will require its own query, so the only way to get them done in parallel is separate connections. If you use a single connection, the queries will have to be run serially (one after the other).

Sim
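[Editor's note] A sketch of Sim's suggestion, again using in-memory SQLite as a stand-in. PostgreSQL of that era had no built-in CREATE MATERIALIZED VIEW, so a "materialized view" here means a summary table you build (and refresh) yourself; tableA, f1, f2, and the data are made-up placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (f1 INT, f2 INT, v REAL)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [(i % 3, i % 5, float(i)) for i in range(1000)],
)

# "Materialized view": precompute the expensive GROUP BY once into a
# summary table; reporting queries then read the small summary table
# instead of rescanning the 5M-row base table every time.
conn.execute("""
    CREATE TABLE tableA_by_f1_f2 AS
    SELECT f1, f2, count(*) AS n, sum(v) AS total
    FROM tableA
    GROUP BY f1, f2
""")

rows = conn.execute(
    "SELECT n, total FROM tableA_by_f1_f2 WHERE f1 = 0 AND f2 = 0"
).fetchall()
```

The trade-off is staleness: the summary table has to be rebuilt (or incrementally maintained with triggers) whenever tableA changes.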
Ciao Leonardo,

I am not sure if this applies to your case, but (unless you have done it before) you could look at window functions (http://www.postgresql.org/docs/current/interactive/tutorial-window.html). They require PG 8.4+ though.

Cheers,
Gabriele

On Wed, 4 May 2011 11:51:08 +0100 (BST), Leonardo Francalanci <m_lists@yahoo.it> wrote:
> I'm going to need to GROUP BY the same table multiple times. [...]

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
Gabriele.Bartolini@2ndQuadrant.it - www.2ndQuadrant.it
> I am not sure if this could apply to your case, but maybe - unless
> you have done it before - you could look at windowing functions

Ciao Gabriele,

the problem is that the only thing the N queries have in common is the base table; everything else is different, because the different GROUP BYs aren't related to one another. My understanding is that window functions can't help in that case, but I'll look at them.

Thank you

Leonardo