Thread: multiple group by on same table

multiple group by on same table

From
Leonardo Francalanci
Date:
Hi,


I'm going to need to GROUP BY the same table
multiple times. That is, something like:

select (some aggregate functions here) from
tableA group by f1, f2

select (some other aggregate functions here) from
tableA group by f3, f4

etc

The table is pretty large; can someone suggest the
best way of doing it? Is running N queries at the same
time (that is, using N connections with N threads in
the client code) the only way to speed up things (so
that the "concurrent scan" thing can help)? Or it's
more likely that it won't help that much, given that
we have a fairly good storage? Just trying to get some
ideas before starting testing....

(table will be 5M rows, where some of the group by
select could return 3-400K groups)

Leonardo

Re: multiple group by on same table

From
Sim Zacks
Date:
On 05/04/2011 01:51 PM, Leonardo Francalanci wrote:

> Hi,
>
>
> I'm going to need to GROUP BY the same table
> multiple times. That is, something like:
>
> select (some aggregate functions here) from
> tableA group by f1, f2
>
> select (some other aggregate functions here) from
> tableA group by f3, f4
>
> etc
>
> The table is pretty large; can someone suggest the
> best way of doing it? Is running N queries at the same
> time (that is, using N connections with N threads in
> the client code) the only way to speed up things (so
> that the "concurrent scan" thing can help)? Or it's
> more likely that it won't help that much, given that
> we have a fairly good storage? Just trying to get some
> ideas before starting testing....
>
> (table will be 5M rows, where some of the group by
> select could return 3-400K groups)
>
> Leonard

The fastest way of doing it would probably be to build materialized
views for each reporting query and call it your data warehouse.
In any case, each resultset will require its own query, so the only way
to get it done in parallel is separate connections. If you use a single
connection, the queries will have to be run serial (one after the other).

Sim

Re: multiple group by on same table

From
Gabriele Bartolini
Date:
 Ciao Leonardo,

  I am not sure if this could apply to your case, but maybe - unless you
 have done it before - you could look at windowing functions
 (http://www.postgresql.org/docs/current/interactive/tutorial-window.html).
 They require PG8.4+ though.

 Cheers,
 Gabriele

 On Wed, 4 May 2011 11:51:08 +0100 (BST), Leonardo Francalanci
 <m_lists@yahoo.it> wrote:
> Hi,
>
>
> I'm going to need to GROUP BY the same table
> multiple times. That is, something like:
>
> select (some aggregate functions here) from
> tableA group by f1, f2
>
> select (some other aggregate functions here) from
> tableA group by f3, f4
>
> etc
>
> The table is pretty large; can someone suggest the
> best way of doing it? Is running N queries at the same
> time (that is, using N connections with N threads in
> the client code) the only way to speed up things (so
> that the "concurrent scan" thing can help)? Or it's
> more likely that it won't help that much, given that
> we have a fairly good storage? Just trying to get some
> ideas before starting testing....
>
> (table will be 5M rows, where some of the group by
> select could return 3-400K groups)
>
> Leonardo

--
  Gabriele Bartolini - 2ndQuadrant Italia
  PostgreSQL Training, Services and Support
  Gabriele.Bartolini@2ndQuadrant.it - www.2ndQuadrant.it

Re: multiple group by on same table

From
Leonardo Francalanci
Date:
>  I am not sure if this could apply to your case, but maybe  - unless you have
>done it before -
>
> you could look at windowing functions

Ciao Gabriele,

the problem is that the only thing the N queries have in
common is the base table; everything else is different,
because the different "group by"s aren't related one to
another. My understanding is that windowing functions
can't help in that case, but I'll look at them


Thank you

Leonardo