Re: Really SLOW using GROUP BY ...!? - Mailing list pgsql-general

From Tom Lane
Subject Re: Really SLOW using GROUP BY ...!?
Date
Msg-id 16754.973718804@sss.pgh.pa.us
Whole thread Raw
In response to RE: Really SLOW using GROUP BY ...!?  ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>)
List pgsql-general
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> Try
> select sum(points) from gains where idcond >= _minimum_id_cond_value_
> group by idcond;

> to see if forced index usage will help. Unfortunately, PG will anyway
> try to sort result before grouping,

I beg your pardon?

regression=# set enable_seqscan TO on;
SET VARIABLE
regression=# explain select sum(unique1) from tenk1 group by hundred;
NOTICE:  QUERY PLAN:

Aggregate  (cost=997.39..1047.39 rows=1000 width=8)
  ->  Group  (cost=997.39..1022.39 rows=10000 width=8)
        ->  Sort  (cost=997.39..997.39 rows=10000 width=8)
              ->  Seq Scan on tenk1  (cost=0.00..333.00 rows=10000 width=8)

EXPLAIN
regression=# set enable_seqscan TO off;
SET VARIABLE
regression=# explain select sum(unique1) from tenk1 group by hundred;
NOTICE:  QUERY PLAN:

Aggregate  (cost=0.00..1119.88 rows=1000 width=8)
  ->  Group  (cost=0.00..1094.88 rows=10000 width=8)
        ->  Index Scan using tenk1_hundred on tenk1  (cost=0.00..1069.88 rows=10000 width=8)

EXPLAIN

Unfortunately neither of these plans is likely to be especially speedy
on ~3 million rows.  The index scan will just thrash the disk, unless
the table has been clustered recently --- and given the deficiencies of
our CLUSTER implementation, I'd hesitate to recommend using it.

I have a personal TODO item to see about implementing group + aggregate
with a hash table of active aggregate values, per a suggestion recently
from devik@cdi.cz.  That would allow this query to be done with a
sequential scan and no sort, which is probably what Oracle is doing.
Won't happen for 7.1 though ...

            regards, tom lane

pgsql-general by date:

Previous
From: oberpwd@anubis.network.com (Wade D. Oberpriller)
Date:
Subject: Built-in Postgres Types as shown in Chap. 4 Programmer's Guide
Next
From: Bruce Momjian
Date:
Subject: Re: VACUUM AND VACUUM ANALYSE