Re: Performance aggregates - Mailing list pgsql-general

From Tom Lane
Subject Re: Performance aggregates
Date
Msg-id 11350.989955445@sss.pgh.pa.us
Whole thread Raw
In response to Re: Performance aggregates  (snpe <snpe@infosky.net>)
List pgsql-general
snpe <snpe@infosky.net> writes:
> select roba,sum(izn)
> from e_kalkn k,e_kalkns ks
> where k.id=ks.id
> group by roba
> order by roba
>
> is 2.5 times faster on one commercial database

That other DBMS is probably implementing the GROUP BY + aggregate
using a hash table to compute all the aggregates in parallel,
rather than sorting first as Postgres does.  You still need to sort
in the end to meet the ORDER BY spec, but you are only sorting the
aggregate results not the inputs.  Disadvantage: if there are a lot
of distinct values of ROBA then your hash table may overrun memory.

We have a TODO item to implement hashed aggregation.  Teaching the
planner to make an intelligent choice between sorted and hashed
aggregation will probably be harder than the actual execution code.

            regards, tom lane

pgsql-general by date:

Previous
From: snpe
Date:
Subject: Re: Performance aggregates
Next
From: Tom Lane
Date:
Subject: Re: Re: Re: Re: What's the best front end/client under MS Windows?