Re: iceberg queries - Mailing list pgsql-sql

From Jan Wieck
Subject Re: iceberg queries
Date
Msg-id 3E40493A.DCDD2C4B@Yahoo.com
Whole thread Raw
In response to Re: iceberg queries  (Christoph Haller <ch@rodos.fzk.de>)
List pgsql-sql
Tom Lane wrote:
> 
> Jan Wieck <JanWieck@Yahoo.com> writes:
> > As to the original question, if an index is available that returns the
> > rows in the sort order of the GROUP BY clause, PostgreSQL defaults to an
> > index scan, otherwise it will do a sort of the rows matching an optional
> > WHERE clause. This sorted set is then grouped and aggregated and
> > filtered by the HAVING clause after aggregation.
> 
> Note that as of 7.4, the planner will probably pick hashed aggregation
> rather than sort-based aggregation, if it can predict that the number
> of groups will not be too large for a hash table to fit in memory.
> This means we can do a seqscan (or, perhaps, an indexscan to match
> WHERE conditions) and avoid sorting.  So I expect performance on this
> type of query to be a good deal better in 7.4.  There are a few
> benchmark comparisons in the pghackers archives a couple months back.
> 
>                         regards, tom lane

If it can predict.

I guess the question was asked because one expects performance problems.
From that I conclude that the amount of data is significant in this
particular case. That does not necessarily but usually mean a large
number of unique groups.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


pgsql-sql by date:

Previous
From: justin
Date:
Subject: Serialized Transaction Locking Issues
Next
From: Tom Lane
Date:
Subject: Re: Serialized Transaction Locking Issues