Re: SQL - Indexing for performance on uniquness check... - Mailing list pgsql-novice

From	Tom Lane
Subject	Re: SQL - Indexing for performance on uniquness check...
Date	July 19, 2004 01:06:05
Msg-id	8589.1090209948@sss.pgh.pa.us
In response to	Re: SQL - Indexing for performance on uniquness check... (Josh Berkus <josh@agliodbs.com>)
List	pgsql-novice

Josh Berkus <josh@agliodbs.com> writes:
> Charles,
>> Sample query to return non-uniqueness
>> SELECT A1, A2, A3, ..., An
>> FROM Table
>> GROUP BY A1, A2, A3, ..., An
>> HAVING Count(*)>1

> In order for it to be even possible to use an index (a hashaggregate
> operation, actually) on this table, you'd have to include *all* of the GROUP
> BY columns in a single, multi-column index.

> However, it would be unlikely for PG to use any kind of an index in the
> operation above, because of the number of columns, the unlikeliness of
> grouping (i.e. there will only be a minority of rows with count(*) > 1) and
> the fact that you're running this against the whole table.  So any
> kind of an index is liable to be useless.

Yeah.  If you are not expecting a huge number of groups, I think that it
would be more interesting to try a HashAggregate plan than a sort/group
plan.  For this you need 7.4 or later and a sort_mem setting large
enough to cover whatever the planner estimates the hashtable size to be.

            regards, tom lane
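
Editor's sketch, not part of the original message: one way to try the suggestion above is to raise sort_mem for the current session and inspect the plan with EXPLAIN. The table name my_table, the columns a1..a3, and the value 65536 are placeholders assumed for illustration. sort_mem is measured in kilobytes; from PostgreSQL 8.0 onward the same setting is called work_mem.

```sql
-- Hypothetical table and column names; substitute the real ones.
-- 65536 kB (64 MB) is an arbitrary example value, not a recommendation.
SET sort_mem = 65536;

-- Look for a HashAggregate node in the output rather than
-- a GroupAggregate node sitting on top of a Sort.
EXPLAIN
SELECT a1, a2, a3
FROM my_table
GROUP BY a1, a2, a3
HAVING count(*) > 1;

-- For comparison only: turn hash aggregation off to see the sort/group plan.
SET enable_hashagg = off;

EXPLAIN
SELECT a1, a2, a3
FROM my_table
GROUP BY a1, a2, a3
HAVING count(*) > 1;

-- Restore the session defaults.
RESET enable_hashagg;
RESET sort_mem;
```

Whether the planner actually picks HashAggregate depends on its estimate of the number of groups, so running ANALYZE on the table beforehand may change the result.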
