Re: Querying distinct values from a large table - Mailing list pgsql-performance

From Bruno Wolff III
Subject Re: Querying distinct values from a large table
Date
Msg-id 20070130164435.GA22099@wolff.to
In response to Querying distinct values from a large table  (Igor Lobanov <ilobanov@swsoft.com>)
List pgsql-performance
On Tue, Jan 30, 2007 at 14:33:34 +0600,
  Igor Lobanov <ilobanov@swsoft.com> wrote:
> Greetings!
>
> I have a rather large table with about 5 million rows and a dozen
> columns. Let's suppose that the columns are named 'a', 'b', 'c', etc. I need
> to query the distinct pairs of ('a', 'b') from this table.
>
> Is there any way to improve the performance of this operation?
> The table cannot be changed.

DISTINCT currently can't use a hash aggregate plan and will always use a sort.
If there aren't many distinct values, a hash aggregate plan will run much
faster. To get around this limitation, rewrite the query as a GROUP BY.
Something like (tablename stands in for your table's name):
SELECT a, b FROM tablename GROUP BY a, b;
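
To check which plan you actually get, compare the EXPLAIN output of the two
forms. This is only a sketch: big_table is a made-up name, and the plan chosen
depends on your statistics and settings. The planner will generally pick a
hash aggregate only if it estimates that the hash table fits in work_mem, so
run ANALYZE first and raise work_mem for the session if the estimate is too
large.

```sql
-- big_table is a hypothetical name; substitute your own table.
ANALYZE big_table;              -- refresh the statistics the planner relies on

-- DISTINCT form: expect a sort-based plan here.
EXPLAIN SELECT DISTINCT a, b FROM big_table;

-- GROUP BY form: look for a HashAggregate node in the output.
EXPLAIN SELECT a, b FROM big_table GROUP BY a, b;

-- If the GROUP BY form still sorts, try more memory for this session only.
-- The value is illustrative, not a recommendation.
SET work_mem = '64MB';
EXPLAIN SELECT a, b FROM big_table GROUP BY a, b;
```

Use EXPLAIN ANALYZE instead of EXPLAIN if you want actual run times rather
than estimates; keep in mind that it executes the query.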
