Re: select ... distinct performance - Mailing list pgsql-general

From Don Bowman
Subject Re: select ... distinct performance
Date
Msg-id FE045D4D9F7AED4CBFF1B3B813C85337045D7DD6@mail.sandvine.com
Whole thread Raw
In response to select ... distinct performance  (Don Bowman <don@sandvine.com>)
List pgsql-general
From: Martijn van Oosterhout [mailto:kleptog@svana.org]
>
> On Wed, Jan 28, 2004 at 11:20:30PM -0500, Don Bowman wrote:
> > I have a table with a large number of rows (10K in the
> example below,
> > but >1M in some databases). I would like to find the distinct
> > values for one of the columns. The column is indexed.
> >
> > I would have expected that this would be a very fast operation,
> > simply walking down the index. In the example below, there is
> > only 1 unique value, but it takes 2 seconds. I would have
> > expected more like ~50ms.
>
> The problem is that the index doesn't contain info about
> which rows are
> visibile in your current transaction, so it has to load the
> entire table to
> check. Looks like it used the index to avoid a sort step. I
> don't think
> there is a way to write this that doesn't need the whole table.
>
> Hope this helps,
> --

It would appear that postgresql does not support index-only fetches
(e.g. DB2).

or, perhaps a materialized view. I see there is some work on going
for this.

I haven't tried a stored procedure like this...

select first name into prev_name from Table
while FETCH_OK:
  return prev_name
  fetch first name into prev_name from Table where name > prev_name
end

to see if it can walk the index.

--don

pgsql-general by date:

Previous
From:
Date:
Subject: Re: Grant question
Next
From: Tom Lane
Date:
Subject: Re: Incorrect Results From Select When Using Non-English Characters