Home > mailing lists

Re: select distinct and index usage - Mailing list pgsql-general

From	Martijn van Oosterhout
Subject	Re: select distinct and index usage
Date	April 8, 2008 06:50:52
Msg-id	20080408065030.GA31397@svana.org Whole thread Raw
In response to	select distinct and index usage ("David Wilson" <david.t.wilson@gmail.com>)
List	pgsql-general

Tree view

On Sun, Apr 06, 2008 at 07:32:58PM -0400, David Wilson wrote:
> I have a reasonably large table (~75m rows,~18gb) called "vals". It
> includes an integer datestamp column with approximately 4000 unique
> entries across the rows; there is a normal btree index on the
> datestamp column. When I attempt something like "select distinct
> datestamp from vals", however, explain tells me it's doing a
> sequential scan:

I'm a bit late to the party, but someone had a similar problem a while
back and solved it with an SRF as follows (pseudo-code):

BEGIN
  curr := (SELECT field FROM table ORDER BY field LIMIT 1 )
  RETURN NEXT curr;

  WHILE( curr )
    curr := (SELECT field FROM table WHERE field > curr ORDER BY field LIMIT 1 )
    RETURN NEXT curr;
  END
END

If you have 5000 unique values it will do 5000 index lookup which would
be bad except it's better than 75 million rows as is in your case.
Whether it's faster is something you'll have to test, but it's another
approach to the problem.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Attachment

signature.asc

pgsql-general by date:

From: Volkan YAZICI
Date: 08 April 2008, 06:42:27
Subject: Re: Most Occurring Value

From: "Mikko Partio"
Date: 08 April 2008, 07:14:45
Subject: "too many trigger records found for relation xyz"

Re: select distinct and index usage - Mailing list pgsql-general

Attachment

Previous

Next