Re: select distinct and index usage - Mailing list pgsql-general

From Stephen Denne
Subject Re: select distinct and index usage
Date
Msg-id F0238EBA67824444BC1CB4700960CB48051100F6@dmpeints002.isotach.com
In response to Re: select distinct and index usage  (Alban Hertroys <dalroi@solfertje.student.utwente.nl>)
List pgsql-general
Alban Hertroys wrote
> Something that might help you, but I'm not sure whether it might hurt
> the performance of other queries, is to cluster that table on
> val_datestamp_idx. That way the records are already (mostly) sorted
> on disk in the order of the datestamps, which seems to be the brunt
> of above query plan.

I have a question about this suggestion, in relation to what the cost estimation calculation does, or could possibly do:
If there are 4000 distinct values in the index, scattered randomly amongst 75 million rows, then you might be able to
check the visibility of all those index values by reading a smaller number of disk pages than if the table were
clustered by that index.
As an example, say there are 50 rows per page. At a minimum, if you were very lucky, you could determine that they were
all visible by reading only 80 data pages (4000 values / 50 rows per page). More likely you'd be able to determine that
by reading a few hundred pages. If the table were clustered by an index on that field, you'd have to read 4000 pages,
since each distinct value would then sit on its own run of pages.
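
The two bounds in this example can be checked with a short Python sketch. It is only back-of-the-envelope arithmetic, not anything PostgreSQL's planner does; the function names are hypothetical, and the clustered case assumes every value has the same number of rows and that one row per value is checked (the first one in its run).

```python
import math


def min_pages_unclustered(distinct_values, rows_per_page):
    """Best case for a random layout: every page read holds
    rows_per_page values that have not been seen yet."""
    return math.ceil(distinct_values / rows_per_page)


def pages_clustered(total_rows, distinct_values, rows_per_page):
    """Pages touched when the table is perfectly clustered and the first
    row of each value is checked. Assumes (for illustration only) that
    every value has the same number of rows."""
    rows_per_value = total_rows // distinct_values
    # Page number holding the first row of each value's run.
    first_row_pages = {
        (v * rows_per_value) // rows_per_page for v in range(distinct_values)
    }
    return len(first_row_pages)


if __name__ == "__main__":
    print(min_pages_unclustered(4000, 50))
    print(pages_clustered(75_000_000, 4000, 50))
```

With the numbers above (75 million rows, 4000 distinct values, 50 rows per page) this gives 80 pages for the lucky unclustered case and 4000 pages for the clustered case.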

Is this question completely unrelated to PostgreSQL implementation reality, or something worth considering?

Regards,
Stephen Denne.



