Re: select distinct and index usage - Mailing list pgsql-general

From Gregory Stark
Subject Re: select distinct and index usage
Date
Msg-id 878wzo60hy.fsf@oxford.xeocode.com
In response to Re: select distinct and index usage  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: select distinct and index usage  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-general
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane wrote:
>>> What I think you'll find, though, is that once you do force an indexscan
>>> to be picked it'll be slower.  Full-table index scans are typically
>>> worse than seqscan+sort, unintuitive though that may sound.

The original poster's implicit expectation is that an index scan would be
faster because it shouldn't have to visit every tuple: once it has found a
tuple with a particular value, it should be able to use the index to skip to
the next key value.

I thought our DISTINCT index scan did do that, but it still has to read the
index leaf pages sequentially. It doesn't backtrack up the tree structure and
re-find the next key.
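
To make the difference concrete, here is a toy sketch in Python. It is not
PostgreSQL code: a sorted list stands in for the index leaf entries, and both
function names are made up for illustration. The first function reads every
entry in order; the second does what the original poster expected, re-searching
for the next larger key after each distinct value.

```python
from bisect import bisect_right

def distinct_by_leaf_scan(keys):
    """Walk every index entry in order, emitting a key whenever it changes."""
    out, visited = [], 0
    for k in keys:
        visited += 1
        if not out or out[-1] != k:
            out.append(k)
    return out, visited

def distinct_by_skipping(keys):
    """After each distinct key, search again for the first larger entry."""
    out, probes, pos = [], 0, 0
    while pos < len(keys):
        out.append(keys[pos])
        # One binary search stands in for one re-descent of the tree.
        pos = bisect_right(keys, keys[pos], lo=pos)
        probes += 1
    return out, probes

# Hypothetical index contents: 3 distinct keys, 1000 entries each.
keys = [1] * 1000 + [2] * 1000 + [3] * 1000
print(distinct_by_leaf_scan(keys))  # ([1, 2, 3], 3000)
print(distinct_by_skipping(keys))   # ([1, 2, 3], 3)
```

Each probe costs a logarithmic number of comparisons, so skipping only pays off
when there are few distinct keys relative to the number of entries.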

>> Hmm, should we switch the CLUSTER code to do that?
>
> It's been suggested before, but I'm not sure.  The case where an
> indexscan can win is where the table is roughly in index order already.
> So if you think about periodic CLUSTER to maintain table ordering,
> I suspect you'd want the indexscan implementation for all but maybe
> the first time.
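
A rough way to see why table ordering matters is the toy model below, in
Python. It is not the planner's cost model: it counts how often an index-order
walk has to move to a different heap page, ignoring caching entirely. The
function name and the figure of 100 tuples per page are assumptions made for
illustration.

```python
import random

def page_switches(heap_keys, tuples_per_page=100):
    """Count heap page changes when tuples are visited in key order."""
    # Index order: heap positions sorted by the key stored at each position.
    order = sorted(range(len(heap_keys)), key=lambda i: heap_keys[i])
    switches, last_page = 0, None
    for pos in order:
        page = pos // tuples_per_page
        if page != last_page:
            switches += 1
            last_page = page
    return switches

n = 10_000
clustered = list(range(n))          # heap already in index order
shuffled = clustered[:]
random.Random(0).shuffle(shuffled)  # heap in arbitrary order

print(page_switches(clustered))  # 100: each page is visited exactly once
print(page_switches(shuffled))   # far larger: nearly every tuple changes page
```

In this model the clustered heap is read once, page by page, while the shuffled
heap forces close to one page fetch per tuple, which is the situation where a
seqscan plus sort tends to come out ahead.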

I think we would push a query through the planner to choose the best plan
based on the statistics. I'm not sure how this would play with the visibility
rules -- IIRC not all scan types can be used with all visibility modes. I'm
also not sure how Heikki's MVCC-safe CLUSTER would work if it can't be sure in
what order it's scanning the heap.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!
