Re: CLUSTER and indisclustered - Mailing list pgsql-hackers

From mark Kirkwood
Subject Re: CLUSTER and indisclustered
Date
Msg-id 3D4CA1DE.5040005@slithery.org
Whole thread Raw
In response to Re: CLUSTER and indisclustered  (Gavin Sherry <swm@linuxworld.com.au>)
Responses Re: CLUSTER and indisclustered  (Curt Sampson <cjs@cynic.net>)
List pgsql-hackers
Gavin Sherry wrote:

>On Sat, 3 Aug 2002, Bruce Momjian wrote:
>
>>Gavin Sherry wrote:
>>
>>>Hi all,
>>>
>>>It occured to me on the plane home that now that CLUSTER is fixed we may
>>>be able to put pg_index.indisclustered to use. If CLUSTER was to set
>>>indisclustered to true when it clusters a heap according to the given
>>>index, we could speed up sequantial scans. There are two possible ways.
>>>
>>>1) Planner determines that a seqscan is appropriate *and* the retrieval is
>>>qualified by the key(s) of one of the relation's indexes
>>>2) Planner determines that the relation is clustered on disk according to
>>>the index over the key(s) used to qualify the retrieval
>>>3) Planner sets an appropriate nodeTag for the retrieval (SeqScanCluster?)
>>>4) ExecProcNode() calls some new scan routine, ExecSeqScanCluster() ?
>>>5) ExecSeqScanCluster() calls ExecScan() with a new ExecScanAccessMtd (ie,
>>>different from SeqNext) called SeqClusterNext
>>>6) SeqClusterNext() has all the heapgettup() logic with two
>>>exceptions: a) we find the first tuple more intelligently (instead of
>>>scanning from the first page) b) if we have found tuple(s) matching the
>>>ScanKey when we encounter an non-matching tuple (via
>>>HeapTupleSatisfies() ?) we return a NULL'ed out tuple, terminating the
>>>scan
>>>
>>Gavin, is that a big win compared to just using the index and looping
>>through the entries, knowing that the index matches are on the same
>>page, and the heap matches are on the same page.
>>
>
>Bruce,
>
>It would cut out the index over head. Besides at (1) (above) we would have
>determined that an index scan was too expensive and we would be using a
>SeqScan instead. This would just be faster, since a) we would locate the
>tuples more intelligently b) we wouldn't need to scan the whole heap once
>we'd found all tuples matching the scan key.
>
>Gavin
>


Gavin and Bruce,

I am not so sure index access in these cases is such an overhead - since 
the clustered nature of the table means that many index elements will 
point to table data that is located  in a few pages .


Ok, this change would save you the initial access of the index structure 
itself - but isnt
the usual killer for indexes is the "thrashing" that happens when the 
"pointed to" table data is spread over a many pages.

I did some tests on this a while ago ( 7.1 era) and discovered that for 
a clustered table a sequential scan did not start to win until the 
target dataset was ~30% of the table itself.
While the suggested change would of course mean that seq scans would do 
much better than they did in my tests, it would be interesting to see 
some timings...


regards

Mark




pgsql-hackers by date:

Previous
From: Richard Tucker
Date:
Subject: Re: PITR, checkpoint, and local relations
Next
From: "Sander Steffann"
Date:
Subject: Re: Why is MySQL more chosen over PostgreSQL?