Re: CLUSTER and indisclustered - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: CLUSTER and indisclustered
Date
Msg-id 200208040257.g742vXI24664@candle.pha.pa.us
In response to CLUSTER and indisclustered  (Gavin Sherry <swm@linuxworld.com.au>)
Responses Re: CLUSTER and indisclustered  (Gavin Sherry <swm@linuxworld.com.au>)
List pgsql-hackers
Gavin Sherry wrote:
> Hi all,
> 
> It occurred to me on the plane home that now that CLUSTER is fixed we may
> be able to put pg_index.indisclustered to use. If CLUSTER were to set
> indisclustered to true when it clusters a heap according to the given
> index, we could speed up sequential scans. There are two possible ways.
> 
> 1) Planner determines that a seqscan is appropriate *and* the retrieval is
> qualified by the key(s) of one of the relation's indexes
> 2) Planner determines that the relation is clustered on disk according to
> the index over the key(s) used to qualify the retrieval
> 3) Planner sets an appropriate nodeTag for the retrieval (SeqScanCluster?)
> 4) ExecProcNode() calls some new scan routine, ExecSeqScanCluster() ?
> 5) ExecSeqScanCluster() calls ExecScan() with a new ExecScanAccessMtd (ie,
> different from SeqNext) called SeqClusterNext
> 6) SeqClusterNext() has all the heapgettup() logic with two
> exceptions: a) we find the first tuple more intelligently (instead of
> scanning from the first page) b) if we have found tuple(s) matching the
> ScanKey when we encounter a non-matching tuple (via
> HeapTupleSatisfies() ?) we return a NULL'ed out tuple, terminating the
> scan
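
Step 6 of the quoted proposal can be sketched in miniature as follows. This is only an illustration, not PostgreSQL code: the heap is modelled as a sorted array of integer keys, and find_first() and seq_cluster_scan() are hypothetical names. It assumes the heap is still perfectly ordered on the key (which CLUSTER does not guarantee once rows are inserted or updated afterwards) and ignores tuple visibility checks entirely.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for "find the first tuple more intelligently":
 * binary search for the first position whose key is >= the scan key.
 * Returns n if every key is smaller than the scan key.
 */
static size_t find_first(const int *keys, size_t n, int key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Hypothetical analogue of SeqClusterNext(): start at the first candidate
 * and stop at the first non-matching key instead of reading to the end.
 * Returns the number of matches; *visited receives the number of entries
 * actually examined by the sequential part of the scan.
 */
static size_t seq_cluster_scan(const int *keys, size_t n, int key,
                               size_t *visited)
{
    size_t matches = 0;
    size_t i;

    assert(visited != NULL);
    *visited = 0;

    for (i = find_first(keys, n, key); i < n; i++) {
        (*visited)++;
        if (keys[i] != key)
            break;          /* clustered order: no later entry can match */
        matches++;
    }
    return matches;
}
```

On the keys {1, 3, 3, 3, 7, 9} with scan key 3, the sketch examines four entries (three matches plus the one that ends the scan) rather than all six.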

Gavin, is that a big win compared to just using the index and looping
through the entries, knowing that the index matches are on the same
page, and the heap matches are on the same page?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

