Indexscan API cleanup proposal - Mailing list pgsql-hackers

From Tom Lane
Subject Indexscan API cleanup proposal
Date
Msg-id 11436.1021846949@sss.pgh.pa.us
Whole thread Raw
List pgsql-hackers
As I mentioned in passing awhile ago, I'm thinking of simplifying the
indexscan API and arranging for heapscans and indexscans to have
essentially identical APIs.  Here's the plan:

heap_beginscan and index_beginscan will have near-identical signatures:

HeapScanDesc heap_beginscan(Relation relation, Snapshot snapshot,                           int nkeys, ScanKey key)
IndexScanDesc index_beginscan(Relation relation, Snapshot snapshot,                             int nkeys, ScanKey
key)

This differs from the existing signatures in that the atend and
scanFromEnd parameters are removed --- they are useless and have only
fostered confusion.  (heap_beginscan currently ignores atend entirely
anyway.  index_beginscan looks like it does something with the parameter,
but in point of fact all the index AMs determine scan direction on the
basis of the direction first passed to index_getnext.)  Also, a snapshot
will now be passed to index_beginscan and saved in the IndexScanDesc
structure.

Similarly, the heap_rescan and index_rescan routines will drop their
scanFromEnd parameters.

The getnext routines will have the signatures

HeapTuple heap_getnext(HeapScanDesc scan, ScanDirection direction)
HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction)

with semantics in both cases essentially identical to the present
heap_getnext: return the next tuple that passes the scankeys and
snapshot tests, scanning in the specified direction; or return NULL
if no more tuples remain.  This will simplify callers of index_getnext
quite a bit.

I also intend to provide a lower-level routine

bool index_getnext_indexitem(IndexScanDesc scan, ScanDirection direction)

which finds the next index item passing the scankeys, but ignores time
qualification issues.  The actual TID of the index tuple and its
referenced heap tuple will be available from IndexScanDesc fields if
TRUE is returned.  This routine will satisfy those few callers of
index_getnext who actually wanted to see the individual index entries.
(In current code the only users of this routine will be the index AMs'
own bulk-delete routines, so I'm not sure there's any need to export
such an API at all --- but will provide it just in case.)

The interfaces from these routines to the underlying index AMs will
have parallel changes.  In particular, the index AMs' gettuple routines
will change to have signatures equivalent to index_getnext_indexitem:
rather than having to palloc and pfree an IndexRetrieveResult object
on each cycle, they'll just return TRUE or FALSE, passing the TID info
in the IndexScanDesc.

Although I think these changes are worth making just on grounds of
code beautification, the real motivation for doing this is to centralize
time-qual checking for indexscans into index_getnext, rather than having
it scattered over all the callers as at present.  Once that's done we
will have a single point of attack for addressing the problem of killing
dead index tuples in advance of VACUUM.  What I am looking to do once this
API change is in place is to make index_getnext look like
for (;;){    get next index tuple;    if (no more tuples)        return NULL;    heap_fetch corresponding heap tuple;
if (HeapTupleSatisfies(tuple, scan->snapshot))        return tuple;    /*     * If we can't see it, maybe no one can.
 */    if (HeapTupleSatisfiesVacuum(tuple) == HEAPTUPLE_DEAD)        kill the index entry;}
 

where "kill the index entry" involves informing the index AM that it can
somehow mark the index entry uninteresting and not to be returned at all
during future indexscans.  (For performance reasons this'll probably get
merged into the next "get next index tuple" operation, but that remains
to be designed in detail.)

Comments?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bear Giles
Date:
Subject: Re: pq_eof() broken with SSL
Next
From: "Jim Buttafuoco"
Date:
Subject: bt_fixroot: not valid old root page