Re: Scan by TID (was RE: [HACKERS] How to add a new build-in operator) - Mailing list pgsql-hackers

From Bernard Frankpitt
Subject Re: Scan by TID (was RE: [HACKERS] How to add a new build-in operator)
Date
Msg-id 38036A0D.363234BC@pop.dn.net
In response to Re: Scan by TID (was RE: [HACKERS] How to add a new build-in operator)  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
Bruce,

I think that an index interface would be simpler than you think.
The index does not need any disk storage, which takes out virtually all
the complexity in implementation.  All that you really need to implement
is the scan interface, and the only state that the scan needs is a
single flag that indicates when getnext has already been called once.
All that getnext need do is return the ctid, and flip the flag so that
it knows to return null on the next call.  You also need to ensure that
the access method functions used by the optimizer return appropriate
values to ensure that the cost of an index search is always zero.  I
have some suitable functions for that.
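
To make the idea concrete, here is a minimal sketch of the scan state
and getnext described above.  It is not the actual access method
interface: the names Tid, TidScan, tidscan_begin, tidscan_getnext,
tidscan_rescan and tidscan_cost are all made up for illustration, and
the TID layout (block number plus offset) is only an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a tuple identifier (block number + offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} Tid;

/* The only scan state: the TID being looked up and a one-shot flag. */
typedef struct {
    Tid  target;
    bool returned;   /* true once getnext has handed out the TID */
} TidScan;

/* Start a scan for a single TID; nothing is read from disk. */
void tidscan_begin(TidScan *scan, Tid target)
{
    assert(scan != NULL);
    scan->target = target;
    scan->returned = false;
}

/* Return the TID on the first call, NULL on every later call. */
const Tid *tidscan_getnext(TidScan *scan)
{
    assert(scan != NULL);
    if (scan->returned)
        return NULL;
    scan->returned = true;
    return &scan->target;
}

/* Reset the flag so the same scan can be run again. */
void tidscan_rescan(TidScan *scan)
{
    assert(scan != NULL);
    scan->returned = false;
}

/* Assumed cost hook: always report zero so the optimizer prefers this path. */
double tidscan_cost(void)
{
    return 0.0;
}
```

A caller would run tidscan_begin once, then call tidscan_getnext until
it returns NULL, which here happens on the second call.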


With all due respect to people who I am sure know a lot more about this
than I do, it seems to me that extensive use of TIDs in user code might
place an unwelcome restraint on the internal database design.  If you
follow the arguments of the reiserfs people, the whole idea of a
buffered cache with fixed-size blocks is a necessary hack to cope with a
less than optimal underlying filesystem.  In the ideal world that
reiserfs promises (:-)) disk access efficiency would be independent of
file-size, and it would be feasible to construct the buffered cache from
raw tuples of variable size.  The files on disk would be identified by
OID.  reiserfs uses a B-tree variant to cope with very large name
spaces.

Similar considerations would seem to apply if the storage layer of the
database is separated from the rest of the backend by a high-speed
network interface on something like a hard-disk farm.  (See, for
example, some of the Mariposa work.)

Until things like that actually happen (Version 10.* perhaps) I can see
that TIDs are a useful addition, but you might want to fasten them in
with a pyrotechnic interface so that you can blow them away if need be.

I have a URL for the reiserfs stuff at home, if anyone is interested
email me and I will dig it up and post it.

Bernie Frankpitt

