On 05/04/2019 23:25, Andres Freund wrote:
> I think what's in v12 - I don't know of any non-cleanup / bugfix work
> pending for 12 - is a pretty reasonable initial set of features.
Hooray!
> - the (optional) bitmap heap scan API - that's fairly intrinsically
> block based. An AM could just internally subdivide TIDs in a different
> way, but I don't think a bitmap scan like we have would e.g. make a
> lot of sense for an index oriented table without any sort of stable
> tid.
If an AM doesn't implement the bitmap heap scan API, what happens?
Bitmap scans are disabled?
Even if an AM isn't block-oriented, the bitmap heap scan API still makes
sense as long as there's some correlation between TIDs and physical
location. The only really broken thing about that currently is the
prefetching: nodeBitmapHeapScan.c calls PrefetchBuffer() directly with
the TID's block numbers. It would be pretty straightforward to wrap that
in a callback, so that the AM could do something different.
Or move even more of the logic to the AM, so that the AM would get the
whole TIDBitmap in table_beginscan_bm(). It could then implement the
fetching and prefetching as it sees fit.
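To make the simpler of those two options concrete, here is a rough, standalone sketch of what wrapping the prefetch in a callback could look like. None of these names exist in the tree: SketchTableAm, sketch_prefetch_fn and the sketch_* functions are invented for illustration, and the heap-like callback only records the request where a real one would issue the buffer prefetch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-AM callback: prefetch whatever storage backs this TID block. */
typedef void (*sketch_prefetch_fn) (void *am_state, uint32_t tid_block);

/* Hypothetical, heavily reduced stand-in for an AM's callback table. */
typedef struct SketchTableAm
{
	sketch_prefetch_fn prefetch;	/* NULL means the AM does no prefetching */
} SketchTableAm;

/* State for a heap-like AM; it only records requests so the sketch is testable. */
typedef struct SketchHeapState
{
	uint32_t	last_block;
	int			ncalls;
} SketchHeapState;

/*
 * Heap-like AM: the TID's block number is the physical block, so a real
 * implementation would issue the buffer prefetch here.
 */
void
sketch_heap_prefetch(void *am_state, uint32_t tid_block)
{
	SketchHeapState *state = (SketchHeapState *) am_state;

	state->last_block = tid_block;
	state->ncalls++;
}

/* What the executor node would call instead of prefetching by block number itself. */
void
sketch_bitmap_prefetch(const SketchTableAm *am, void *am_state, uint32_t tid_block)
{
	if (am->prefetch != NULL)
		am->prefetch(am_state, tid_block);
}
```

An AM whose TIDs don't map directly to physical blocks would plug in its own function, or leave the pointer NULL to skip prefetching altogether.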
I don't think it's urgent, though. We can cross that bridge when we get
there, with the first AM that needs that flexibility.
> The most constraining factor for storage, I think, is that currently the
> API relies on ItemPointerData style TIDs in a number of places (i.e. a 6
> byte tuple identifier).
I think 48 bits would be just about enough, but it's even more limited
than you might think at the moment. There are a few places that assume that
the offsetnumber <= MaxHeapTuplesPerPage. See ginpostinglist.c, and
MAX_TUPLES_PER_PAGE in tidbitmap.c. Also, offsetnumber can't be 0,
because that makes the ItemPointer invalid, which is inconvenient if you
try to use ItemPointer as just an arbitrary 48-bit integer.
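As a rough illustration of what those two restrictions do to the usable key space, here is a standalone sketch that maps a plain integer onto a TID-like pair while keeping the offset in 1..max. SketchTid is a simplified stand-in for ItemPointerData, not the real struct, and SKETCH_MAX_OFFSET is an assumed value standing in for MaxHeapTuplesPerPage (which depends on the block size).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for ItemPointerData: 32-bit block number, 16-bit offset. */
typedef struct SketchTid
{
	uint32_t	block;
	uint16_t	offset;			/* 0 is reserved to mean "invalid" */
} SketchTid;

/* Assumed limit standing in for MaxHeapTuplesPerPage; illustrative only. */
#define SKETCH_MAX_OFFSET 291

/* Map an integer key to a TID whose offset stays within 1..SKETCH_MAX_OFFSET. */
SketchTid
sketch_tid_from_u64(uint64_t value)
{
	SketchTid	tid;
	uint64_t	block = value / SKETCH_MAX_OFFSET;

	/* The key space is smaller than 2^48: block must still fit in 32 bits. */
	assert(block <= UINT32_MAX);

	tid.block = (uint32_t) block;
	tid.offset = (uint16_t) (value % SKETCH_MAX_OFFSET + 1);	/* never 0 */
	return tid;
}

/* Inverse mapping: recover the integer key from a valid TID. */
uint64_t
sketch_tid_to_u64(SketchTid tid)
{
	assert(tid.offset >= 1 && tid.offset <= SKETCH_MAX_OFFSET);
	return (uint64_t) tid.block * SKETCH_MAX_OFFSET + (uint64_t) (tid.offset - 1);
}
```

With a cap like that, only a small fraction of the 65536 possible offset values per block number is usable, so the effective range is well short of a full 48 bits.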
- Heikki