Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers

From Robert Haas
Subject Re: MaxOffsetNumber for Table AMs
Date
Msg-id CA+TgmobU68bat=x=WjnkHPBzvOUz+wnoEu6u03VYRkHiwoAqYA@mail.gmail.com
Whole thread Raw
In response to Re: MaxOffsetNumber for Table AMs  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: MaxOffsetNumber for Table AMs  (Peter Geoghegan <pg@bowt.ie>)
Re: MaxOffsetNumber for Table AMs  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Fri, Apr 30, 2021 at 1:37 PM Jeff Davis <pgsql@j-davis.com> wrote:
> The particular problem I have now is that Table AMs seem to support
> indexes just fine, but TIDs are under-specified so I don't know what I
> really have to work with. BlockNumber seems well-specified as
> 0..0XFFFFFFFE (inclusive), but I don't know what the valid range of
> OffsetNumber is for the purposes of a TableAM.

I agree that this is a problem.

> Part of changing to uint64 would be specifying the TIDs in a way that I
> could rely on in the future.

I mean, from my perspective, the problem here is that the abstraction
layer is leaky and things outside of the table AM layer know what heap
is doing under the hood, and rely on it. If we could refactor the
abstraction to be less leaky, it would be clearer what assumptions
table AM authors can make. If we can't, any specification doesn't seem
worth much.

> In the future we may support primary unique indexes at the table AM
> layer, which would get more interesting. I can see an argument for a
> TID being an arbitrary datum in that case, but I haven't really
> considered the design implications. Is this what you are suggesting?

I think that would be the best long-term plan. I guess I have two
distinguishable concerns. One is that I want to be able to have
indexes with a payload that's not just a 6-byte TID; e.g. adding a
partition identifier to support global indexes, or replacing the
6-byte TID with a primary key reference to support indirect indexes.
The other concern is to be able to have table AMs that use arbitrary
methods to identify a tuple. For example, if somebody implemented an
index-organized table, the "TID" would really be the primary key.

Even though these are distinguishable concerns, they basically point
in the same direction as far as index layout is concerned. The
implications for the table AM layer are somewhat different in the two
cases, but both argue that some places that are now talking about TIDs
should be changed to talk about Datums or something of that sort.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: MaxOffsetNumber for Table AMs
Next
From: Robert Haas
Date:
Subject: Re: Procedures versus the "fastpath" API