Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: MaxOffsetNumber for Table AMs
Date
Msg-id CAEze2WgnnYRK-Lp4Ch6NN+y-fjnaeDgEfBcER5m-8gu7QjCFbg@mail.gmail.com
In response to Re: MaxOffsetNumber for Table AMs  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: MaxOffsetNumber for Table AMs
List pgsql-hackers
On Wed, 5 May 2021 at 22:09, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Wed, May 5, 2021 at 12:43 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I believe that it cannot be "just" an additive thing, at least not
> > through a normal INCLUDEd column, as you'd get duplicate TIDs in the
> > index, with its related problems. You also cannot add it as a key
> > column, as this would disable UNIQUE indexes; one of the largest use
> > cases of global indexes. So, you must create specialized
> > infrastructure for this identifier.
>
> You're just quibbling about the precise words that I used. Of course
> it is true that there must be some sense in which a global index
> partition key attribute will need to be special to the implementation
> -- how else could a global index enforce uniqueness? That was clearly
> implied.

This implication was not 100% clear to me, and the last thread on
global indexes that implemented it through INCLUDEd columns didn't
mention this. As such, I wanted to explicitly mention that this
partition/table identifier would need to be part of the keyspace.

> > And when we're already adding specialized infrastructure, then this
> > should probably be part of a new TID infrastructure.
>
> This is a non-sequitur.

I may have skipped some reasoning:

I believe that the TID is the unique identifier of that tuple, within context.

For normal indexes, the TID as supplied directly by the TableAM is
sufficient, as the context is that table.
For global indexes, this TID must include enough information to relate
it to the table the tuple originated from.
In the whole database, that would be the OID of the table + the TID as
supplied by the table.

As such, the identifier of the logical row (which can be called the
TID), as stored in index tuples in global indexes, would need to
consist of the TableAM supplied TID + the (local) id of the table
containing the tuple. Assuming we're in agreement on that part, I
would think it would be consistent to put this in TID infrastructure,
such that all indexes that use such new TID infrastructure can be
defined to be global with only minimal effort.

> > And if we're going to change TID infrastructure to allow for more
> > sizes (as we'd need normal TableAM TIDs, and global index
> > partition-identifying TIDs), I'd argue that it should not be too much
> > more difficult to create an infrastructure for 'new TID' in which the
> > table AM supplies type, size and strict ordering information for these
> > 'new TID's.
> >
> > And if this 'new TID' size is not going to be defined by the index AM
> > but by the indexed object (be it a table or a 'global' or whatever
> > we'll build indexes on), I see no reason why this 'new TID'
> > infrastructure couldn't eventually support variable length TIDs; or
> > constant sized usertype TIDs (e.g. the 3 int columns of the primary
> > key of a clustered table).
>
> You're not considering the big picture. It's not self-evident that
> anybody will ever have much use for a variable-width TID in their
> table AM, at least beyond some fairly simple scheme -- because of the
> fundamental issue of TID not working as a stable identifier of logical
> rows in Postgres.

ZHeap states that it can implement stable TIDs within limits, as IIRC
it requires retail index deletion support for all indexes on the
updated columns of that table. I fail to see why this same
infrastructure could not be used to support clustered tables, while
strictly enforcing the limits that ZHeap only enforces softly (that
is, only allowing index AMs that support retail index tuple deletion).

> If it was very clear that there would be *some*
> significant benefit then the costs might start to look reasonable. But
> there isn't. "Build it and they will come" is not at all convincing to
> me.

Clustered tables / index-organized tables are very useful for tables in
which most columns are contained in the PK, or which are otherwise
often accessed in PK order. I don't know of any way to build a
clustered table _without_ including the primary key in some form in
the TID, or otherwise introducing a layer of indirection that would
undo the clustered access implied by the clustered table.

Additionally, compacting/re-clustering a table would be _much_ cheaper
for clustered tables, as the indexes attached to that table would not
need rebuilding: all TIDs will stay valid across the clustering
operation.

With regards,

Matthias van de Meent


