Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: MaxOffsetNumber for Table AMs |
Date | |
Msg-id | CAEze2WgnnYRK-Lp4Ch6NN+y-fjnaeDgEfBcER5m-8gu7QjCFbg@mail.gmail.com |
In response to | Re: MaxOffsetNumber for Table AMs (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: MaxOffsetNumber for Table AMs |
List | pgsql-hackers |
On Wed, 5 May 2021 at 22:09, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Wed, May 5, 2021 at 12:43 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I believe that it cannot be "just" an additive thing, at least not
> > through a normal INCLUDEd column, as you'd get duplicate TIDs in the
> > index, with its related problems. You also cannot add it as a key
> > column, as this would disable UNIQUE indexes; one of the largest use
> > cases of global indexes. So, you must create specialized
> > infrastructure for this identifier.
>
> You're just quibbling about the precise words that I used. Of course
> it is true that there must be some sense in which a global index
> partition key attribute will need to be special to the implementation
> -- how else could a global index enforce uniqueness? That was clearly
> implied.

This implication was not 100% clear to me, and the last thread on
global indexes that implemented it through INCLUDEd columns didn't
mention this. As such, I wanted to explicitly mention that this
partition/table identifier would need to be part of the keyspace.

> > And when we're already adding specialized infrastructure, then this
> > should probably be part of a new TID infrastructure.
>
> This is a non-sequitur.

I may have skipped some reasoning:

I believe that the TID is the unique identifier of that tuple, within
context.

For normal indexes, the TID as supplied directly by the TableAM is
sufficient, as the context is that table.
For global indexes, this TID must include enough information to relate
it to the table the tuple originated from.
In the whole database, that would be the OID of the table + the TID as
supplied by the table.

As such, the identifier of the logical row (which can be called the
TID), as stored in index tuples in global indexes, would need to
consist of the TableAM supplied TID + the (local) id of the table
containing the tuple.

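To make the shape of such an identifier concrete, here is a minimal
sketch. It is not PostgreSQL code: `LocalTid`, `GlobalTid` and both
comparison functions are hypothetical names, and the block/offset layout
is only assumed from heap-style TIDs. It shows a table id placed in
front of the TableAM-supplied TID, with a strict ordering defined over
the pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TID a heap-like table AM supplies:
 * a block number plus an offset within that block. */
typedef struct LocalTid {
    uint32_t block;
    uint16_t offset;
} LocalTid;

/* Hypothetical row identifier for a global index: the (local) id of the
 * table the tuple lives in, followed by the TID that table supplied. */
typedef struct GlobalTid {
    uint32_t table_id;
    LocalTid tid;
} GlobalTid;

/* Three-way comparison of two unsigned values: returns -1, 0 or 1. */
static int cmp_u32(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}

/* Ordering within a single table: block first, then offset. */
int local_tid_cmp(const LocalTid *a, const LocalTid *b)
{
    assert(a != NULL && b != NULL);
    int c = cmp_u32(a->block, b->block);
    if (c != 0)
        return c;
    return cmp_u32(a->offset, b->offset);
}

/* Ordering across tables: table id first, then the local TID. */
int global_tid_cmp(const GlobalTid *a, const GlobalTid *b)
{
    assert(a != NULL && b != NULL);
    int c = cmp_u32(a->table_id, b->table_id);
    if (c != 0)
        return c;
    return local_tid_cmp(&a->tid, &b->tid);
}
```

Two tuples at the same (block, offset) in different tables compare equal
under `local_tid_cmp` but are distinct under `global_tid_cmp`, which is
the property a global index would need from its row identifier.
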
Assuming we're in agreement on that part, I would think it would be
consistent to put this in TID infrastructure, such that all indexes
that use such new TID infrastructure can be defined to be global with
only minimal effort.

> > And if we're going to change TID infrastructure to allow for more
> > sizes (as we'd need normal TableAM TIDs, and global index
> > partition-identifying TIDs), I'd argue that it should not be too much
> > more difficult to create an infrastructure for 'new TID' in which the
> > table AM supplies type, size and strict ordering information for these
> > 'new TID's.
> >
> > And if this 'new TID' size is not going to be defined by the index AM
> > but by the indexed object (be it a table or a 'global' or whatever
> > we'll build indexes on), I see no reason why this 'new TID'
> > infrastructure couldn't eventually support variable length TIDs; or
> > constant sized usertype TIDs (e.g. the 3 int columns of the primary
> > key of a clustered table).
>
> You're not considering the big picture. It's not self-evident that
> anybody will ever have much use for a variable-width TID in their
> table AM, at least beyond some fairly simple scheme -- because of the
> fundamental issue of TID not working as a stable identifier of logical
> rows in Postgres.

ZHeap states that it can implement stable TIDs within limits, as IIRC
it requires retail index deletion support for all indexes on the
updated columns of that table. I fail to see why this same
infrastructure could not be used for supporting clustered tables,
while strictly enforcing the limits that ZHeap only soft-enforces
(that is, only allowing index AMs that support retail index tuple
deletion).

> If it was very clear that there would be *some*
> significant benefit then the costs might start to look reasonable. But
> there isn't. "Build it and they will come" is not at all convincing to
> me.

Clustered tables / Index-oriented Tables are very useful for tables in
which most columns are contained in the PK, or which are otherwise
often ordered by their PK. I don't know of any way that would allow us
to build a clustered table _without_ including the primary key in some
form in the TID, or otherwise introducing a layer of indirection that
would undo the clustered access implied by the clustered table.

Additionally, compacting/re-clustering a table would be _much_ cheaper
for clustered tables, as the indexes attached to that table would not
need rebuilding: all TIDs will stay valid across the clustering
operation.

With regards,

Matthias van de Meent