Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: MaxOffsetNumber for Table AMs |
Date | |
Msg-id | CAH2-WznmdB4AUx-KhiSGFbtyr0X5gEoB1AzzAQV2t69o1Krm_w@mail.gmail.com |
In response to | Re: MaxOffsetNumber for Table AMs (Andres Freund <andres@anarazel.de>) |
Responses |
Re: MaxOffsetNumber for Table AMs
|
List | pgsql-hackers |
On Mon, May 3, 2021 at 10:01 PM Andres Freund <andres@anarazel.de> wrote:
> > For example, the TIDs should always work like unsigned integers -- the
> > table AM must be willing to work with that restriction.
>
> Isn't that more a question of the encoding than the concrete representation?

I don't think so, no. How does B-Tree deduplication work without something like that? The fact of the matter is that things are very tightly coupled in all kinds of ways. I'm all for decoupling them to the extent required to facilitate a new and useful table AM. But I am unlikely to commit to months of work based on abstract arguments and future work. I think that you'll find that I'm not the only one that sees it that way.

> > You'd then have posting lists tuples in nbtree whose TIDs were all
> > either 6 bytes or 8 bytes wide, with a mix of each possible (though
> > not particularly likely) on the same leaf page. Say when you have a
> > table that exceeds the current MaxBlockNumber restrictions. It would
> > be relatively straightforward for nbtree deduplication to simply
> > refuse to mix 6 byte and 8 byte datums together to avoid complexity in
> > boundary cases. The deduplication pass logic has the flexibility that
> > this requires already.
>
> Which nbtree cases do you think would have an easier time supporting
> switching between 6 or 8 byte tids than supporting fully variable width
> tids? Given that IndexTupleData already is variable-width, it's not
> clear to me why supporting two distinct sizes would be harder than a
> fully variable size? I assume it's things like BTDedupState->htids?

Stuff like that, yeah. The space utilization stuff inside nbtsplitloc.c and nbtdedup.c pretty much rests on the assumption that TIDs are fixed width. Obviously there are some ways in which that could be revised if there was a really good reason to do so -- like an actual concrete reason with some clear basis in reality.
You have no obligation to make me happy, but FYI I find arguments like "but why wouldn't you just allow arbitrary-width TIDs?" to be deeply unconvincing. Do you really expect me to do a huge amount of work and risk a lot of new bugs, just to facilitate something that may or may not ever happen? Would you do that if you were in my position?

> I don't think anybody is arguing that AMs cannot accept any restrictions? I do
> think it's pretty clear that it's not entirely obvious what the concrete set
> of proper restrictions would be, where we won't end up needing to re-evaluate
> limits in a few years are.

I'm absolutely fine with the fact that the table AM has these issues -- I would expect it. I would like to help! I just find these wildly abstract discussions to be close to a total waste of time. The idea that we should let a thousand table AM flowers bloom and then review what to do seems divorced from reality. Even if the table AM becomes wildly successful there will still only have been maybe 2 - 4 table AMs that ever really had a chance. Supposing that we have no idea what they could possibly look like just yet is just navel gazing.

> If you add to that the fact that variable-width tids will often end up
> considerably smaller than our current tids, it's not obvious why we should use
> bitspace somewhere to indicate an 8 byte tid instead of a a variable-width
> tid?

It's not really the space overhead. It's the considerable complexity that it would add.

--
Peter Geoghegan
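
For readers following the thread, here is a minimal sketch of what "TIDs should always work like unsigned integers" could look like in practice. It is not taken from the message or from the PostgreSQL source: `DemoTid`, `demo_tid_to_u64`, `demo_tid_cmp`, and `demo_posting_list_is_sorted` are hypothetical names, and the struct is only a stand-in for a 6-byte TID (a 32-bit block number plus a 16-bit offset number). The assumption illustrated is that a fixed-width TID maps to a single unsigned key, so an array of TIDs such as a posting list can be kept in ascending order and checked with plain integer comparisons.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 6-byte TID: a 32-bit block number plus a
 * 16-bit offset number. This is NOT PostgreSQL's ItemPointerData layout,
 * only an illustration of the "TID behaves like an unsigned integer" idea.
 */
typedef struct DemoTid
{
    uint32_t block;   /* block number, most significant part of the key */
    uint16_t offset;  /* offset number within the block */
} DemoTid;

/* Map a TID to a 48-bit unsigned key stored in a uint64_t. */
static uint64_t
demo_tid_to_u64(DemoTid tid)
{
    return ((uint64_t) tid.block << 16) | (uint64_t) tid.offset;
}

/* Three-way comparison: negative, zero, or positive, like memcmp(). */
static int
demo_tid_cmp(DemoTid a, DemoTid b)
{
    uint64_t ka = demo_tid_to_u64(a);
    uint64_t kb = demo_tid_to_u64(b);

    return (ka > kb) - (ka < kb);
}

/*
 * Return 1 if the array is in strictly ascending TID order (no duplicates),
 * 0 otherwise. Arrays of zero or one element count as sorted.
 */
static int
demo_posting_list_is_sorted(const DemoTid *tids, int ntids)
{
    assert(ntids >= 0);

    for (int i = 1; i < ntids; i++)
    {
        if (demo_tid_cmp(tids[i - 1], tids[i]) >= 0)
            return 0;
    }
    return 1;
}
```

Because every key in this sketch has the same width, the space needed for `ntids` TIDs is simply `ntids` times a constant, which is the kind of fixed-width assumption the message says the space utilization logic relies on. A variable-width encoding would need a different size calculation.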