Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: MaxOffsetNumber for Table AMs |
Date | |
Msg-id | CA+TgmoZLNeGMp_y7ri1CFt_WBQ0rQMm=uxHUGXXnqKF8S029KA@mail.gmail.com |
In response to | Re: MaxOffsetNumber for Table AMs (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: MaxOffsetNumber for Table AMs |
List | pgsql-hackers |
On Wed, May 5, 2021 at 11:50 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I'm being very vocal here because I'm concerned that we're going about
> generalizing TIDs in the wrong way. To me it feels like there is a
> loss of perspective about what really matters.

Well, which things matter is a question of opinion, not fact.

> No other database system has something like indirect indexes. They
> have clustered indexes, but that's rather different.

I don't think this is true at all. If you have a clustered index - i.e. the table is physically arranged according to the index ordering - then your secondary indexes all pretty much have to be what we're calling indirect indexes. They can hardly point to a physical identifier if rows are being moved around. I believe InnoDB works this way, and I think Oracle's index-organized tables do too. I suspect there are other examples.

> > There might be some slight disagreement about whether it's useful to
> > generalize TIDs from a 48-bit address space to a 64-bit address space
> > without making it fully general. Like Andres, I am unconvinced that's
> > meaningfully easier, and I am convinced that it's meaningfully less
> > good, but other people can disagree and that's fine. I'm perfectly
> > willing to change my opinion if somebody shows up with a patch that
> > demonstrates the value of this approach.
>
> It's going to be hard if not impossible to provide empirical evidence
> for the proposition that 64-bit wide TIDs (alongside 48-bit TIDs) are
> the way to go. Same with any other scheme. We're talking way too much
> about TIDs themselves and way too little about table AM use cases, the
> way the data structures might work in new table AMs, and so on.

I didn't mean that it has to be a test result showing that 64-bit TIDs outperform 56-bit TIDs or something. I just meant there has to be a reason to believe it's good, which could be based on a discussion of use cases or whatever.
If we *don't* have a reason to believe it's good, we shouldn't do it.

My point is that so far I am not seeing a whole lot of value of this proposed approach. For a 64-bit TID to be valuable to you, one of two things has to be true: you either don't care about having indexes that store TIDs on your new table type, or the index types you want to use can store those 64-bit TIDs. Now, I have not yet heard of anyone working on a table AM who does not want to be able to support adding btree indexes. There may be someone that I don't know about, and if so, fine. But otherwise, we need a way to store them. And that requires changing the page format for btree indexes. But surely we do not want to make all TIDs everywhere wider in future btree versions, so at least two TID widths - 6 bytes and 8 bytes - would have to be supported. And if we're at all going to do that, I think it's certainly worth asking whether supporting varlena TIDs would really be all that much harder. You seem to think it is, and you might be right, but I'm not ready to give up, because I do not see how we are ever going to get global indexes or indirect indexes without doing it, and those would be good features to have.

If we can't ever get them, so be it, but you seem to kind of be saying that things like global indexes and indirect indexes are hard, and therefore they don't count as reasons why we might want variable-width TIDs. But one very large reason why those things are hard is that they require variable-width TIDs, so AFAICS this boils down to saying that we don't want the feature because it's hard to implement.

But we should not conflate feasibility with desirability. I am quite sure that lots of people want global indexes. The number of people who want indirect indexes is in my estimation much smaller, but it's probably not zero, or else Alvaro wouldn't have tried his hand at writing a patch.
Whether we can *get* those things is in doubt; whether it will happen in the near future is very much in doubt. But I at least am not in doubt about whether people want it, because I hear complaints about the lack of global indexes on an almost-daily basis. If those complaints are all from people hoping to fake me out into spending time on something that is worthless to them, my colleagues are very good actors.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
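
Illustrative note (not part of the original message, and not code from any patch in this thread): the message argues that a btree able to store 8-byte TIDs would have to support at least two widths, 6 bytes and 8 bytes, and asks whether varlena TIDs would be much harder. The sketch below is one hypothetical way to picture that trade-off. It assumes a one-byte tag in front of each row identifier; the tag values, the function names `encode_tid` and `decode_tid`, and the byte layout are all invented for illustration and do not describe PostgreSQL's actual on-disk format.

```python
import struct

# Hypothetical tag values; PostgreSQL's btree has no such field today.
TID_FIXED6 = 0  # classic 48-bit TID: 32-bit block number + 16-bit offset
TID_FIXED8 = 1  # hypothetical 64-bit row identifier
TID_VARLEN = 2  # hypothetical variable-width identifier (e.g. a key value)


def encode_tid(kind, value):
    """Return a tag byte followed by the payload for the given TID kind."""
    if kind == TID_FIXED6:
        block, offset = value
        return struct.pack(">BIH", kind, block, offset)  # 1 + 4 + 2 bytes
    if kind == TID_FIXED8:
        return struct.pack(">BQ", kind, value)  # 1 + 8 bytes
    if kind == TID_VARLEN:
        if len(value) > 255:
            raise ValueError("payload too long for a 1-byte length prefix")
        return struct.pack(">BB", kind, len(value)) + bytes(value)
    raise ValueError(f"unknown TID kind: {kind}")


def decode_tid(buf, pos=0):
    """Decode one TID starting at buf[pos]; return (kind, value, next_pos)."""
    kind = buf[pos]
    if kind == TID_FIXED6:
        block, offset = struct.unpack_from(">IH", buf, pos + 1)
        return kind, (block, offset), pos + 7
    if kind == TID_FIXED8:
        (value,) = struct.unpack_from(">Q", buf, pos + 1)
        return kind, value, pos + 9
    if kind == TID_VARLEN:
        length = buf[pos + 1]
        start = pos + 2
        return kind, bytes(buf[start:start + length]), start + length
    raise ValueError(f"unknown TID kind: {kind}")
```

In this toy layout, once the reader has to branch on two fixed widths, the variable-width case adds only a length prefix to the same dispatch. That says nothing about the real costs the thread is debating, such as page-format changes in btree.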