Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: MaxOffsetNumber for Table AMs |
Date | |
Msg-id | CA+TgmoZeVGJ0_SNhJkFG=+OPD4GUKYEMwNJNo4vPtSK4Tn2cJQ@mail.gmail.com |
In response to | Re: MaxOffsetNumber for Table AMs (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: MaxOffsetNumber for Table AMs |
List | pgsql-hackers |
On Wed, May 5, 2021 at 1:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I don't think this is true at all. If you have a clustered index -
> > i.e. the table is physically arranged according to the index ordering
> > - then your secondary indexes all pretty much have to be what we're
> > calling indirect indexes. They can hardly point to a physical
> > identifier if rows are being moved around. I believe InnoDB works this
> > way, and I think Oracle's index-organized tables do too. I suspect
> > there are other examples.
>
> But these systems don't have indirect indexes *on a heap table*! Why
> would they ever do it that way? They already have rowid/TID as a
> stable identifier of logical rows, so having indirect indexes that
> point to a heap table's rows would be strictly worse than the generic
> approach for indexes on a heap table.

One advantage of indirect indexes is that you can potentially avoid a
lot of writes to the index. If a non-HOT update is performed, but the
primary key is not updated, the index does not need to be touched. I
think that's a potentially significant savings, even if bottom-up
index deletion would have prevented the page splits. Similarly, you
can mark a dead line pointer unused without having to scan the
indirect index, because the index isn't pointing to that dead line
pointer anyway.

Hmm, but I guess you have another cleanup problem. What prevents
someone from inserting a new row with the same primary key as a
previously-deleted row but different values in some indirectly-indexed
column? Then the old index entries, if still present, could mistakenly
refer to the new row. I don't know whether Alvaro thought of that
problem when he was working on this previously, or whether he solved
it somehow. Possibly that's a big enough problem that the whole idea
is dead in the water, but it's not obvious to me that this is so.

And, anyway, this whole argument is predicated on the fact that the
only table AM we have right now is heapam. If we had a table AM that
organized the data by primary key value, we'd still want to be able to
have secondary indexes, and they'd have to use the primary key value
as the TID.

> I think that global indexes are well worth having, and should be
> solved some completely different way. The partition key can be an
> additive thing.

I agree that the partition identifier should be an additive thing, but
where would we add it? It seems to me that the obvious answer is to
make it a column of the index tuple. And if we can do that, why can't
we put whatever kind of TID-like stuff people want in the index tuple,
too? Maybe part of the problem here is that I don't actually
understand how posting lists are represented...

--
Robert Haas
EDB: http://www.enterprisedb.com
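
The write-savings argument and the stale-entry hazard described above can be made concrete with a toy model. The Python sketch below is only an illustration under stated assumptions: `ToyTable`, its dictionaries, and the `recheck` flag are all hypothetical names invented here, not PostgreSQL structures or anything from the indirect-index patch. It models a conventional secondary index as `email -> TIDs` and an indirect one as `email -> primary keys`. The `recheck` guard is one conceivable mitigation, shown as an assumption. The message does not say how, or whether, the problem was solved.

```python
class ToyTable:
    """Toy heap: each row version lives at a TID; an update writes a new TID."""

    def __init__(self):
        self.heap = {}            # tid -> row version (dict)
        self.next_tid = 0
        self.pk_index = {}        # primary key -> TID of the current version
        self.direct = {}          # email -> set of TIDs (conventional index)
        self.indirect = {}        # email -> set of primary keys (indirect index)
        self.direct_writes = 0    # index insertions performed so far
        self.indirect_writes = 0

    def _store(self, row):
        # Every new row version gets a new TID, so the direct index
        # needs a new entry each time.
        tid = self.next_tid
        self.next_tid += 1
        self.heap[tid] = row
        self.pk_index[row["id"]] = tid
        self.direct.setdefault(row["email"], set()).add(tid)
        self.direct_writes += 1

    def insert(self, row):
        self._store(dict(row))
        self.indirect.setdefault(row["email"], set()).add(row["id"])
        self.indirect_writes += 1

    def update_non_key(self, pk, **changes):
        # Models a non-HOT update that leaves the primary key and the
        # indexed column alone: the indirect index is not touched.
        assert "id" not in changes and "email" not in changes
        old = self.heap[self.pk_index[pk]]
        self._store({**old, **changes})

    def delete(self, pk):
        # Deliberately leaves index entries behind, as if cleanup
        # had not run yet.
        tid = self.pk_index.pop(pk)
        del self.heap[tid]

    def lookup_indirect(self, email, recheck):
        rows = []
        for pk in self.indirect.get(email, ()):
            tid = self.pk_index.get(pk)
            if tid is None:
                continue          # primary key no longer exists
            row = self.heap[tid]
            if recheck and row["email"] != email:
                continue          # stale entry: key was reused by another row
            rows.append(row)
        return rows


t = ToyTable()
t.insert({"id": 1, "email": "a@example.com", "visits": 0})
t.update_non_key(1, visits=1)
writes_after_update = (t.direct_writes, t.indirect_writes)

# Reuse primary key 1 with a different indexed value while the old
# indirect entry for "a@example.com" is still present.
t.delete(1)
t.insert({"id": 1, "email": "b@example.com", "visits": 0})
stale = t.lookup_indirect("a@example.com", recheck=False)
safe = t.lookup_indirect("a@example.com", recheck=True)
```

In this toy, the update costs the direct index a second insertion while the indirect index stays at one. After the delete and reinsert, the lookup without a recheck follows the leftover entry to the new row, which is exactly the mistaken reference the message worries about.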