Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers

From Robert Haas
Subject Re: MaxOffsetNumber for Table AMs
Msg-id CA+TgmoZeVGJ0_SNhJkFG=+OPD4GUKYEMwNJNo4vPtSK4Tn2cJQ@mail.gmail.com
In response to Re: MaxOffsetNumber for Table AMs  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: MaxOffsetNumber for Table AMs
List pgsql-hackers

On Wed, May 5, 2021 at 1:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I don't think this is true at all. If you have a clustered index -
> > i.e. the table is physically arranged according to the index ordering
> > - then your secondary indexes all pretty much have to be what we're
> > calling indirect indexes. They can hardly point to a physical
> > identifier if rows are being moved around. I believe InnoDB works this
> > way, and I think Oracle's index-organized tables do too. I suspect
> > there are other examples.
>
> But these systems don't have indirect indexes *on a heap table*! Why
> would they ever do it that way? They already have rowid/TID as a
> stable identifier of logical rows, so having indirect indexes that
> point to a heap table's rows would be strictly worse than the generic
> approach for indexes on a heap table.

One advantage of indirect indexes is that you can potentially avoid a
lot of writes to the index. If a non-HOT update is performed, but
neither the primary key nor the indirectly-indexed column is updated,
the indirect index does not need to be touched. I
think that's a potentially significant savings, even if bottom-up
index deletion would have prevented the page splits. Similarly, you
can mark a dead line pointer unused without having to scan the
indirect index, because the index isn't pointing to that dead line
pointer anyway.
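
To make the write savings concrete, here is a toy Python model. It is
only a sketch and not PostgreSQL code: the class ToyTable, its methods,
and the write counters are hypothetical names invented for the
illustration, and every update is assumed to be non-HOT (the new row
version lands at a new physical TID).

```python
class ToyTable:
    """Toy heap: every update writes a new row version at a new TID."""

    def __init__(self):
        self.next_tid = 0
        self.live_tid = {}         # primary key -> TID of the live version
        self.heap = {}             # TID -> (primary key, indexed value)
        self.direct_idx = set()    # entries of the form (value, TID)
        self.indirect_idx = set()  # entries of the form (value, primary key)
        self.direct_writes = 0     # index insertions, counted per index
        self.indirect_writes = 0

    def _store(self, pk, value):
        tid = self.next_tid
        self.next_tid += 1
        self.heap[tid] = (pk, value)
        self.live_tid[pk] = tid
        return tid

    def insert(self, pk, value):
        tid = self._store(pk, value)
        self.direct_idx.add((value, tid))
        self.direct_writes += 1
        self.indirect_idx.add((value, pk))
        self.indirect_writes += 1

    def update_non_hot(self, pk, value):
        _, old_value = self.heap[self.live_tid[pk]]
        new_tid = self._store(pk, value)
        # The direct index must learn about the new TID on every update.
        self.direct_idx.add((value, new_tid))
        self.direct_writes += 1
        # The indirect index changes only if the indexed value changed.
        if value != old_value:
            self.indirect_idx.add((value, pk))
            self.indirect_writes += 1


t = ToyTable()
t.insert(1, "alice@example.com")
for _ in range(3):
    # Some other column changed; primary key and indexed value did not.
    t.update_non_hot(1, "alice@example.com")

print(t.direct_writes, t.indirect_writes)  # 4 1
```

In this model the direct index takes one insertion per row version,
while the indirect index is written once, at insert time.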

Hmm, but I guess you have another cleanup problem. What prevents
someone from inserting a new row with the same primary key as a
previously-deleted row but different values in some indirectly-indexed
column? Then the old index entries, if still present, could mistakenly
refer to the new row. I don't know whether Alvaro thought of that
problem when he was working on this previously, or whether he solved
it somehow. Possibly that's a big enough problem that the whole idea
is dead in the water, but it's not obvious to me that this is so.
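
The hazard can be reproduced in a few lines of Python. This is a
sketch under simplifying assumptions, not a model of any real
implementation: rows, indirect_idx, and the three functions are
hypothetical, and "delete" is assumed to remove the row without
cleaning up the indirect index.

```python
rows = {}             # primary key -> live row
indirect_idx = set()  # entries of the form (city, primary key)


def insert(pk, city):
    rows[pk] = {"pk": pk, "city": city}
    indirect_idx.add((city, pk))


def delete(pk):
    # The row disappears, but its indirect index entry is left behind.
    del rows[pk]


def lookup_by_city(city):
    # Follow each matching index entry to whatever row has that key now.
    return [rows[pk] for (c, pk) in sorted(indirect_idx)
            if c == city and pk in rows]


insert(1, "Paris")
delete(1)
insert(1, "Rome")  # same primary key, different indexed value

stale_hits = lookup_by_city("Paris")
print(stale_hits)  # [{'pk': 1, 'city': 'Rome'}]
```

The leftover ("Paris", 1) entry resolves to the new row, whose city is
"Rome", so a lookup for "Paris" returns a row that does not match.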

And, anyway, this whole argument is predicated on the fact that the
only table AM we have right now is heapam. If we had a table AM that
organized the data by primary key value, we'd still want to be able to
have secondary indexes, and they'd have to use the primary key value
as the TID.
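
As a rough illustration of why such a table AM would need the primary
key as the row identifier, consider this Python sketch. ClusteredTable
and its methods are hypothetical; a sorted list stands in for the
physical storage order, and duplicate keys are not handled.

```python
import bisect


class ClusteredTable:
    """Toy table kept physically sorted by primary key."""

    def __init__(self):
        self.keys = []       # sorted primary keys; list position = location
        self.rows = []       # rows, stored in the same order as self.keys
        self.secondary = {}  # indexed value -> set of primary keys

    def insert(self, pk, value):
        pos = bisect.bisect_left(self.keys, pk)
        # Inserting here shifts every later row to a new position.
        self.keys.insert(pos, pk)
        self.rows.insert(pos, (pk, value))
        self.secondary.setdefault(value, set()).add(pk)

    def physical_position(self, pk):
        return bisect.bisect_left(self.keys, pk)

    def fetch_by_value(self, value):
        # The secondary index yields primary keys, which are then
        # resolved to wherever the rows currently live.
        return [self.rows[self.physical_position(pk)]
                for pk in sorted(self.secondary.get(value, ()))]


t = ClusteredTable()
t.insert(50, "x")
pos_before = t.physical_position(50)
t.insert(10, "y")  # sorts ahead of key 50, so that row moves
pos_after = t.physical_position(50)

print(pos_before, pos_after)  # 0 1
print(t.fetch_by_value("x"))  # [(50, 'x')]
```

The row with key 50 changes physical position, yet the secondary index
entry stays valid because it stores the key rather than the position.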

> I think that global indexes are well worth having, and should be
> solved some completely different way. The partition key can be an
> additive thing.

I agree that the partition identifier should be an additive thing, but
where would we add it? It seems to me that the obvious answer is to
make it a column of the index tuple. And if we can do that, why can't
we put whatever kind of TID-like stuff people want in the index tuple,
too? Maybe part of the problem here is that I don't actually
understand how posting lists are represented...
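
For what it's worth, the "column of the index tuple" idea might look
something like the Python sketch below. Everything here is an
assumption made for illustration: GlobalIndexTuple, its field names,
and the (block, offset) locator are hypothetical, and the sketch says
nothing about how posting lists are actually represented.

```python
from typing import NamedTuple, Tuple


class GlobalIndexTuple(NamedTuple):
    key: str                  # the indexed column value
    partition_id: int         # hypothetical extra column naming the partition
    locator: Tuple[int, int]  # (block, offset) here; any TID-like value would do


# Entries are kept sorted by (key, partition_id, locator).
entries = sorted([
    GlobalIndexTuple("bob", 2, (0, 1)),
    GlobalIndexTuple("alice", 2, (3, 7)),
    GlobalIndexTuple("alice", 1, (5, 2)),
])


def lookup(key):
    # Each hit says which partition to visit and where the row is in it.
    return [(e.partition_id, e.locator) for e in entries if e.key == key]


print(lookup("alice"))  # [(1, (5, 2)), (2, (3, 7))]
```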

-- 
Robert Haas
EDB: http://www.enterprisedb.com


