Re: 64.4.2. Bottom-up Index Deletion - Mailing list pgsql-docs

From David G. Johnston
Subject Re: 64.4.2. Bottom-up Index Deletion
Msg-id CAKFQuwZ7=V8yXQB00KRNgDfQ=_8WMcw9WFmqYnhETfWENzC-6A@mail.gmail.com
In response to Re: 64.4.2. Bottom-up Index Deletion  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: 64.4.2. Bottom-up Index Deletion  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-docs
On Mon, Nov 7, 2022 at 5:20 PM Peter Geoghegan <pg@bowt.ie> wrote:
Hi Hussein,

Apologies for the very delayed response. I'm aware that you've taken
an interest in this subject as part of your YouTube channel. Thanks
for publicizing the work!

On Tue, Jul 12, 2022 at 7:14 PM PG Doc comments form
<noreply@postgresql.org> wrote:
> Would be nice to add a note: old tuple versions in the index referencing the
> same logical row cannot be deleted by the bottom-up index deletion process
> while older transactions that might require the old state of the row are
> still running

It's really hard to write documentation for something like this,
because it's difficult to decide what your audience really needs to
know. I agree that it's important to get this specific point across,
though. In fact I thought that I already conveyed the same idea at
this point:

"All indexes will need a successor physical index tuple that points to
the latest version in the table. Each new tuple within each index will
generally need to coexist with the original “updated” tuple for a
short period of time (typically until shortly after the UPDATE
transaction commits)."

The implication is that we need the old version to coexist until after
the updater transaction commits and is seen by every possible MVCC
snapshot as having committed -- nobody sees the old version anymore.
Maybe we could augment the existing sentences I have highlighted?
Could it be more explicit?
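
To make the implication concrete, here is a minimal toy model of the rule, not PostgreSQL's actual implementation. The names `Snapshot`, `OldVersion`, and `still_needed` are hypothetical, a snapshot is reduced to a single `xmin` horizon, and transaction ID wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Snapshot:
    # Assumption: every transaction ID below xmin is finished as far as
    # this snapshot is concerned (real snapshots carry more state).
    xmin: int


@dataclass(frozen=True)
class OldVersion:
    xmax: int             # ID of the UPDATE transaction that superseded this version
    xmax_committed: bool  # whether that UPDATE has committed


def still_needed(version: OldVersion, snapshots: Iterable[Snapshot]) -> bool:
    """Return True while the old version must coexist with its successor."""
    if not version.xmax_committed:
        # The UPDATE may still abort, so the old version may become current again.
        return True
    # Needed as long as some snapshot does not yet see the UPDATE as committed.
    return any(version.xmax >= snap.xmin for snap in snapshots)
```

In this model the old version stops being needed only once the updater has committed and every remaining snapshot has an `xmin` beyond the updater's transaction ID.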

I'm having trouble finding any major issues with the present wording, though it seems to assume the reader has enough MVCC knowledge to understand the import of "until shortly after the UPDATE transaction commits". Maybe a bit more explicitness is in order.

On the point of "will generally need to coexist", though, I don't see why we are being wishy-washy here.

When updating a row where bottom-up deletion is chosen, the most recent tuple cannot be removed to make room for the new tuple, in particular because the current update may not commit.

I also don't understand how the bottom-up pass can know that a tuple is safe to remove based on visibility information when that information is not present in the index and the pass doesn't rely on LP_DEAD.

A bit nit-picky, but I think relevant to the above confusion:

"B-Tree indexes incrementally delete" - is it really the index modifying itself, or is it an active user session taking some time to perform each pass? Describing it as, say, the following might be clearer:

"The updating session will locate all the logically equivalent tuples (on the same page) via the index and check them for global visibility, removing those that it finds that are both older than the most recent tuple and no longer visible to any other session."

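The proposed wording could be sketched roughly as follows. This is a toy illustration of the description above, not how PostgreSQL implements it: `IndexTuple`, `HeapVersion`, `bottom_up_pass`, and the single `oldest_xmin` horizon are all hypothetical, and the visibility check is modeled as a lookup into an in-memory "heap" dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapVersion:
    xmax: Optional[int]   # transaction that superseded this version, None if current
    xmax_committed: bool  # whether that transaction has committed


@dataclass
class IndexTuple:
    key: int
    tid: int  # hypothetical pointer into the toy heap below


def bottom_up_pass(page: List[IndexTuple],
                   heap: Dict[int, HeapVersion],
                   oldest_xmin: int) -> List[IndexTuple]:
    """Return the index tuples that survive one pass over a single page."""
    kept = []
    for itup in page:
        version = heap[itup.tid]  # visibility lives in the heap, not in the index
        dead_to_all = (
            version.xmax is not None
            and version.xmax_committed
            and version.xmax < oldest_xmin  # no session can still see this version
        )
        if not dead_to_all:
            kept.append(itup)
    return kept
```

The most recent version has no committed successor in this model, so it is always kept, which matches the point that the current tuple cannot be removed to make room for the new one.
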
David J.
