Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers

From Victor Yegorov
Subject Re: Deleting older versions in unique indexes to avoid page splits
Date
Msg-id CAGnEbohMXdMKVLaU_SwKrZM1q4RGO4_-U2Ati6-8FvwUTFkkcQ@mail.gmail.com
Whole thread Raw
In response to Re: Deleting older versions in unique indexes to avoid page splits  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Deleting older versions in unique indexes to avoid page splits  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
ср, 25 нояб. 2020 г. в 05:35, Peter Geoghegan <pg@bowt.ie>:
Then I had a much better idea: Make the existing LP_DEAD stuff a
little more like bottom-up index deletion. We usually have to access
heap blocks that the index tuples point to today, in order to have a
latestRemovedXid cutoff (to generate recovery conflicts). It's worth
scanning the leaf page for index tuples with TIDs whose heap block
matches the index tuples that actually have their LP_DEAD bits set.
This only consumes a few more CPU cycles. We don't have to access any
more heap blocks to try these extra TIDs, so it seems like a very good
idea to try them out.

I don't seem to understand this.

Is it: we're scanning the leaf page for all LP_DEAD tuples that point to the same
heap block? Which heap block we're talking about here, the one that holds
entry we're about to add (the one that triggered bottom-up-deletion due to lack
of space I mean)?

I ran the regression tests with an enhanced version of the patch, with
this LP_DEAD-deletion-with-extra-TIDs thing. It also had custom
instrumentation that showed exactly what happens in each case. We
manage to delete at least a small number of extra index tuples in
almost all cases -- so we get some benefit in practically all cases.
And in the majority of cases we can delete significantly more. It's
not uncommon to increase the number of index tuples deleted. It could
go from 1 - 10 or so without the enhancement to LP_DEAD deletion, to
50 - 250 with the LP_DEAD enhancement. Some individual LP_DEAD
deletion calls can free more than 50% of the space on the leaf page.

I am missing a general perspective here.

Is it true, that despite the long (vacuum preventing) transaction we can re-use space,
as after the DELETE statements commits, IndexScans are setting LP_DEAD hints after
they check the state of the corresponding heap tuple?

If my thinking is correct for both cases — nature of LP_DEAD hint bits and the mechanics of
suggested optimization — then I consider this a very promising improvement!

I haven't done any testing so far since sending my last e-mail.
If you'll have a chance to send a new v10 version with LP_DEAD-deletion-with-extra-TIDs thing,
I will do some tests (planned).

--
Victor Yegorov

pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: Implementing Incremental View Maintenance
Next
From: Masahiko Sawada
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2