Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Msg-id: CAH2-Wzn3MLnyQAU_iToa6Q0b7r9SoJiUKLNt+Ji+hcJH9SKmRw@mail.gmail.com
In response to: Re: Deleting older versions in unique indexes to avoid page splits (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: Deleting older versions in unique indexes to avoid page splits (Victor Yegorov <vyegorov@gmail.com>)
List: pgsql-hackers
On Tue, Nov 17, 2020 at 12:45 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I am thinking of two more scenarios that require testing:
> > - queue in the table, with a high rate of INSERTs+DELETEs and a long transaction.
>
> I see your point. This is going to be hard to make work outside of
> unique indexes, though. Unique indexes are already not dependent on
> the executor hint -- they can just use the "uniquedup" hint. The code
> for unique indexes is prepared to notice duplicates in
> _bt_check_unique() in passing, and apply the optimization for that
> reason.
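
To make that point concrete, here is a tiny illustrative sketch (the
function name is made up; this is not the real code path). A
unique-index insertion must already scan the tuples with equal keys,
so noticing a duplicate there costs nothing extra:

#include "postgres.h"
#include "access/nbtree.h"
#include "storage/bufpage.h"

/*
 * Illustrative sketch only, not the actual code.  Assumes
 * itup_key->scantid is NULL, so _bt_compare() compares only the
 * user-visible key columns (not the heap TID).
 */
static bool
page_has_duplicate(Relation rel, BTScanInsert itup_key,
                   Page page, OffsetNumber offnum)
{
    OffsetNumber maxoff = PageGetMaxOffsetNumber(page);

    for (; offnum <= maxoff; offnum = OffsetNumberNext(offnum))
    {
        /* 0 means another tuple with an equal key exists */
        if (_bt_compare(rel, itup_key, page, offnum) == 0)
            return true;    /* duplicates make deletion promising */
    }

    return false;
}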

I thought about this some more. My first idea was to simply always try
out bottom-up deletion (i.e. behave as if the hint from the executor
always indicates that it's favorable). I couldn't really justify that
approach, though. It results in many bottom-up deletion passes that
end up wasting cycles (and unnecessarily accessing heap blocks).

Then I had a much better idea: make the existing LP_DEAD stuff a
little more like bottom-up index deletion. Today we usually have to
access the heap blocks that the index tuples point to anyway, in
order to get a latestRemovedXid cutoff (to generate recovery
conflicts). So it's worth scanning the leaf page for index tuples
whose TIDs point to the same heap blocks as the index tuples that
actually have their LP_DEAD bits set. That only costs a few extra CPU
cycles -- no additional heap blocks have to be accessed -- so trying
out these extra TIDs seems like a very good idea.
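
Here's a rough sketch of what I mean (a hypothetical function written
for illustration -- it isn't the actual patch, and it ignores posting
list tuples for simplicity). Given the offsets of the items that have
their LP_DEAD bits set, we note which heap blocks they point to, and
then treat every index tuple on the page that points into one of those
same blocks as a deletion candidate:

#include "postgres.h"
#include "access/itup.h"
#include "access/nbtree.h"
#include "storage/bufpage.h"

/*
 * Illustrative sketch only.  Collect deletion candidates: index
 * tuples whose TIDs point into the same heap blocks as the tuples
 * that already have their LP_DEAD bits set (the LP_DEAD items
 * themselves are included).  Trying the extra TIDs costs only CPU
 * cycles, since we must visit those heap blocks anyway to get a
 * latestRemovedXid cutoff.
 */
static int
collect_extra_candidates(Page page, OffsetNumber *deadoffsets,
                         int ndead, OffsetNumber *candidates)
{
    BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    BlockNumber deadblocks[MaxIndexTuplesPerPage];
    OffsetNumber offnum,
                minoff = P_FIRSTDATAKEY(opaque),
                maxoff = PageGetMaxOffsetNumber(page);
    int         ncandidates = 0;

    /* Heap blocks we'll be visiting anyway, per the LP_DEAD items */
    for (int i = 0; i < ndead; i++)
    {
        ItemId      itemid = PageGetItemId(page, deadoffsets[i]);
        IndexTuple  itup = (IndexTuple) PageGetItem(page, itemid);

        deadblocks[i] = ItemPointerGetBlockNumber(&itup->t_tid);
    }

    /* Any tuple pointing into one of those blocks is a free extra try */
    for (offnum = minoff; offnum <= maxoff;
         offnum = OffsetNumberNext(offnum))
    {
        ItemId      itemid = PageGetItemId(page, offnum);
        IndexTuple  itup = (IndexTuple) PageGetItem(page, itemid);
        BlockNumber blk = ItemPointerGetBlockNumber(&itup->t_tid);

        for (int i = 0; i < ndead; i++)
        {
            if (deadblocks[i] == blk)
            {
                candidates[ncandidates++] = offnum;
                break;
            }
        }
    }

    return ncandidates;
}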

I ran the regression tests with an enhanced version of the patch that
included this LP_DEAD-deletion-with-extra-TIDs behavior, plus custom
instrumentation showing exactly what happens in each case. We manage
to delete at least a small number of extra index tuples in almost all
cases, so practically everything benefits. In the majority of cases
we can delete significantly more: it's not uncommon for the number of
index tuples deleted to go from 1 - 10 or so without the enhancement
to 50 - 250 with it. Some individual LP_DEAD deletion calls can free
more than 50% of the space on the leaf page.

I believe that this is a lower-risk way of doing better when there is
a high rate of INSERTs+DELETEs. Most of the regression test cases I
looked at involved the larger system catalog indexes, which often
have exactly that access pattern.

It doesn't take much luck for a passing index scan to set at least
one or two LP_DEAD bits. And if there is any kind of physical/logical
correlation, those bits are bound to let us delete some extra index
tuples by the time the page looks like it might need to be split.

--
Peter Geoghegan


