Re: [WIP] [B-Tree] Retail IndexTuple deletion - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [WIP] [B-Tree] Retail IndexTuple deletion
Date
Msg-id CAH2-WzmPf0pmnS4go9+MmqQMA0MijsZ9XR+zdMy9J2gkEzadxA@mail.gmail.com
Whole thread Raw
In response to Re: [WIP] [B-Tree] Retail IndexTuple deletion  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [WIP] [B-Tree] Retail IndexTuple deletion  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
List pgsql-hackers
On Mon, Jul 2, 2018 at 9:28 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> Execution time of last "VACUUM test;" command on my notebook was:
>>
>> with bulk deletion: 1.6 s;
>> with Quick Vacuum Strategy: 5.2 s;
>> with Quick Vacuum Strategy & TID sorting: 0.6 s.
>
> I'm glad that you looked into this. You could make this faster still,
> by actually passing the lowest heap TID in the list of TIDs to kill to
> _bt_search() and _bt_binsrch(). You might have to work through several
> extra B-Tree leaf pages per bttargetdelete() call without this (you'll
> move right multiple times within bttargetdelete()).

I should add: I think that this doesn't matter so much in your
original test case with v1 of my patch, because you're naturally
accessing the index tuples in almost the most efficient way already,
since you VACUUM works its way from the start of the table until the
end of the table. You'll definitely need to pass a heap TID to
routines like _bt_search() once you start using my v2, though, since
that puts the heap TIDs in DESC sort order. Otherwise, it'll be almost
as slow as the plain "Quick Vacuum Strategy" case was.

In general, the big idea with my patch is that heap TID is just
another attribute. I am not "cheating" in any way; if it's not
possible to descend the tree and arrive at the right leaf page without
looking through several leaf pages, then my patch is broken.

You might also use _bt_moveright() with my patch. That way, you can
quickly detect that you need to move right immediately, without going
through all the items on the page. This should only be an issue in the
event of a concurrent page split, though. In my patch, I use
_bt_moveright() in a special way for unique indexes: I need to start
at the first leaf page a duplicate could be on for duplicate checking,
but once that's over I want to "jump immediately" to the leaf page the
index tuple actually needs to be inserted on. That's when
_bt_moveright() is called. (Actually, that looks like it breaks unique
index enforcement in the case of my patch, which I need to fix, but
you might still do this.)

-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Speedup of relation deletes during recovery
Next
From: Andrew Gierth
Date:
Subject: Re: Should contrib modules install .h files?