Re: [WIP] [B-Tree] Retail IndexTuple deletion - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [WIP] [B-Tree] Retail IndexTuple deletion
Msg-id CAD21AoApVGFf3q7WV5FuFzHDRtew9+fHHdCyOkk1uG+XG_6OKw@mail.gmail.com
In response to Re: [WIP] [B-Tree] Retail IndexTuple deletion  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [WIP] [B-Tree] Retail IndexTuple deletion  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, Jul 13, 2018 at 4:00 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jul 3, 2018 at 5:17 AM, Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> Done.
>> Attachment contains an update for use v.2 of the 'Ensure nbtree leaf tuple
>> keys are always unique' patch.
>
> My v3 is still pending, but is now a lot better than v2. There were
> bugs in v2 that were fixed.
>
> One area that might be worth investigating is retail index tuple
> deletion performed within the executor in the event of non-HOT
> updates. Maybe LP_REDIRECT could be repurposed to mean "ghost record",
> at least in unique index tuples with no NULL values. The idea is that
> MVCC index scans can skip over those if they've already found a
> visible tuple with the same value.

I think that's a good idea. The overhead of marking a tuple as a ghost
seems small, and it would speed up index scans. If an MVCC index scan
has already found a visible tuple with the same value, could it not
only skip the ghost tuples but also kill them? If so, we could kill
index tuples without checking the heap.
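
To make the skipping part concrete, here is a toy model in Python. It is
only a sketch of the idea, not PostgreSQL code: IndexItem, its ghost and
dead flags, scan_unique_key and the heap_visible callback are all
hypothetical names invented for illustration. It covers only the skip;
whether a scan could also kill the ghosts is the open question above, so
the sketch leaves that out.

```python
from dataclasses import dataclass


@dataclass
class IndexItem:
    key: int
    heap_tid: int
    ghost: bool = False  # hypothetical "ghost record" hint bit
    dead: bool = False   # stands in for an LP_DEAD-style marking


def scan_unique_key(items, key, heap_visible):
    """Toy scan of one key in a unique index.

    heap_visible(tid) stands in for the heap visibility check.
    Returns (visible_tid_or_None, number_of_heap_fetches).
    """
    found = None
    fetches = 0
    for item in items:
        if item.key != key or item.dead:
            continue
        if found is not None and item.ghost:
            # A visible tuple with this value was already found, so the
            # ghost item is skipped without visiting the heap.
            continue
        fetches += 1
        if heap_visible(item.heap_tid) and found is None:
            found = item.heap_tid
    return found, fetches
```

In this model a ghost item that comes before the visible tuple still
costs a heap fetch; the saving only applies once a visible tuple with
the same value has been seen.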

> Also, when there was about to be a
> page split, they could be treated a little bit like LP_DEAD items. Of
> course, the ghost bit would have to be treated as a hint that could be
> "wrong" (e.g. because the transaction hasn't committed yet), so you'd
> have to go to the heap in the context of a page split, to double
> check. Also, you'd need heuristics that let you give up on this
> strategy when it didn't help.
>
> I think that this could work well enough for OLTP workloads, and might
> be more future-proof than doing it in VACUUM. Though, of course, it's
> still very complicated.

Agreed.
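
As a rough illustration of the page-split case, the following Python
sketch treats the ghost bit as a hint that is verified against the heap
before an item is removed. Everything here is an assumption made for the
example: reclaim_before_split, the heap_is_dead callback, and the
max_misses give-up rule are hypothetical, and the real heuristics would
need to be worked out.

```python
def reclaim_before_split(items, heap_is_dead, max_misses=2):
    """Toy model: try to free space on a page that is about to split.

    items is a list of dicts with "tid" and "ghost" keys.
    heap_is_dead(tid) stands in for the heap check that confirms the hint.
    Returns (kept_items, removed_count).
    """
    kept = []
    removed = 0
    misses = 0
    for item in items:
        # Only ghost-marked items are candidates, and only while the
        # hint has not been wrong too often (assumed give-up heuristic).
        if item["ghost"] and misses < max_misses:
            if heap_is_dead(item["tid"]):
                removed += 1
                continue
            # The hint was wrong, e.g. the transaction had not committed.
            misses += 1
        kept.append(item)
    return kept, removed
```

Once the number of wrong hints reaches max_misses, the function stops
visiting the heap and keeps the remaining items, which is one simple way
to give up when the strategy is not helping.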

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

