Re: Improve the efficiency of _bt_killitems. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Improve the efficiency of _bt_killitems.
Date
Msg-id e0aa1bbd-7aea-47d9-ae55-df55fda034c3@iki.fi
Whole thread Raw
In response to Re: Improve the efficiency of _bt_killitems.  (feichanghong <feichanghong@qq.com>)
List pgsql-hackers
On 01/11/2024 10:41, feichanghong wrote:
> 
> 
>> On Nov 1, 2024, at 16:24, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> On 01/11/2024 09:19, feichanghong wrote:
>>> Hi hackers,
>>> In the _bt_killitems function, the following logic is present: we 
>>> search to the right for an index item that matches the heap TID and 
>>> attempt to mark it as dead. If that index item has already been 
>>> marked as dead by other concurrent processes, we will continue
>>> searching. However, there should not be any more matching index items
>>> on the current page.
>> Why could there not be more matching items on the page?
>>
>> Are you assuming a unique index? Even then it's not right; you can 
>> have multiple index entries point to different heap tuples with the 
>> same key, as long as they're not visible at the same time. For 
>> example, if you UPDATE or DELETE+INSERT a row.
> 
> Maybe I didn't describe it clearly. What I meant to say is that there 
> shouldn't
> be multiple index items on the current page pointing to the same heap 
> TID(ctid).
> rather than the same index key.

Ah, gotcha. That makes sense.

There's more we could do here in the similar vein:

If the TID was found in a posting list, but not all of the items were 
dead, 'killtuple' is set to false so we will still continue the search.

Then there's this comment:

>                 /*
>                  * Don't bother advancing the outermost loop's int iterator to
>                  * avoid processing killed items that relate to the same
>                  * offnum/posting list tuple.  This micro-optimization hardly
>                  * seems worth it.  (Further iterations of the outermost loop
>                  * will fail to match on this same posting list's first heap
>                  * TID instead, so we'll advance to the next offnum/index
>                  * tuple pretty quickly.)
>                  */

I think we should do that micro-optimization. It's a single line of code 
and a single instruction to advance 'i' variable AFAICS. While I agree 
with the comment that it doesn't matter much performance-wise, it seems 
simpler to just do it than explain why we don't do it.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: Improving tracking/processing of buildfarm test failures
Next
From: Peter Eisentraut
Date:
Subject: Re: Adding NetBSD and OpenBSD to Postgres CI