page_collect_tuples without long lock on page (Was Re: IPC/MultixactCreation on the Standby server) - Mailing list pgsql-hackers

From Yura Sokolov
Subject page_collect_tuples without long lock on page (Was Re: IPC/MultixactCreation on the Standby server)
Date
Msg-id ccae4510-07f2-452b-884e-93547e934d58@postgrespro.ru
In response to Re: IPC/MultixactCreation on the Standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
17.07.2025 21:34, Andrey Borodin wrote:
>> On 30 Jun 2025, at 15:58, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> page_collect_tuples() holds a lock on the buffer while examining tuples visibility, having InterruptHoldoffCount > 0.
>> Tuple visibility check might need WAL to go on, we have to wait until some next MX be filled in.
>> Which might need a buffer lock or have a snapshot conflict with caller of page_collect_tuples().
> 
> Thinking more about the problem I see 3 ways to deal with this deadlock:
> 2. Teach page_collect_tuples() to do HeapTupleSatisfiesVisibility() without holding buffer lock.
> 
> Personally, I see point 2 as very invasive in a code that I'm not too familiar with.

If there were no SetHintBits inside HeapTupleSatisfies*, then it could
be just "copy line pointers and tuple headers under lock, release lock,
check tuples visibility using copied arrays".

But hint bits make it much more difficult.

Probably, tuple headers could be copied twice and compared afterwards. If
the hint bits have changed, the page should be relocked.

And the call to MarkBufferDirtyHint should be delayed.
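
To make the idea concrete, here is a toy sketch in plain C. It is not the
attached patch and it uses no PostgreSQL APIs: ToyPage, ToyTupleHeader,
toy_satisfies_visibility() and toy_collect_tuples() are all hypothetical
names, a pthread mutex stands in for the buffer content lock, and a
dirty_hint flag stands in for the delayed MarkBufferDirtyHint call. It shows
one possible reading of the scheme: copy headers under the lock, release it,
check visibility on a second copy, and relock only if hint bits differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_TUPLES 8
#define TOY_HINT_XMIN_COMMITTED 0x01u   /* hypothetical hint bit */

/* Toy stand-in for a tuple header: only the fields the sketch needs. */
typedef struct ToyTupleHeader {
    uint32_t xmin;       /* inserting transaction id */
    uint16_t infomask;   /* hint bits live here */
} ToyTupleHeader;

/* Toy stand-in for a shared buffer page and its content lock. */
typedef struct ToyPage {
    pthread_mutex_t lock;
    int             ntuples;
    ToyTupleHeader  tuples[TOY_MAX_TUPLES];
    bool            dirty_hint;  /* stands in for MarkBufferDirtyHint */
} ToyPage;

/* Toy rule: xmin below 'horizon' is committed and visible. Like the real
 * check it may set a hint bit, but only on the caller's private copy. */
static bool toy_satisfies_visibility(ToyTupleHeader *copy, uint32_t horizon)
{
    if (copy->infomask & TOY_HINT_XMIN_COMMITTED)
        return true;
    if (copy->xmin < horizon) {
        copy->infomask |= TOY_HINT_XMIN_COMMITTED;
        return true;
    }
    return false;
}

/* Fills visible[0..ntuples-1] and returns the number of visible tuples. */
static int toy_collect_tuples(ToyPage *page, uint32_t horizon, bool *visible)
{
    ToyTupleHeader before[TOY_MAX_TUPLES];
    ToyTupleHeader after[TOY_MAX_TUPLES];
    int  n, nvisible = 0;
    bool hints_changed = false;

    /* Step 1: copy the headers under the lock, then release it. */
    pthread_mutex_lock(&page->lock);
    n = page->ntuples;
    assert(n >= 0 && n <= TOY_MAX_TUPLES);
    memcpy(before, page->tuples, sizeof(ToyTupleHeader) * (size_t) n);
    pthread_mutex_unlock(&page->lock);

    /* Step 2: check visibility on a second copy with no lock held. */
    memcpy(after, before, sizeof(ToyTupleHeader) * (size_t) n);
    for (int i = 0; i < n; i++) {
        visible[i] = toy_satisfies_visibility(&after[i], horizon);
        if (visible[i])
            nvisible++;
        if (after[i].infomask != before[i].infomask)
            hints_changed = true;
    }

    /* Step 3: only if hint bits changed, relock, publish the new bits,
     * and do the delayed "mark dirty" once at the end. */
    if (hints_changed) {
        bool published = false;

        pthread_mutex_lock(&page->lock);
        if (page->ntuples == n) {
            for (int i = 0; i < n; i++) {
                uint16_t newbits =
                    (uint16_t) (after[i].infomask & (uint16_t) ~before[i].infomask);

                /* Skip slots whose tuple was replaced in the meantime. */
                if (newbits != 0 && page->tuples[i].xmin == before[i].xmin) {
                    page->tuples[i].infomask |= newbits;
                    published = true;
                }
            }
        }
        if (published)
            page->dirty_hint = true;
        pthread_mutex_unlock(&page->lock);
    }
    return nvisible;
}
```

In this toy the lock is never held while visibility is computed, so a
visibility check that has to wait cannot block others on the page lock. The
recheck on relock here is deliberately naive (same tuple count, same xmin); a
real implementation would need a much more careful validation of the page.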

A very dirty variant is attached. I made it just for fun. It passes
'regress', 'isolation' and 'recovery', but I haven't benchmarked it.

-- 
regards
Yura Sokolov aka funny-falcon