Re: IPC/MultixactCreation on the Standby server - Mailing list pgsql-hackers

From Álvaro Herrera
Subject Re: IPC/MultixactCreation on the Standby server
Msg-id 202507181030.k5pakywfa3xk@alvherre.pgsql
In response to Re: IPC/MultixactCreation on the Standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On 2025-Jul-17, Andrey Borodin wrote:

> Thinking more about the problem I see 3 ways to deal with this deadlock:
> 1. We check for recovery conflict even in presence of
> InterruptHoldoffCount. That's what patch v4 does.
> 2. Teach page_collect_tuples() to do HeapTupleSatisfiesVisibility()
> without holding buffer lock.
> 3. Why do we even HOLD_INTERRUPTS() when acquiring a shared lock??

Hmm, as you say, doing (3) is a very invasive system-wide change, but
could we do something more localized?  I mean, what if we do
RESUME_INTERRUPTS() just before going to sleep on the condition
variable (CV), and restore with HOLD_INTERRUPTS() once the sleep is
done?  That would only affect this one place rather than the whole
system, and should also (AFAICS) solve the issue.
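
To make the idea concrete, here is a toy model, not PostgreSQL code.  The
names HOLD_INTERRUPTS(), RESUME_INTERRUPTS() and InterruptHoldoffCount
imitate the real symbols, but they are redefined locally so the example
compiles on its own; conflict_pending and every toy_* function are
hypothetical stand-ins.  It also assumes exactly one holdoff level is
active when we reach the sleep, which I have not verified for the real
code path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins named after the PostgreSQL symbols they imitate. */
static unsigned int InterruptHoldoffCount = 0;

/* Hypothetical flag: a recovery conflict has been signalled to us. */
static bool conflict_pending = false;

#define HOLD_INTERRUPTS()   (InterruptHoldoffCount++)
#define RESUME_INTERRUPTS() \
    do { \
        assert(InterruptHoldoffCount > 0); \
        InterruptHoldoffCount--; \
    } while (0)

/* Hypothetical helper: put the toy backend into a known state. */
static void
toy_reset(unsigned int holdoff, bool pending)
{
    InterruptHoldoffCount = holdoff;
    conflict_pending = pending;
}

/*
 * Hypothetical model of interrupt servicing: a pending conflict can only
 * be acted upon while no holdoff is in effect.  Returns true if the
 * conflict was serviced.
 */
static bool
toy_check_for_interrupts(void)
{
    if (conflict_pending && InterruptHoldoffCount == 0)
    {
        conflict_pending = false;
        return true;
    }
    return false;
}

/* Hypothetical model of one wakeup while sleeping on the CV. */
static bool
toy_cv_sleep(void)
{
    return toy_check_for_interrupts();
}

/* Sleep with interrupts still held: the conflict cannot be serviced. */
static bool
sleep_with_interrupts_held(void)
{
    return toy_cv_sleep();
}

/* The localized idea: drop one holdoff level only around the sleep. */
static bool
sleep_with_interrupts_resumed(void)
{
    bool        serviced;

    RESUME_INTERRUPTS();
    serviced = toy_cv_sleep();
    HOLD_INTERRUPTS();          /* restore the caller's holdoff level */

    return serviced;
}
```

In this model, after toy_reset(1, true), sleep_with_interrupts_held()
leaves the conflict pending, whereas sleep_with_interrupts_resumed()
services it and still returns with the holdoff count back at 1.  Whether
the real sleep site is this simple is exactly what would need checking.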


> Yet, I see 3 as a correct solution. Can't we just abstain from
> HOLD_INTERRUPTS() if the LWLock is not taken in exclusive mode?

Hmm, the code in LWLockAcquire says

    /*
     * Lock out cancel/die interrupts until we exit the code section protected
     * by the LWLock.  This ensures that interrupts will not interfere with
     * manipulations of data structures in shared memory.
     */
    HOLD_INTERRUPTS();

which means if we want to change this, we would have to inspect every
single use of LWLocks in shared mode in order to be certain that such a
change isn't problematic.  This is a discussion I'm not prepared for.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"If you want to be creative, learn the art of wasting time"


