Home > mailing lists

Re: Latches vs lwlock contention - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Latches vs lwlock contention
Date	July 22 22:15:37
Msg-id	7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi Whole thread Raw
In response to	Re: Latches vs lwlock contention (Heikki Linnakangas <hlinnaka@iki.fi>)
List	pgsql-hackers

Tree view

On 28/09/2023 12:58, Heikki Linnakangas wrote:
> On 28/10/2022 06:56, Thomas Munro wrote:
>> One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE
>> t; ... and then N other sessions wait in SELECT * FROM t;, and then
>> you run ... COMMIT;, you'll see the first session wake all the others
>> while it still holds the partition lock itself.  They'll all wake up
>> and begin to re-acquire the same partition lock in exclusive mode,
>> immediately go back to sleep on*that*  wait list, and then wake each
>> other up one at a time in a chain.  We could avoid the first
>> double-bounce by not setting the latches until after we've released
>> the partition lock.  We could avoid the rest of them by not
>> re-acquiring the partition lock at all, which ... if I'm reading right
>> ... shouldn't actually be necessary in modern PostgreSQL?  Or if there
>> is another reason to re-acquire then maybe the comment should be
>> updated.
> 
> ISTM that the change to not re-aqcuire the lock in ProcSleep is
> independent from the other changes. Let's split that off to a separate
> patch.

I spent some time on splitting that off. I had to start from scratch, 
because commit 2346df6fc373df9c5ab944eebecf7d3036d727de conflicted 
heavily with your patch.

I split ProcSleep() into two functions: JoinWaitQueue does the first 
part of ProcSleep(), adding the process to the wait queue and checking 
for the dontWait and early deadlock cases. What remains in ProcSleep() 
does just the sleeping part. JoinWaitQueue is called with the partition 
lock held, and ProcSleep() is called without it. This way, the partition 
lock is acquired and released in the same function 
(LockAcquireExtended), avoiding awkward "lock is held on enter, but 
might be released on exit depending on the outcome" logic.

This is actually a set of 8 patches. The first 7 are independent tiny 
fixes and refactorings in these functions. See individual commit messages.

> I agree it should be safe. Acquiring a lock just to hold off interrupts
> is overkill anwyway, HOLD_INTERRUPTS() would be enough.
> LockErrorCleanup() uses HOLD_INTERRUPTS() already.
> 
> There are no CHECK_FOR_INTERRUPTS() in GrantAwaitedLock(), so cancel/die
> interrupts can't happen here. But could we add HOLD_INTERRUPTS(), just
> pro forma, to document the assumption? It's a little awkward: you really
> should hold interrupts until the caller has done "awaitedLock = NULL;".
> So it's not quite enough to add a pair of HOLD_ and RESUME_INTERRUPTS()
> at the end of ProcSleep(). You'd need to do the HOLD_INTERRUPTS() in
> ProcSleep() and require the caller to do RESUME_INTERRUPTS(). In a
> sense, ProcSleep downgrades the lock on the partition to just holding
> off interrupts.

I didn't use HOLD/RESUME_INTERRUPTS() after all. Like you did, I'm just 
relying on the fact that there are no CHECK_FOR_INTERRUPTS() calls in 
places where they might cause trouble. Those sections are short, so I 
think it's fine.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Attachment

pgsql-hackers by date:

From: Masahiko Sawada
Date: 22 July, 21:50:34
Subject: Re: xid_wraparound tests intermittent failure.

From: Greg Sabino Mullane
Date: 22 July, 22:23:50
Subject: Normalize queries starting with SET for pg_stat_statements

Re: Latches vs lwlock contention - Mailing list pgsql-hackers

Attachment

Previous

Next