Latches vs lwlock contention - Mailing list pgsql-hackers

From Thomas Munro
Subject Latches vs lwlock contention
Msg-id CA+hUKGKmO7ze0Z6WXKdrLxmvYa=zVGGXOO30MMktufofVwEm1A@mail.gmail.com
List pgsql-hackers
Hi,

We usually want to release lwlocks, and definitely spinlocks, before
calling SetLatch(), to avoid putting a system call into the locked
region so that we minimise the time the lock is held.  There are a few
places where we don't do that, possibly because there isn't just a
single latch to keep a pointer to but rather a set of them that needs
to be collected from some data structure, and we don't have
infrastructure to help with that.  There are also cases where we
semi-reliably create
lock contention, because the backends that wake up immediately try to
acquire the very same lock.

One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE
t; ... and then N other sessions wait in SELECT * FROM t;, and then
you run ... COMMIT;, you'll see the first session wake all the others
while it still holds the partition lock itself.  They'll all wake up
and begin to re-acquire the same partition lock in exclusive mode,
immediately go back to sleep on *that* wait list, and then wake each
other up one at a time in a chain.  We could avoid the first
double-bounce by not setting the latches until after we've released
the partition lock.  We could avoid the rest of them by not
re-acquiring the partition lock at all, which ... if I'm reading right
... shouldn't actually be necessary in modern PostgreSQL?  Or if there
is another reason to re-acquire then maybe the comment should be
updated.

Presumably no one really does that repeatedly while there is a long
queue of non-conflicting waiters, so I'm not claiming it's a major
improvement, but it's at least a micro-optimisation.

There are some other, simpler mechanical changes, covering synchronous
replication, SERIALIZABLE DEFERRABLE and condition variables (this one
inspired by Yura Sokolov's patches[1]).  Actually, I'm not at all sure
about the CV implementation; I feel like a more ambitious change is
needed to make our CVs perform.

See attached sketch patches.  I guess the main thing that may not be
good enough is the use of a fixed-size latch buffer.  Memory
allocation in don't-throw-here environments like the guts of lock code
might be an issue, which is why it just gives up and flushes when
full; maybe it should try to allocate and fall back to flushing only
if that fails.  These sketch patches aren't proposals, just
observations in need of more study.
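
To make the fixed-size buffer idea concrete, here is a minimal
standalone sketch in C.  It is not taken from the attached patches and
does not use the real PostgreSQL API: ToyLatch, ToyLatchBatch,
TOY_BATCH_SIZE and all toy_* functions are hypothetical names, and the
"lock" is just a flag so that the model can count how many latches get
set while it is held.  Latches are collected while the lock is held and
set only after release; if the buffer fills up, it gives up and flushes
early, inside the locked region.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BATCH_SIZE 4        /* hypothetical fixed buffer size */

/* Stand-in for a latch; a real one would wake a sleeping process. */
typedef struct ToyLatch
{
    bool        is_set;
} ToyLatch;

/* Fixed-size buffer of latches waiting to be set. */
typedef struct ToyLatchBatch
{
    ToyLatch   *latches[TOY_BATCH_SIZE];
    int         count;
} ToyLatchBatch;

/* Toy "lock" state, only so we can observe sets inside the locked region. */
static bool toy_lock_held = false;
static int  toy_sets_under_lock = 0;

/* Stand-in for SetLatch(): records whether it ran with the lock held. */
static void
toy_set_latch(ToyLatch *latch)
{
    if (toy_lock_held)
        toy_sets_under_lock++;
    latch->is_set = true;
}

/* Set every collected latch and empty the buffer. */
static void
toy_batch_flush(ToyLatchBatch *batch)
{
    for (int i = 0; i < batch->count; i++)
        toy_set_latch(batch->latches[i]);
    batch->count = 0;
}

/* Remember a latch for later; if the buffer is full, flush it first. */
static void
toy_batch_add(ToyLatchBatch *batch, ToyLatch *latch)
{
    if (batch->count == TOY_BATCH_SIZE)
        toy_batch_flush(batch); /* no allocation here: give up and flush */
    assert(batch->count < TOY_BATCH_SIZE);
    batch->latches[batch->count++] = latch;
}

/*
 * Wake all waiters.  Returns how many latches had to be set while the
 * toy lock was still held (zero unless the buffer overflowed).
 */
int
toy_wake_all(ToyLatch *waiters, int nwaiters)
{
    ToyLatchBatch batch = {.count = 0};

    toy_sets_under_lock = 0;

    toy_lock_held = true;       /* "acquire" the lock */
    for (int i = 0; i < nwaiters; i++)
        toy_batch_add(&batch, &waiters[i]);
    toy_lock_held = false;      /* "release" the lock */

    toy_batch_flush(&batch);    /* wake-ups happen outside the lock */
    return toy_sets_under_lock;
}
```

With three waiters and a buffer of four, toy_wake_all() sets every
latch after the lock is released.  With six waiters the first four are
flushed early, while the lock is still held, which is the cost of not
allocating in this kind of code path.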

[1] https://postgr.es/m/1edbb61981fe1d99c3f20e3d56d6c88999f4227c.camel%40postgrespro.ru
