Hi,

We usually want to release lwlocks, and definitely spinlocks, before
calling SetLatch(), to avoid putting a system call into the locked
region and so minimise the time the lock is held. There are a few
places where we don't do that, possibly because there isn't just a
single latch to remember a pointer to, but rather a set of them that
has to be collected from some data structure, and we don't have
infrastructure to help with that. There are also cases where we
semi-reliably create lock contention, because the backends that wake
up immediately try to acquire the very same lock.
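
To make that concrete, here is roughly the kind of infrastructure I
mean. The names here (LatchBatch, AddLatch, SetLatches) and the
buffer size are invented for this sketch, not taken from the attached
patches; the idea is just to remember latches while holding the lock
and do the actual SetLatch() calls after releasing it:

/* Sketch only: a small fixed-size buffer of latches to set later. */
#define LATCH_BUFFER_SIZE 32		/* arbitrary */

typedef struct LatchBatch
{
	int		size;
	Latch  *latches[LATCH_BUFFER_SIZE];
} LatchBatch;

/* Set all buffered latches and reset the buffer. */
static void
SetLatches(LatchBatch *batch)
{
	for (int i = 0; i < batch->size; i++)
		SetLatch(batch->latches[i]);
	batch->size = 0;
}

/* Remember a latch to set later.  If the buffer is full, give up and
 * flush immediately, even though the caller may still hold a lock
 * (more on that below). */
static void
AddLatch(LatchBatch *batch, Latch *latch)
{
	if (batch->size == LATCH_BUFFER_SIZE)
		SetLatches(batch);
	batch->latches[batch->size++] = latch;
}

A call site that today does SetLatch() while holding an LWLock would
then look something like this, with the system calls that SetLatch()
can make moved out of the locked region:

	LatchBatch	wakeups = {0};

	LWLockAcquire(SomeLock, LW_EXCLUSIVE);
	/* ... decide whom to wake, buffering instead of setting ... */
	AddLatch(&wakeups, &proc->procLatch);
	LWLockRelease(SomeLock);
	SetLatches(&wakeups);
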
One example is heavyweight lock wakeups. If you run BEGIN; LOCK TABLE
t; ... and then N other sessions wait in SELECT * FROM t;, and then
you run ... COMMIT;, you'll see the first session wake all the others
while it still holds the partition lock itself. They'll all wake up
and begin to re-acquire the same partition lock in exclusive mode,
immediately go back to sleep on *that* wait list, and then wake each
other up one at a time in a chain. We could avoid the first
double-bounce by not setting the latches until after we've released
the partition lock. We could avoid the rest of them by not
re-acquiring the partition lock at all, which, if I'm reading the
code right, shouldn't actually be necessary in modern PostgreSQL. Or,
if there is another reason to re-acquire it, maybe the comment should
be updated.
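
In the lock manager, that first change would look roughly like this
(heavily simplified, reusing the hypothetical LatchBatch API sketched
above; the real wakeup happens in the ProcLockWakeup() path while the
partition lock is held):

	LatchBatch	wakeups = {0};

	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
	/* ... for each waiter whose request can now be granted: */
	waiter->waitStatus = PROC_WAIT_STATUS_OK;
	AddLatch(&wakeups, &waiter->procLatch);
	/* ... */
	LWLockRelease(partitionLock);

	/* Wake everyone only now, so a waiter that runs immediately
	 * can't collide with us on the partition lock we just
	 * released. */
	SetLatches(&wakeups);

That fixes only the first bounce; avoiding the chain of exclusive
re-acquisitions needs the waiters themselves to stop taking the
partition lock again when they wake up.
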
Presumably no one really does that repeatedly while there is a long
queue of waiters whose requests don't conflict with each other, so
I'm not claiming it's a major improvement, but it's at least a
micro-optimisation.

There are some other, simpler mechanical changes of the same kind for
synchronous replication, SERIALIZABLE DEFERRABLE and condition
variables (that last one inspired by Yura Sokolov's patches[1]).
Actually, I'm not at all sure about the CV implementation; I feel
like a more ambitious change is needed to make our CVs perform well.

See attached sketch patches. I guess the main thing that may not be
good enough is the use of a fixed-size latch buffer. Memory
allocation in don't-throw-here environments like the guts of the lock
code might be an issue, which is why the sketch just gives up and
flushes (that is, sets the buffered latches) when the buffer is full;
maybe it should try to allocate and fall back to flushing only if
that fails. These sketch patches aren't proposals, just observations
in need of more study.
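
For the allocate-and-fall-back idea, the hypothetical AddLatch() from
above could first attempt a no-throw allocation and only flush if
that fails, assuming the buffer were changed to a dynamically sized
array (again, just a sketch):

static void
AddLatch(LatchBatch *batch, Latch *latch)
{
	if (batch->size == batch->capacity)
	{
		Latch **bigger;

		/* MCXT_ALLOC_NO_OOM makes palloc_extended() return NULL on
		 * failure instead of throwing, which is what we need in
		 * don't-throw-here code. */
		bigger = palloc_extended(sizeof(Latch *) * batch->capacity * 2,
								 MCXT_ALLOC_NO_OOM);
		if (bigger != NULL)
		{
			memcpy(bigger, batch->latches,
				   sizeof(Latch *) * batch->size);
			batch->latches = bigger;	/* freeing the old array elided */
			batch->capacity *= 2;
		}
		else
		{
			/* Out of memory: fall back to flushing now, possibly
			 * while a lock is still held. */
			SetLatches(batch);
		}
	}
	batch->latches[batch->size++] = latch;
}

That would demote the flush-under-lock case from the normal overflow
behaviour to a rare out-of-memory fallback.
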
[1] https://postgr.es/m/1edbb61981fe1d99c3f20e3d56d6c88999f4227c.camel%40postgrespro.ru