ClogControlLock contention is high at commit time. This appears to be because ClogControlLock is acquired in Exclusive mode before marking the commit, and that request is then starved by backends running TransactionIdGetStatus().
The proposal is to acquire ClogControlLock in Shared mode instead, where possible.
This is safe because anyone checking the visibility of an xid must always call TransactionIdIsInProgress() first to avoid race conditions, and that call always returns true for the transaction we are currently committing. As a result, a reader never accesses the same bits in clog that a committer is writing, so no barrier is needed for that case.
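To make the ordering argument concrete, here is a minimal sketch of it, not PostgreSQL code. It assumes a toy model in which an array of flags stands in for the list of running transactions and a one-byte-per-xid array stands in for clog. All names (running, toy_clog, toy_begin, toy_commit, toy_check) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_XIDS 64

/* Hypothetical status codes for this sketch only. */
enum { TOY_IN_PROGRESS = 0, TOY_COMMITTED = 1, TOY_ABORTED = 2 };

/* Stand-in for the shared list of running transactions. */
static atomic_bool running[TOY_MAX_XIDS];
/* Stand-in for clog; one byte per xid to keep the sketch short. */
static uint8_t toy_clog[TOY_MAX_XIDS];

static void toy_begin(uint32_t xid)
{
    assert(xid < TOY_MAX_XIDS);
    toy_clog[xid] = TOY_IN_PROGRESS;
    atomic_store(&running[xid], true);
}

static void toy_commit(uint32_t xid)
{
    assert(xid < TOY_MAX_XIDS);
    /* 1. Record the commit in clog while still advertised as running. */
    toy_clog[xid] = TOY_COMMITTED;
    /* 2. Only afterwards stop advertising the xid as running. */
    atomic_store(&running[xid], false);
}

static int toy_check(uint32_t xid)
{
    assert(xid < TOY_MAX_XIDS);
    /* The in-progress test comes first; if it is true, the reader
     * returns without ever looking at this xid's clog entry. */
    if (atomic_load(&running[xid]))
        return TOY_IN_PROGRESS;
    return toy_clog[xid];
}
```

In this model a reader can only reach the toy_clog read after the committer has finished writing it, which is the property the proposal relies on.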
Two writers might still update the same word at the same time, so we protect against that with a new CommitLock. That lock could also be partitioned by pageno, if needed.
Would it be possible to see some performance numbers? For example, from a simple pgbench script running a bunch of tiny transactions across many concurrent sessions (perhaps hundreds).