I have tested this patch on a 2-socket machine, but I don't see any performance change in the various runs. However, there is no regression in any of the cases either.
Hm, so if we can't demonstrate a performance win, it's hard to justify risking touching this code. What test case(s) did you use?
I ran pgbench (-M prepared) with synchronous_commit 'on' and 'off', using both logged and unlogged tables. I also ran an internal benchmark, which didn't show any difference either.
What scale factor and client count? How many cores per socket? It looks like Sokolov was just starting to see gains at 200 clients on 72 cores, using the -N transaction type.
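For comparison, the matrix described above (synchronous_commit on/off, logged/unlogged tables, -M prepared) combined with a 200-client -N run could be scripted roughly as below. This is only a sketch: the scale factor, run length, thread count, and database name are placeholders I made up, not values anyone in this thread reported, and synchronous_commit is assumed to be set per session through PGOPTIONS rather than in postgresql.conf.

```python
import itertools
import shlex

# Placeholder values: none of these except the client count come from the thread.
SCALE = 300       # assumed pgbench scale factor
CLIENTS = 200     # client count at which gains were reportedly starting to show
DURATION = 300    # assumed run length in seconds
DB = "bench"      # hypothetical database name


def build_commands(scale=SCALE, clients=CLIENTS, duration=DURATION, db=DB):
    """Return one dict per cell of the 2x2 matrix:
    synchronous_commit (on/off) x table type (logged/unlogged)."""
    runs = []
    for sync, unlogged in itertools.product(("on", "off"), (False, True)):
        # Initialization step; --unlogged-tables only applies with -i.
        init = ["pgbench", "-i", "-s", str(scale)]
        if unlogged:
            init.append("--unlogged-tables")
        init.append(db)

        # Benchmark step: prepared statements, -N (skip-some-updates) script.
        run = [
            "pgbench", "-M", "prepared", "-N",
            "-c", str(clients),
            "-j", str(clients),   # assumption: one worker thread per client
            "-T", str(duration),
            db,
        ]

        runs.append({
            "label": f"sync={sync},{'unlogged' if unlogged else 'logged'}",
            "env": {"PGOPTIONS": f"-c synchronous_commit={sync}"},
            "init": init,
            "run": run,
        })
    return runs


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed first.
    for r in build_commands():
        env = " ".join(f"{k}={shlex.quote(v)}" for k, v in r["env"].items())
        print(f"# {r['label']}")
        print(shlex.join(r["init"]))
        print(f"{env} {shlex.join(r['run'])}")
```

The script only prints the commands, so the parameters can be adjusted to the actual hardware before anything is run.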