Greg Smith wrote:
> 3500 active connections across them. That doesn't work, and what happens
> is exactly the sort of context switch storm you're showing data for.
> Think about it for a minute: how many of those can really be doing
> work at any time? 32, that's how many. Now, you need some multiple
> of the number of cores to try to make sure everybody is always busy,
> but that multiple should be closer to 10X the number of cores rather
> than 100X.
That's surely overly simplistic. There is inherently nothing problematic
about having a lot of compute processes waiting for their timeslice, nor
in having IO- or semaphore-blocked processes waiting: a blocked process
sits off the run queue and costs the scheduler essentially nothing until
it is woken, so neither situation causes a context switch storm. This is
a problem with Postgres scalability, not (inherently) with lots of
connections. I'm sure most of us evaluating Postgres from a background
in Sybase or SQL Server would regard 5000 connections as no big deal.
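
To illustrate, here's a toy demo (mine, nothing to do with the Postgres
code; the thread count and the semaphore are arbitrary) that parks a few
thousand threads in a blocked state. Run vmstat 1 alongside it and the
'cs' column stays near idle levels:

    /* Toy demo: thousands of blocked threads generate no context
     * switches.  Build with: gcc -std=c11 -pthread demo.c */
    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NTHREADS 5000

    static sem_t sem;

    static void *waiter(void *arg)
    {
        (void) arg;
        sem_wait(&sem);     /* blocks: off the run queue, no timeslices */
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;
        pthread_t tid;

        sem_init(&sem, 0, 0);
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 64 * 1024);  /* keep memory sane */

        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tid, &attr, waiter, NULL);

        printf("%d threads blocked; watch vmstat while this sleeps\n",
               NTHREADS);
        sleep(60);          /* context switch rate stays near idle */
        return 0;
    }

The storm only starts when the waiters are runnable and fighting over
something, which points at the locking rather than the connection count.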
This has the sniff of a badly contended spin-and-yield, doesn't it?
I'd guess that if the yield were a sleep for a couple of milliseconds,
then the lock holder would get scheduled, run, and free everything up.
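
To make the guess concrete, a rough sketch of the two behaviours. This
is hypothetical, not the actual Postgres s_lock code; the spin threshold
and the 2 ms backoff are invented for illustration:

    #include <sched.h>
    #include <stdatomic.h>
    #include <time.h>

    static atomic_flag lock = ATOMIC_FLAG_INIT;

    /* Suspected pattern: with many runnable waiters, sched_yield()
     * mostly hands the CPU to another spinner, so the waiters churn
     * through timeslices re-testing the lock. */
    void acquire_spin_and_yield(void)
    {
        int spins = 0;

        while (atomic_flag_test_and_set_explicit(&lock,
                                                 memory_order_acquire))
            if (++spins > 100)
            {
                sched_yield();  /* still runnable: scheduler may pick
                                 * another waiter, not the holder */
                spins = 0;
            }
    }

    /* Suggested variant: back off with a real sleep, so the waiter
     * leaves the run queue and the holder gets the CPU to finish its
     * critical section and release. */
    void acquire_spin_and_sleep(void)
    {
        int spins = 0;

        while (atomic_flag_test_and_set_explicit(&lock,
                                                 memory_order_acquire))
            if (++spins > 100)
            {
                struct timespec ts = { 0, 2 * 1000 * 1000 };  /* ~2 ms */

                nanosleep(&ts, NULL);
                spins = 0;
            }
    }

    void release_lock(void)
    {
        atomic_flag_clear_explicit(&lock, memory_order_release);
    }

The sleep variant costs a little latency on a briefly-held lock, but
under heavy contention it stops the waiters starving the one process
that can actually make progress.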