I've observed the following, and I wonder if anyone has seen it or has a
workaround, before I report it as a bug.
The steps below drive the load average on the server to ~600 and make
the server very unresponsive. This seems to be a result of the SIGUSR2
that is delivered for async notify to free space in the event queue
(I'm somewhat in the dark here).
1) Open many (I used 800) database connections and leave them idle.
2) Run: while true; do vacuumdb {database}; done
3) Wait, and observe that the db has stopped doing useful work.
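
For anyone who wants to try this, the reproduction steps above could be
scripted roughly as follows. This is only a sketch, not the exact script
I ran: the helper names (open_idle_connections, vacuum_loop, main), the
"dbname=test" DSN and the iteration count are made up for illustration,
and it assumes the psycopg2 driver is installed. Recording how long each
VACUUM takes should make the stalls visible.

```python
import time


def open_idle_connections(connect, count):
    """Open `count` connections with connect() and leave them idle."""
    return [connect() for _ in range(count)]


def vacuum_loop(connect, iterations):
    """Run VACUUM `iterations` times; return the seconds each run took."""
    conn = connect()
    conn.autocommit = True  # VACUUM cannot run inside a transaction block
    timings = []
    try:
        cur = conn.cursor()
        for _ in range(iterations):
            start = time.monotonic()
            cur.execute("VACUUM")  # vacuums the database we are connected to
            timings.append(time.monotonic() - start)
    finally:
        conn.close()
    return timings


def main(dsn="dbname=test", idle=800, iterations=100):
    """Hypothetical driver: the DSN and the counts are placeholders."""
    import psycopg2  # assumption: psycopg2 is the driver in use

    def connect():
        return psycopg2.connect(dsn)

    idle_conns = open_idle_connections(connect, idle)
    try:
        for i, secs in enumerate(vacuum_loop(connect, iterations)):
            print(f"vacuum {i}: {secs:.2f}s")
    finally:
        for c in idle_conns:
            c.close()
```

Calling main() with a DSN for a scratch database runs the whole thing.
The server's max_connections has to be high enough to allow the idle
connections plus one more for the vacuum loop.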
This causes stalls lasting many minutes, during which the db fails to
respond in any way.
We noticed this on a production system, so it's fairly dire. I know one
workaround is to reduce the number of idle connections, but that will
make our system less responsive.
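
To illustrate that workaround: one way to cap the idle connections while
still keeping a few warm is a small client-side pool. This is a rough
sketch and not what we currently run; the CappedPool class and its
max_idle parameter are hypothetical names, and connect stands for any
function that returns a new connection object with a close() method.

```python
import queue


class CappedPool:
    """Keep at most `max_idle` open connections waiting for reuse."""

    def __init__(self, connect, max_idle):
        if max_idle < 1:
            # A maxsize of 0 would make the queue unbounded, defeating the cap.
            raise ValueError("max_idle must be at least 1")
        self._connect = connect
        self._idle = queue.LifoQueue(maxsize=max_idle)

    def acquire(self):
        """Reuse an idle connection if there is one, otherwise open a new one."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return self._connect()

    def release(self, conn):
        """Return a connection; close it if the idle cap is already reached."""
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            conn.close()
```

The cost is the one described above: once more than max_idle clients are
busy at the same moment, the extra ones pay for a fresh connect.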
It looks like a bug to me - anyone got any ideas?
Stephen