On Tue, 2002-05-28 at 12:44, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> >> If you're seeing load peaks in excess of what would be observed with
> >> 800 active queries, then I would agree there's something to investigate
> >> here.
>
> > Yep, indeed with 800 backends doing a query every second, nothing
> > happens.
>
> > The machine itself has 1GB of RAM, and uses no swap in the above
> > situation. Instead, system time goes to 99% of CPU. The machine is a
> > dual-CPU athlon 1900 (1.5GHz).
>
> > It _only_ happens with idle connections!
>
> Hmm, you mean if the 800 other connections are *not* idle, you can
> do VACUUMs with impunity? If so, I'd agree we got a bug ...
>
> regards, tom lane
Yes, that's what I'm saying.
If you put a sleep 5 into my while loop above, the problem still happens
(albeit after a longer time).
If you put a delay in, and also make sure that each of the 800
connections does a query (I'm using "select 1;"), then the problem never
happens.
Without the delay (but with each connection still issuing queries), you get
a bit of load, but it only reaches about 10 or so, rather than 600.
What seems to happen is that all the idle backends get woken up every
now and then to do _something_ (it seemed to coincide with receiving the
SIGUSR2 that triggers an async notify), and that shoots the load up.
Making sure every backend executes a query occasionally seems to avoid
the problem.
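The wake-everyone-at-once pattern can be sketched in miniature. This is a
hypothetical Python illustration of the mechanism, not PostgreSQL code: a
parent signals every "idle backend" process with SIGUSR2 simultaneously,
and each one wakes just to run its handler, the way idle backends wake to
scan the notify queue.

```python
import os
import signal
import time
from multiprocessing import Process, Value

def idle_backend(woken):
    """Sleep like an idle backend; wake only when SIGUSR2 arrives."""
    def on_usr2(signum, frame):
        # Mimics the backend waking to process a pending async notify.
        with woken.get_lock():
            woken.value += 1
    signal.signal(signal.SIGUSR2, on_usr2)
    time.sleep(2)  # bounded "idle" wait; the handler fires during it

def main(n=4):
    woken = Value('i', 0)  # shared counter of backends that were woken
    procs = [Process(target=idle_backend, args=(woken,)) for _ in range(n)]
    for p in procs:
        p.start()
    time.sleep(1.0)  # let every child install its handler and go idle
    # One NOTIFY-style event wakes *all* idle processes at the same moment;
    # with 800 of them, this is the load spike described above.
    for p in procs:
        os.kill(p.pid, signal.SIGUSR2)
    for p in procs:
        p.join()
    return woken.value

if __name__ == "__main__":
    print(main())
```

With only a handful of processes the spike is invisible, but with hundreds
of them the simultaneous wake-up shows up as the run-queue/load blowout.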
I suspect the comment near the async_notify code explains the problem:
it talks about idle backends having to be woken up to clear out pending
events...
We only notice it on our production system when it has lots of idle DB
connections.
Other people seem to have spotted it (I found references to similar
effects via Google), but nobody seems to have worked out the cause; they
typically hit the problem with web servers that keep a pool of DB
connections.
Just to reiterate: this machine never swaps, it can handle > 1000
queries per second, and it doesn't seem to be running out of resources...
Stephen