On Fri, 14 Dec 2012 15:39:44 -0500
Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 12, 2012 at 8:29 AM, David Gould <daveg@sonic.net> wrote:
> > We lose noticeable performance when we raise fill-factor above 10. Even 20 is
> > slower.
>
> Whoa.

Any interest in a fill-factor patch to place exactly one row per page? That
would be the least contended. There are applications where it might help.

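For a rough sense of how many rows share a page at a given fill-factor,
here is a back-of-the-envelope sketch in Python. It assumes the default
8192-byte block size, a 24-byte page header and a 4-byte line pointer per
row; the function name and the 100-byte row width are made up for
illustration, and the arithmetic only approximates what the heap insert
code actually does.

```python
# Approximate rows per heap page for a given fill-factor.
# Assumptions (illustrative, not measured): 8192-byte blocks, 24-byte page
# header, 4-byte line pointer per row. row_bytes is the full on-disk tuple
# size, including its header and alignment padding.
BLCKSZ = 8192
PAGE_HEADER = 24
LINE_POINTER = 4


def rows_per_page(row_bytes: int, fillfactor: int) -> int:
    """Estimate how many rows INSERT places on a fresh page."""
    usable = BLCKSZ - PAGE_HEADER
    # Space that inserts leave free on each page for later updates.
    reserved = BLCKSZ * (100 - fillfactor) // 100
    budget = usable - reserved
    # An empty page accepts one row even if it exceeds the budget.
    return max(1, budget // (row_bytes + LINE_POINTER))


if __name__ == "__main__":
    for ff in (10, 20, 100):
        print(ff, rows_per_page(100, ff))
```

With 100-byte rows this gives about 7 rows per page at fill-factor 10, 15
at 20, and 78 at 100, so even the lowest allowed setting still leaves
several rows sharing each buffer.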
> > During busy times these hosts sometimes fall into a stable state
> > with very high cpu use mostly in s_lock() and LWLockAcquire() and I think
> > PinBuffer plus very high system cpu in the scheduler (I don't have the perf
> > trace in front of me so take this with a grain of salt). In this mode they
> > fall from the normal 7000 queries per second to below 3000.
>
> I have seen signs of something similar to this when running pgbench -S
> tests at high concurrency. I've never been able to track down where

I think I may have seen that with pgbench -S too. I did not have time to
investigate further, but in a sequence of three-minute runs most came in
at 300k+ qps, while a couple were around 200k qps.

> the problem is happening. My belief is that once a spinlock starts to
> be contended, there's some kind of death spiral that can't be arrested
> until the workload eases up. But I haven't had much luck identifying
> exactly which spinlock is the problem or if it even is just one...

I agree about the death spiral. I think what happens is that all the
backends get synchronized by waiting, so they are more likely to contend
again.

-dg
--
David Gould 510 282 0869 daveg@sonic.net
If simplicity worked, the world would be overrun with insects.