Re: pg 8.1.3, AIX, huge box, painfully slow. - Mailing list pgsql-performance

From Tom Lane
Subject Re: pg 8.1.3, AIX, huge box, painfully slow.
Date
Msg-id 17515.1144450340@sss.pgh.pa.us
In response to Re: pg 8.1.3, AIX, huge box, painfully slow.  (Gavin Hamill <gdh@laterooms.com>)
Responses Re: pg 8.1.3, AIX, huge box, painfully slow.  (Gavin Hamill <gdh@laterooms.com>)
Re: pg 8.1.3, AIX, huge box, painfully slow.  (Brad Nicholson <bnichols@ca.afilias.info>)
List pgsql-performance

Gavin Hamill <gdh@laterooms.com> writes:
> On Fri, 07 Apr 2006 17:56:49 -0400
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is not good.  Did the semop storms coincide with visible
>> slowdown? (I'd assume so, but you didn't actually say...)

> Yes, there's a definite correlation here.. I attached truss to the
> main postmaster..
> ...
> And when I saw a flood of semop's for any particular PID, a second later
> the 'topas' process list would show that PID at 100% CPU ...

So apparently we've still got a problem with multiprocess contention for
an LWLock somewhere.  It's not the BufMgrLock because that's gone in 8.1.
It could be one of the finer-grain locks that are still there, or it
could be someplace else.

Are you in a position to try your workload using PG CVS tip?  There's a
nontrivial possibility that we've already fixed this --- a couple months
ago I did some work to reduce contention in the lock manager:

2005-12-11 16:02  tgl

    * src/: backend/access/transam/twophase.c,
    backend/storage/ipc/procarray.c, backend/storage/lmgr/README,
    backend/storage/lmgr/deadlock.c, backend/storage/lmgr/lock.c,
    backend/storage/lmgr/lwlock.c, backend/storage/lmgr/proc.c,
    include/storage/lock.h, include/storage/lwlock.h,
    include/storage/proc.h: Divide the lock manager's shared state into
    'partitions', so as to reduce contention for the former single
    LockMgrLock.  Per my recent proposal.  I set it up for 16
    partitions, but on a pgbench test this gives only a marginal
    further improvement over 4 partitions --- we need to test more
    scenarios to choose the number of partitions.
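
For readers unfamiliar with the idea in that commit message, here is a minimal
sketch of lock partitioning. It is an illustration only, written in Python
rather than the backend's C, and it is not the code from lock.c: the class
name, the method names, and the CRC32 hash are all assumptions made for the
example. The only details taken from the commit message are that the shared
state is split into partitions and that 16 was the initial count.

```python
import threading
import zlib

NUM_PARTITIONS = 16  # initial count named in the commit message


class PartitionedLockTable:
    """Toy lock table whose shared state is split into independent partitions.

    Hypothetical illustration; not PostgreSQL's actual data structure.
    """

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.num_partitions = num_partitions
        # One mutex and one table per partition, instead of a single
        # global mutex guarding one big table.
        self._mutexes = [threading.Lock() for _ in range(num_partitions)]
        self._tables = [dict() for _ in range(num_partitions)]

    def partition_for(self, lock_tag):
        # Assumed hash choice: CRC32 of the tag's text form. Any stable
        # hash would do; the same tag must always map to the same partition.
        return zlib.crc32(repr(lock_tag).encode()) % self.num_partitions

    def acquire(self, lock_tag, holder):
        """Record holder as owner of lock_tag; return False if already held."""
        p = self.partition_for(lock_tag)
        with self._mutexes[p]:  # only this partition is serialized
            if lock_tag in self._tables[p]:
                return False
            self._tables[p][lock_tag] = holder
            return True

    def release(self, lock_tag, holder):
        """Drop the entry if holder owns it; return whether it was dropped."""
        p = self.partition_for(lock_tag)
        with self._mutexes[p]:
            if self._tables[p].get(lock_tag) == holder:
                del self._tables[p][lock_tag]
                return True
            return False
```

Two processes touching lock tags that hash to different partitions never wait
on the same mutex, which is where the reduction in contention would come from.
Tags that land in the same partition still serialize, so the benefit depends
on how the workload's locks spread across partitions.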

This is unfortunately not going to help you as far as getting that
machine into production now (unless you're brave enough to run CVS tip
as production, which I certainly am not).  I'm afraid you're most likely
going to have to ship that pSeries back at the end of the month, but
while you've got it, it'd be awfully nice if we could use it as a testbed
...

            regards, tom lane
