Re: User concurrency thresholding: where do I look? - Mailing list pgsql-performance
From | Jignesh K. Shah |
---|---|
Subject | Re: User concurrency thresholding: where do I look? |
Date | |
Msg-id | 46A8F9AA.4040507@sun.com Whole thread Raw |
In response to | Re: User concurrency thresholding: where do I look? ("Simon Riggs" <simon@2ndquadrant.com>) |
Responses |
Re: User concurrency thresholding: where do I look?
|
List | pgsql-performance |
BEAUTIFUL!!! Using the Patch I can now go upto 1300 users without dropping.. But now it still repeats at 1300-1350 users.. I corrected the Lock Descriptions based on what I got from lwlock.h and retried the whole scalability again with profiling again.. This time it looks like the ProcArrayLock bash-3.00# echo 600 users 600 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count XidGenLock Exclusive 1 CLogControlLock Shared 2 XidGenLock Shared 2 WALWriteLock Exclusive 4 WALInsertLock Exclusive 8 CLogControlLock Exclusive 9 ProcArrayLock Exclusive 9 Lock Id Combined Time (ns) WALWriteLock 2842300 XidGenLock 4951000 CLogControlLock 11151800 WALInsertLock 13035600 ProcArrayLock 20040000 bash-3.00# echo 700 users 700 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count XidGenLock Exclusive 1 WALWriteLock Exclusive 2 XidGenLock Shared 2 CLogControlLock Shared 3 CLogControlLock Exclusive 8 WALInsertLock Exclusive 9 ProcArrayLock Exclusive 22 Lock Id Combined Time (ns) XidGenLock 4093300 WALWriteLock 4914800 WALInsertLock 7389100 ProcArrayLock 10248200 CLogControlLock 11989400 bash-3.00# echo 800 users 800 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count WALWriteLock Exclusive 1 XidGenLock Shared 2 CLogControlLock Shared 3 CLogControlLock Exclusive 7 WALInsertLock Exclusive 7 ProcArrayLock Exclusive 31 Lock Id Combined Time (ns) WALWriteLock 319100 XidGenLock 5388700 WALInsertLock 9901400 CLogControlLock 13465000 ProcArrayLock 42979700 bash-3.00# echo 900 users 900 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count CLogControlLock Shared 1 XidGenLock Exclusive 1 WALWriteLock Exclusive 2 CLogControlLock Exclusive 6 WALInsertLock Exclusive 9 ProcArrayLock Exclusive 25 Lock Id Combined Time (ns) XidGenLock 3197700 WALWriteLock 3887100 CLogControlLock 15774500 WALInsertLock 38268700 ProcArrayLock 162073100 bash-3.00# ./6_lwlock_stack.d 4 7056 Lock Id Mode Count ProcArrayLock Shared 1 ProcArrayLock Exclusive 67 Lock Id Combined Time (ns) ProcArrayLock 216773800 Lock Id Combined Time (ns) postgres`LWLockAcquire+0x1c8 postgres`GetSnapshotData+0x118 postgres`GetTransactionSnapshot+0x5c postgres`PortalStart+0x150 postgres`exec_bind_message+0x81c postgres`PostgresMain+0x12b8 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 postgres`_start+0x108 2779000 postgres`LWLockAcquire+0x1c8 postgres`CommitTransaction+0xe0 postgres`CommitTransactionCommand+0x90 postgres`finish_xact_command+0x60 postgres`exec_execute_message+0x3d8 postgres`PostgresMain+0x1300 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 postgres`_start+0x108 213994800 bash-3.00# echo 1000 users - SLIGHT DROP 1000 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count WALWriteLock Exclusive 1 CLogControlLock Exclusive 2 XidGenLock Shared 2 CLogControlLock Shared 3 WALInsertLock Exclusive 4 ProcArrayLock Exclusive 26 Lock Id Combined Time (ns) WALWriteLock 1807400 XidGenLock 2024000 WALInsertLock 2177500 CLogControlLock 9064700 ProcArrayLock 199216000 bash-3.00# ./6_lwlock_stack.d 4 7056 Lock Id Mode Count ProcArrayLock Shared 3 ProcArrayLock Exclusive 38 Lock Id Combined Time (ns) ProcArrayLock 858238600 Lock Id Combined Time (ns) postgres`LWLockAcquire+0x1c8 postgres`TransactionIdIsInProgress+0x50 postgres`HeapTupleSatisfiesVacuum+0x2ec postgres`_bt_check_unique+0x2a0 postgres`_bt_doinsert+0x98 postgres`btinsert+0x54 postgres`FunctionCall6+0x44 postgres`index_insert+0x90 postgres`ExecInsertIndexTuples+0x1bc postgres`ExecUpdate+0x500 postgres`ExecutePlan+0x704 postgres`ExecutorRun+0x60 postgres`PortalRunMulti+0x2a0 postgres`PortalRun+0x310 postgres`exec_execute_message+0x3a0 postgres`PostgresMain+0x1300 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 167100 postgres`LWLockAcquire+0x1c8 postgres`GetSnapshotData+0x118 postgres`GetTransactionSnapshot+0x5c postgres`PortalRunMulti+0x22c postgres`PortalRun+0x310 postgres`exec_execute_message+0x3a0 postgres`PostgresMain+0x1300 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 postgres`_start+0x108 7125900 postgres`LWLockAcquire+0x1c8 postgres`CommitTransaction+0xe0 postgres`CommitTransactionCommand+0x90 postgres`finish_xact_command+0x60 postgres`exec_execute_message+0x3d8 postgres`PostgresMain+0x1300 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 postgres`_start+0x108 850945600 bash-3.00# echo 1100 users - DROP .... 1100 users bash-3.00# ./4_lwlock_waits.d 7056 Lock Id Mode Count CLogControlLock Shared 1 WALWriteLock Exclusive 1 XidGenLock Exclusive 1 ProcArrayLock Shared 2 WALInsertLock Exclusive 2 XidGenLock Shared 2 CLogControlLock Exclusive 4 ProcArrayLock Exclusive 20 Lock Id Combined Time (ns) WALWriteLock 4179500 XidGenLock 6249400 CLogControlLock 20411100 WALInsertLock 29707600 ProcArrayLock 207923700 bash-3.00# ./6_lwlock_stack.d 4 7056 Lock Id Mode Count ProcArrayLock Exclusive 40 Lock Id Combined Time (ns) ProcArrayLock 692242100 Lock Id Combined Time (ns) postgres`LWLockAcquire+0x1c8 postgres`CommitTransaction+0xe0 postgres`CommitTransactionCommand+0x90 postgres`finish_xact_command+0x60 postgres`exec_execute_message+0x3d8 postgres`PostgresMain+0x1300 postgres`BackendRun+0x278 postgres`ServerLoop+0x63c postgres`PostmasterMain+0xc40 postgres`main+0x394 postgres`_start+0x108 692242100 bash-3.00# LockID for ProcArrayLock is 4 or the 5 entry in lwlock.h which seems to indicate it is lwlock.h Any tweaks for that? -Jignesh Simon Riggs wrote: > On Thu, 2007-07-26 at 11:27 -0400, Jignesh K. Shah wrote: > > >> However at 900 Users where the big drop in throughput occurs: >> It gives a different top "consumer" of time: >> > > > postgres`LWLockAcquire+0x1c8 > >> postgres`SimpleLruReadPage+0x1ac >> postgres`TransactionIdGetStatus+0x14 >> postgres`TransactionLogFetch+0x58 >> > > TransactionIdGetStatus doesn't directly call SimpleLruReadPage(). > Presumably the compiler has been rearranging things?? > > Looks like you're out of clog buffers. It seems like the clog buffers > aren't big enough to hold clog pages for long enough and the SELECT FOR > SHARE processing is leaving lots of additional read locks that are > increasing the number of clog requests for older xids. > > Try the enclosed patch. > > > ------------------------------------------------------------------------ > > Index: src/include/access/clog.h > =================================================================== > RCS file: /projects/cvsroot/pgsql/src/include/access/clog.h,v > retrieving revision 1.19 > diff -c -r1.19 clog.h > *** src/include/access/clog.h 5 Jan 2007 22:19:50 -0000 1.19 > --- src/include/access/clog.h 26 Jul 2007 15:44:58 -0000 > *************** > *** 29,35 **** > > > /* Number of SLRU buffers to use for clog */ > ! #define NUM_CLOG_BUFFERS 8 > > > extern void TransactionIdSetStatus(TransactionId xid, XidStatus status); > --- 29,35 ---- > > > /* Number of SLRU buffers to use for clog */ > ! #define NUM_CLOG_BUFFERS 64 > > > extern void TransactionIdSetStatus(TransactionId xid, XidStatus status); > > ------------------------------------------------------------------------ > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
pgsql-performance by date: