Re: User concurrency thresholding: where do I look? - Mailing list pgsql-performance

From Jignesh K. Shah
Subject Re: User concurrency thresholding: where do I look?
Date
Msg-id 46A8F9AA.4040507@sun.com
Whole thread Raw
In response to Re: User concurrency thresholding: where do I look?  ("Simon Riggs" <simon@2ndquadrant.com>)
Responses Re: User concurrency thresholding: where do I look?  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-performance
BEAUTIFUL!!!

Using the Patch I can now go upto 1300 users without dropping.. But now
it still repeats at 1300-1350 users..

I corrected the Lock Descriptions based on what I got from lwlock.h and
retried the whole scalability again with profiling again.. This time it
looks like the ProcArrayLock


bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
          XidGenLock       Exclusive               1
     CLogControlLock          Shared               2
          XidGenLock          Shared               2
        WALWriteLock       Exclusive               4
       WALInsertLock       Exclusive               8
     CLogControlLock       Exclusive               9
       ProcArrayLock       Exclusive               9

             Lock Id   Combined Time (ns)
        WALWriteLock              2842300
          XidGenLock              4951000
     CLogControlLock             11151800
       WALInsertLock             13035600
       ProcArrayLock             20040000

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
          XidGenLock       Exclusive               1
        WALWriteLock       Exclusive               2
          XidGenLock          Shared               2
     CLogControlLock          Shared               3
     CLogControlLock       Exclusive               8
       WALInsertLock       Exclusive               9
       ProcArrayLock       Exclusive              22

             Lock Id   Combined Time (ns)
          XidGenLock              4093300
        WALWriteLock              4914800
       WALInsertLock              7389100
       ProcArrayLock             10248200
     CLogControlLock             11989400

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
        WALWriteLock       Exclusive               1
          XidGenLock          Shared               2
     CLogControlLock          Shared               3
     CLogControlLock       Exclusive               7
       WALInsertLock       Exclusive               7
       ProcArrayLock       Exclusive              31

             Lock Id   Combined Time (ns)
        WALWriteLock               319100
          XidGenLock              5388700
       WALInsertLock              9901400
     CLogControlLock             13465000
       ProcArrayLock             42979700

bash-3.00# echo 900 users
900 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
     CLogControlLock          Shared               1
          XidGenLock       Exclusive               1
        WALWriteLock       Exclusive               2
     CLogControlLock       Exclusive               6
       WALInsertLock       Exclusive               9
       ProcArrayLock       Exclusive              25

             Lock Id   Combined Time (ns)
          XidGenLock              3197700
        WALWriteLock              3887100
     CLogControlLock             15774500
       WALInsertLock             38268700
       ProcArrayLock            162073100

bash-3.00# ./6_lwlock_stack.d 4 7056

             Lock Id            Mode           Count
       ProcArrayLock          Shared               1
       ProcArrayLock       Exclusive              67

             Lock Id   Combined Time (ns)
       ProcArrayLock            216773800

             Lock Id   Combined Time (ns)


              postgres`LWLockAcquire+0x1c8
              postgres`GetSnapshotData+0x118
              postgres`GetTransactionSnapshot+0x5c
              postgres`PortalStart+0x150
              postgres`exec_bind_message+0x81c
              postgres`PostgresMain+0x12b8
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
          2779000

              postgres`LWLockAcquire+0x1c8
              postgres`CommitTransaction+0xe0
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
        213994800

bash-3.00# echo 1000 users - SLIGHT DROP
1000 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
        WALWriteLock       Exclusive               1
     CLogControlLock       Exclusive               2
          XidGenLock          Shared               2
     CLogControlLock          Shared               3
       WALInsertLock       Exclusive               4
       ProcArrayLock       Exclusive              26

             Lock Id   Combined Time (ns)
        WALWriteLock              1807400
          XidGenLock              2024000
       WALInsertLock              2177500
     CLogControlLock              9064700
       ProcArrayLock            199216000

bash-3.00# ./6_lwlock_stack.d 4 7056

             Lock Id            Mode           Count
       ProcArrayLock          Shared               3
       ProcArrayLock       Exclusive              38

             Lock Id   Combined Time (ns)
       ProcArrayLock            858238600

             Lock Id   Combined Time (ns)


              postgres`LWLockAcquire+0x1c8
              postgres`TransactionIdIsInProgress+0x50
              postgres`HeapTupleSatisfiesVacuum+0x2ec
              postgres`_bt_check_unique+0x2a0
              postgres`_bt_doinsert+0x98
              postgres`btinsert+0x54
              postgres`FunctionCall6+0x44
              postgres`index_insert+0x90
              postgres`ExecInsertIndexTuples+0x1bc
              postgres`ExecUpdate+0x500
              postgres`ExecutePlan+0x704
              postgres`ExecutorRun+0x60
              postgres`PortalRunMulti+0x2a0
              postgres`PortalRun+0x310
              postgres`exec_execute_message+0x3a0
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
           167100

              postgres`LWLockAcquire+0x1c8
              postgres`GetSnapshotData+0x118
              postgres`GetTransactionSnapshot+0x5c
              postgres`PortalRunMulti+0x22c
              postgres`PortalRun+0x310
              postgres`exec_execute_message+0x3a0
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
          7125900

              postgres`LWLockAcquire+0x1c8
              postgres`CommitTransaction+0xe0
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
        850945600

bash-3.00# echo 1100 users - DROP ....
1100 users
bash-3.00# ./4_lwlock_waits.d 7056

             Lock Id            Mode           Count
     CLogControlLock          Shared               1
        WALWriteLock       Exclusive               1
          XidGenLock       Exclusive               1
       ProcArrayLock          Shared               2
       WALInsertLock       Exclusive               2
          XidGenLock          Shared               2
     CLogControlLock       Exclusive               4
       ProcArrayLock       Exclusive              20

             Lock Id   Combined Time (ns)
        WALWriteLock              4179500
          XidGenLock              6249400
     CLogControlLock             20411100
       WALInsertLock             29707600
       ProcArrayLock            207923700

bash-3.00# ./6_lwlock_stack.d 4 7056

             Lock Id            Mode           Count
       ProcArrayLock       Exclusive              40

             Lock Id   Combined Time (ns)
       ProcArrayLock            692242100

             Lock Id   Combined Time (ns)


              postgres`LWLockAcquire+0x1c8
              postgres`CommitTransaction+0xe0
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
        692242100

bash-3.00#




LockID for ProcArrayLock is 4 or the 5 entry in lwlock.h which seems to
indicate it is lwlock.h
Any tweaks for that?

-Jignesh


Simon Riggs wrote:
> On Thu, 2007-07-26 at 11:27 -0400, Jignesh K. Shah wrote:
>
>
>> However at 900 Users where the big drop in throughput occurs:
>> It gives a different top "consumer" of time:
>>
>
>
>                 postgres`LWLockAcquire+0x1c8
>
>>               postgres`SimpleLruReadPage+0x1ac
>>               postgres`TransactionIdGetStatus+0x14
>>               postgres`TransactionLogFetch+0x58
>>
>
> TransactionIdGetStatus doesn't directly call SimpleLruReadPage().
> Presumably the compiler has been rearranging things??
>
> Looks like you're out of clog buffers. It seems like the clog buffers
> aren't big enough to hold clog pages for long enough and the SELECT FOR
> SHARE processing is leaving lots of additional read locks that are
> increasing the number of clog requests for older xids.
>
> Try the enclosed patch.
>
>
> ------------------------------------------------------------------------
>
> Index: src/include/access/clog.h
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/include/access/clog.h,v
> retrieving revision 1.19
> diff -c -r1.19 clog.h
> *** src/include/access/clog.h    5 Jan 2007 22:19:50 -0000    1.19
> --- src/include/access/clog.h    26 Jul 2007 15:44:58 -0000
> ***************
> *** 29,35 ****
>
>
>   /* Number of SLRU buffers to use for clog */
> ! #define NUM_CLOG_BUFFERS    8
>
>
>   extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
> --- 29,35 ----
>
>
>   /* Number of SLRU buffers to use for clog */
> ! #define NUM_CLOG_BUFFERS    64
>
>
>   extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
>
> ------------------------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

pgsql-performance by date:

Previous
From: Chris Browne
Date:
Subject: Re: performance of postgresql in replication using slony
Next
From: Robert Lor
Date:
Subject: Re: User concurrency thresholding: where do I look?