ProcArrayLock (The Saga continues) - Mailing list pgsql-performance
From | Jignesh K. Shah |
---|---|
Subject | ProcArrayLock (The Saga continues) |
Date | |
Msg-id | 483F294A.9040108@sun.com |
Responses | Re: ProcArrayLock (The Saga continues) |
List | pgsql-performance |
Based on feedback after the sessions, I did a few more tests which might be useful to share.

One suggestion was to get each client to do more work and reduce the number of clients. The igen benchmark is flexible, so I removed all think time from it and repeated the test until scalability stopped. (This was done with CVS downloaded yesterday.)

Note that with no think time, each client can be about 75% CPU busy from what I observed. Running it, I found that client scaling now saturates at about 60 (compared to 500 in the original test). The peak throughput was at about 50 users (using synchronous_commit=off).

Here is the interesting DTrace lock output: lock id, mode of lock, and time in ns spent waiting for the lock in a 10-sec snapshot (just the last few top ones, in ascending order).

With less than 20 users it is WALInsert at the top:

Lock Id | Mode | Wait Time (ns) |
---|---|---|
52 | Exclusive | 721950129 |
4 | Exclusive | 768537190 |
46 | Exclusive | 842063837 |
7 | Exclusive | 1031851713 |

With 35 users:

Lock Id | Mode | Wait Time (ns) |
---|---|---|
52 | Exclusive | 2599074739 |
4 | Exclusive | 2647927574 |
46 | Exclusive | 2789581991 |
7 | Exclusive | 3220008691 |

At about 50 users, the peak throughput that I saw earlier:

Lock Id | Mode | Wait Time (ns) |
---|---|---|
46 | Exclusive | 3669210393 |
4 | Exclusive | 6024966938 |
52 | Exclusive | 6529168107 |
7 | Exclusive | 9408290367 |

With about 60 users, where the throughput actually starts to drop:

Lock Id | Mode | Wait Time (ns) |
---|---|---|
41 | Exclusive | 4570660567 |
52 | Exclusive | 10706741643 |
46 | Exclusive | 13152005125 |
4 | Exclusive | 13550187806 |
7 | Exclusive | 22146882562 |

With about 100 users (below the peak value):

Lock Id | Mode | Wait Time (ns) |
---|---|---|
42 | Exclusive | 4238582775 |
46 | Exclusive | 6773515243 |
7 | Exclusive | 7467346038 |
52 | Exclusive | 9846216440 |
4 | Shared | 22528501166 |
4 | Exclusive | 223043774037 |

So it seems that when both the Shared and Exclusive wait times for ProcArrayLock are the top two, the system is basically saturated in terms of the throughput it can handle. Optimizing wait queues will help improve Shared, which might help Exclusive a bit, but eventually Exclusive waits on ProcArrayLock will limit scaling with as few as 60-70 users.
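To make the wait figures easier to compare across client counts, they can be normalized by the total backend time available in each snapshot. The sketch below is only an illustration: it assumes one backend per client and that every backend is alive for the full 10-second snapshot, and it uses the approximate client counts quoted above. The names `lock4_wait_ns` and `wait_fraction` are hypothetical helpers, not part of the DTrace script.

```python
# Lock 4 (ProcArrayLock) wait time in ns per 10-second snapshot,
# copied from the wait-time figures above. Client counts are the
# approximate ones quoted ("about 50", "about 60", ...).
SNAPSHOT_NS = 10 * 1_000_000_000

lock4_wait_ns = {
    35: 2647927574,
    50: 6024966938,
    60: 13550187806,
    100: 22528501166 + 223043774037,  # Shared + Exclusive
}


def wait_fraction(users, wait_ns, snapshot_ns=SNAPSHOT_NS):
    """Share of total backend wall-clock time spent waiting on the lock.

    Assumption: one backend per client, each running for the whole snapshot.
    """
    return wait_ns / (users * snapshot_ns)


fractions = {u: wait_fraction(u, w) for u, w in lock4_wait_ns.items()}

for users, frac in fractions.items():
    print(f"{users:>3} users: {frac:.2%} of backend time waiting on lock 4")
```

Under these assumptions the share of backend time spent waiting on lock 4 stays below 1% at 35 users and grows to roughly a quarter of all backend time at about 100 users.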
Lock hold times are below (though taken from a different run).

With 30 users:

Lock Id | Mode | Combined Time (ns) |
---|---|---|
1616992 | Exclusive | 1199791629 |
4 | Exclusive | 1399371867 |
34 | Exclusive | 1426153620 |
1616978 | Exclusive | 1528327035 |
1616990 | Exclusive | 1546374298 |
1616988 | Exclusive | 1553461559 |
5 | Exclusive | 2477558484 |

With 50+ users:

Lock Id | Mode | Combined Time (ns) |
---|---|---|
4 | Exclusive | 1438509198 |
1616992 | Exclusive | 1450973466 |
1616978 | Exclusive | 1505626978 |
1616990 | Exclusive | 1850432217 |
1616988 | Exclusive | 2033226225 |
34 | Exclusive | 2098542547 |
5 | Exclusive | 3280151374 |

With 100 users:

Lock Id | Mode | Combined Time (ns) |
---|---|---|
1616992 | Exclusive | 1206516505 |
1616988 | Exclusive | 1486704087 |
1616990 | Exclusive | 1521900997 |
34 | Exclusive | 1532815803 |
1616978 | Exclusive | 1541986895 |
5 | Exclusive | 2179043424 |
5 | | 2395098279 |

(Why was 5 printing with a blank mode??)

Rerunning it with a slight variation of the script:

Lock Id | Mode | Combined Time (ns) |
---|---|---|
1616996 | 0 | 1167708953 |
36 | 0 | 1291958451 |
5 | 4299305160 | 1344486968 |
4 | 0 | 1347557908 |
1616978 | 0 | 1377931882 |
34 | 0 | 1724752938 |
5 | 0 | 2079012548 |

The trend of 4's hold time looks similar to the previous ones, though the new kid is 5 with a mode other than 0 or 1. Not sure if that is causing problems. What mode is "4299305160" for lock 5 (SInvalLock)?

Anyway, at this point the wait time for 4 increases to a point where the database is not scaling anymore. Any thoughts?

-Jignesh
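The contrast between hold time and wait time for lock 4 can be put into numbers with a short sketch. It is only a rough comparison: the hold and wait figures come from different runs, the rerun is assumed to be at 100 users, and mode 0 in the rerun is treated as Exclusive, which is an assumption. The dictionary names are hypothetical.

```python
# Lock 4 combined hold time (ns), from the hold-time runs quoted above.
lock4_hold_ns = {
    "30 users": 1399371867,
    "50+ users": 1438509198,
    "100 users (rerun, mode 0 assumed Exclusive)": 1347557908,
}

# Lock 4 Exclusive wait time (ns), from the separate wait-time run.
lock4_excl_wait_ns = {
    "<20 users": 768537190,
    "~100 users": 223043774037,
}

# How much the hold time varies across runs (largest over smallest).
hold_spread = max(lock4_hold_ns.values()) / min(lock4_hold_ns.values())

# How much the Exclusive wait time grows from the lightest to the heaviest load.
wait_growth = lock4_excl_wait_ns["~100 users"] / lock4_excl_wait_ns["<20 users"]

print(f"hold time max/min ratio: {hold_spread:.2f}")
print(f"exclusive wait growth:   {wait_growth:.0f}x")
```

With these inputs the hold time for lock 4 varies by less than 10% across the runs, while the Exclusive wait time grows by more than two orders of magnitude, which matches the observation that the lock is not held longer but is queued on far more heavily.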