Further reduction of bufmgr lock contention - Mailing list pgsql-hackers
From: Tom Lane
Subject: Further reduction of bufmgr lock contention
Msg-id: 12051.1145638896@sss.pgh.pa.us
Responses: Re: Further reduction of bufmgr lock contention
List: pgsql-hackers
I've been looking into Gavin Hamill's recent report of poor performance
with PG 8.1 on an 8-way IBM PPC64 box. strace'ing backends shows a lot
of semop() calls, indicating blocking at the LWLock or lmgr-lock levels,
but not a lot of select() delays, suggesting we don't have too much of a
problem at the hardware spinlock level. A typical breakdown of different
kernel call types is

    566 _llseek
     10 brk
     10 gettimeofday
      4 mmap
      4 munmap
    562 read
      4 recv
      8 select
   3014 semop
     12 send
      1 time
      3 write

(I'm hoping to get some oprofile results to confirm there's nothing
strange going on at the hardware level, but no luck yet on getting
oprofile to work on Debian/PPC64 ... anyone know anything about
suitable kernels to use for that?)

Instrumenting LWLockAcquire (with a patch I had developed last fall,
but just now got around to cleaning up and committing to CVS) shows
that the contention is practically all for the BufMappingLock:

$ grep ^PID postmaster.log | sort +9nr | head -20
PID 23820 lwlock 0: shacq 2446470 exacq 6154 blk 12755
PID 23823 lwlock 0: shacq 2387597 exacq 4297 blk 9255
PID 23824 lwlock 0: shacq 1678694 exacq 4433 blk 8692
PID 23826 lwlock 0: shacq 1221221 exacq 3224 blk 5893
PID 23821 lwlock 0: shacq 1892453 exacq 1665 blk 5766
PID 23835 lwlock 0: shacq 2390685 exacq 1453 blk 5511
PID 23822 lwlock 0: shacq 1669419 exacq 1615 blk 4926
PID 23830 lwlock 0: shacq 1039468 exacq 1248 blk 2946
PID 23832 lwlock 0: shacq 698622 exacq 397 blk 1818
PID 23836 lwlock 0: shacq 544472 exacq 530 blk 1300
PID 23839 lwlock 0: shacq 497505 exacq 46 blk 885
PID 23842 lwlock 0: shacq 305281 exacq 1 blk 720
PID 23840 lwlock 0: shacq 317554 exacq 226 blk 355
PID 23840 lwlock 2: shacq 0 exacq 2872 blk 7
PID 23835 lwlock 2: shacq 0 exacq 3434 blk 6
PID 23835 lwlock 1: shacq 0 exacq 1452 blk 4
PID 23822 lwlock 1: shacq 0 exacq 1614 blk 3
PID 23820 lwlock 2: shacq 0 exacq 3582 blk 2
PID 23821 lwlock 1: shacq 0 exacq 1664 blk 2
PID 23830 lwlock 1: shacq 0 exacq 1247 blk 2

These numbers show
that our rewrite of the bufmgr has done a great job of cutting down the
amount of potential contention --- most of the traffic on this lock is
shared rather than exclusive acquisitions --- but it seems that if you
have enough CPUs it's still not good enough. (My best theory as to why
Gavin is seeing better performance from a dual Opteron is simply that
2 processors will have 1/4th as much contention as 8 processors.)

I have an idea about how to improve matters: I think we could break the
buffer tag to buffer mapping hashtable into multiple partitions based
on some hash value of the buffer tags, and protect each partition under
a separate LWLock, similar to what we did with the lmgr lock table not
long ago.

Anyone have a comment on this strategy, or a better idea?

			regards, tom lane