Thread: Newbie performance problem - semop taking most of time ?
Using postgreSQL 9.2 with the following settings: max_connections = 1000 # (change requires restart) shared_buffers = 65536MB # min 128kB work_mem = 16MB # min 64kB effective_io_concurrency = 48 # 1-1000; 0 disables prefetching wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers commit_delay = 10000 # range 0-100000, in microseconds commit_siblings = 100 # range 1-1000 checkpoint_segments = 256 # in logfile segments, min 1, 16MB each checkpoint_timeout = 10min # range 30s-1h checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0 effective_cache_size = 65536MB Server spec is Xeon 8 Core/16threads, 512Gb memory. Database size is 100G. Underlying SAN is raid 1/0. OS is Linux Redhat 6.2. Kernel 2.6.32 Running hammer ora TPC-C type load. Under 20 user load (no key and think) getting approx 180,000 TPM - which is about half of what I get with another database vendor. tracing the process (strace -r) I get outtput like that below - a lot of the time seems to be doing semop type operations (eg 0.001299 semop(13369414, {{3, -1, 0}}, 1) = 0) Just wondered if anyone could tell me what is going on there and any possibilities that I might have to decrease this wait time ? 0.000176 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000031 semop(13369414, {{3, -1, 0}}, 1) = 0 0.001102 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000049 semop(13369414, {{4, 1, 0}}, 1) = 0 0.000405 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000049 semop(13369414, {{10, 1, 0}}, 1) = 0 0.000337 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000057 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000074 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000779 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000847 sendto(10, "T\0\0\0\37\0\1neword\0\0\0\0\0\0\0\0\0\6\244\377\377\377\377\377\377\0\0"..., 63, 0, NULL, 0) = 63 0.000063 recvfrom(10, "Q\0\0\0(select neword(142,1001,8,23"..., 8192, 0, NULL, NULL) = 41 0.000463 lseek(12, 0, SEEK_END) = 52486144 0.000057 lseek(13, 0, SEEK_END) = 6356992 0.001299 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000864 semop(13402183, {{2, 1, 0}}, 1) = 0 0.000420 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000675 semop(13402183, {{7, 1, 0}}, 1) = 0 0.000445 semop(13369414, {{3, -1, 0}}, 1) = 0 0.000156 semop(13369414, {{3, -1, 0}}, 1) = 0 0.001458 semop(13369414, {{6, 1, 0}}, 1) = 0 Cheers, Mal -- View this message in context: http://postgresql.1045698.n5.nabble.com/Newbie-performance-problem-semop-taking-most-of-time-tp5724566.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
On Wed, Sep 19, 2012 at 5:34 AM, mal.oracledba <mal.oracledba@gmail.com> wrote: > Running hammer ora TPC-C type load. Under 20 user load (no key and think) > getting approx 180,000 TPM - which is about half of what I get with another > database vendor. > > tracing the process (strace -r) I get outtput like that below - a lot of the > time seems to be doing semop type operations (eg 0.001299 semop(13369414, > {{3, -1, 0}}, 1) = 0) > > Just wondered if anyone could tell me what is going on there and any > possibilities that I might have to decrease this wait time ? I'm don't think system-call traces alone are enough to find a performance issue; if using a sufficiently new Linux I'd highly recommend posting the results of the tool 'perf'. Robert Haas writes some of his favorite incantations of it here: http://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html You might also want to offer some qualitative information...for example, does the problem seem to be contention (wherein there is spare CPU time that should be getting used, but isn't) or maybe just too many cycles are being expended by Postgres vs Your Other Database Vendor. -- fdr
CPU on the server shows approx 60% used/40 % idle vmstat 5 5 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 18 0 0 160465072 455200 352185248 0 0 37 421 2 0 4 1 95 0 0 14 1 0 160321088 455200 352315040 0 0 0 30559 30496 48583 48 4 46 1 0 1 0 0 160153584 455204 352466624 0 0 13 57266 35147 56949 56 5 38 2 0 19 1 0 160030544 455204 352577504 0 0 13 27765 27924 41981 41 3 54 1 0 10 0 0 159862800 455204 352731776 0 0 29 37807 35591 58193 57 5 37 2 0 A snapshot of 'perf' output for one of the users below. It doesnt change much from that profile. PerfTop: 319 irqs/sec kernel: 7.5% exact: 0.0% [1000Hz cycles], (target_pid: 18450) --------------------------------------------------------------------------------------------------------------------------------------------- samples pcnt function DSO _______ _____ ___________________________ ________________________________________________ 235.00 7.5% AllocSetAlloc /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 155.00 5.0% SearchCatCache /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 134.00 4.3% hash_search_with_hash_value /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 128.00 4.1% LWLockAcquire /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 110.00 3.5% _bt_compare /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 106.00 3.4% __memcpy_ssse3_back /lib64/libc-2.12.so 106.00 3.4% XLogInsert /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 92.00 2.9% cmp_numerics /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 73.00 2.3% ExecInitExpr /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 64.00 2.0% MemoryContextAlloc /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 61.00 1.9% GetSnapshotData /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 60.00 1.9% _int_malloc /lib64/libc-2.12.so 58.00 1.9% nocache_index_getattr /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 49.00 1.6% PinBuffer /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres 49.00 1.6% cmp_abs_common /post/PostgreSQL2/postgresql-9.2rc1/bin/postgres -- View this message in context: http://postgresql.1045698.n5.nabble.com/Newbie-performance-problem-semop-taking-most-of-time-tp5724566p5725062.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
"mal.oracledba" <mal.oracledba@gmail.com> writes: > A snapshot of 'perf' output for one of the users below. It doesnt change > much from that profile. You might want to look into whether you could use int or bigint instead of numeric for your indexed columns. That trace is showing 4.5% of runtime directly blamable on cmp_numerics plus its subroutine cmp_abs_common, and I'll bet that a noticeable fraction of the AllocSetAlloc, memcpy, and malloc traffic is attributable to numeric operations too. You could probably not expect to get much more than 5% savings, but still, if you don't need fractions then that's 5% that's just being wasted. The bigger picture here though is a lot of context swaps, and I wonder how much of that is blamable on your having activated commit_delay (especially with such silly settings as you chose ... more is not better there). The fact that XLogInsert shows up high on the profile might indicate that contention for WALInsertLock is a factor, though it's hardly proof of that. If that's the main problem there's not too much that can be done about it in the near term. (There's some work going on to reduce contention for that lock, but it's not done yet, let alone in 9.2.) In a real application you could possibly reduce the problem by rearranging operations, but that would be cheating of course for a benchmark. regards, tom lane