Thread: Newbie performance problem - semop taking most of time ?

Newbie performance problem - semop taking most of time ?

From
"mal.oracledba"
Date:
Using postgreSQL 9.2 with the following settings:

max_connections = 1000                  # (change requires restart)
shared_buffers = 65536MB                # min 128kB
work_mem = 16MB                         # min 64kB
effective_io_concurrency = 48           # 1-1000; 0 disables prefetching
wal_buffers = 16MB                      # min 32kB, -1 sets based on
shared_buffers
commit_delay = 10000                    # range 0-100000, in microseconds
commit_siblings = 100                   # range 1-1000
checkpoint_segments = 256               # in logfile segments, min 1, 16MB
each
checkpoint_timeout = 10min              # range 30s-1h
checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 -
1.0
effective_cache_size = 65536MB

Server spec is Xeon 8 Core/16threads, 512Gb memory. Database size is 100G.
Underlying SAN is raid 1/0.
OS is Linux Redhat 6.2. Kernel 2.6.32

Running hammer ora TPC-C type load. Under 20 user load (no key and think)
getting approx 180,000 TPM - which is about half of what I get with another
database vendor.

tracing the process (strace -r) I get outtput like that below - a lot of the
time seems to be doing semop type operations (eg 0.001299 semop(13369414,
{{3, -1, 0}}, 1) = 0)

Just wondered if anyone could tell me what is going on there and any
possibilities that I might have to decrease this wait time ?


     0.000176 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000031 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.001102 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000049 semop(13369414, {{4, 1, 0}}, 1) = 0
     0.000405 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000049 semop(13369414, {{10, 1, 0}}, 1) = 0
     0.000337 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000057 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000074 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000779 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000847 sendto(10,
"T\0\0\0\37\0\1neword\0\0\0\0\0\0\0\0\0\6\244\377\377\377\377\377\377\0\0"...,
63, 0, NULL, 0) = 63
     0.000063 recvfrom(10, "Q\0\0\0(select neword(142,1001,8,23"..., 8192,
0, NULL, NULL) = 41
     0.000463 lseek(12, 0, SEEK_END)    = 52486144
     0.000057 lseek(13, 0, SEEK_END)    = 6356992
     0.001299 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000864 semop(13402183, {{2, 1, 0}}, 1) = 0
     0.000420 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000675 semop(13402183, {{7, 1, 0}}, 1) = 0
     0.000445 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.000156 semop(13369414, {{3, -1, 0}}, 1) = 0
     0.001458 semop(13369414, {{6, 1, 0}}, 1) = 0

Cheers,

Mal





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Newbie-performance-problem-semop-taking-most-of-time-tp5724566.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: Newbie performance problem - semop taking most of time ?

From
Daniel Farina
Date:
On Wed, Sep 19, 2012 at 5:34 AM, mal.oracledba <mal.oracledba@gmail.com> wrote:
> Running hammer ora TPC-C type load. Under 20 user load (no key and think)
> getting approx 180,000 TPM - which is about half of what I get with another
> database vendor.
>
> tracing the process (strace -r) I get outtput like that below - a lot of the
> time seems to be doing semop type operations (eg 0.001299 semop(13369414,
> {{3, -1, 0}}, 1) = 0)
>
> Just wondered if anyone could tell me what is going on there and any
> possibilities that I might have to decrease this wait time ?

I'm don't think system-call traces alone are enough to find a
performance issue; if using a sufficiently new Linux I'd highly
recommend posting the results of the tool 'perf'.  Robert Haas writes
some of his favorite incantations of it here:

http://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html

You might also want to offer some qualitative information...for
example, does the problem seem to be contention (wherein there is
spare CPU time that should be getting used, but isn't) or maybe just
too many cycles are being expended by Postgres vs Your Other Database
Vendor.

--
fdr


Re: Newbie performance problem - semop taking most of time ?

From
"mal.oracledba"
Date:
CPU on the server shows approx 60% used/40 % idle


 vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
wa st
18  0      0 160465072 455200 352185248    0    0    37   421    2    0  4
1 95  0  0
14  1      0 160321088 455200 352315040    0    0     0 30559 30496 48583 48
4 46  1  0
 1  0      0 160153584 455204 352466624    0    0    13 57266 35147 56949 56
5 38  2  0
19  1      0 160030544 455204 352577504    0    0    13 27765 27924 41981 41
3 54  1  0
10  0      0 159862800 455204 352731776    0    0    29 37807 35591 58193 57
5 37  2  0



A snapshot of 'perf' output for one of the users below. It doesnt change
much from that profile.



   PerfTop:     319 irqs/sec  kernel: 7.5%  exact:  0.0% [1000Hz cycles],
(target_pid: 18450)

---------------------------------------------------------------------------------------------------------------------------------------------

             samples  pcnt function                    DSO
             _______ _____ ___________________________
________________________________________________

              235.00  7.5% AllocSetAlloc
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
              155.00  5.0% SearchCatCache
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
              134.00  4.3% hash_search_with_hash_value
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
              128.00  4.1% LWLockAcquire
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
              110.00  3.5% _bt_compare
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
              106.00  3.4% __memcpy_ssse3_back         /lib64/libc-2.12.so
              106.00  3.4% XLogInsert
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               92.00  2.9% cmp_numerics
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               73.00  2.3% ExecInitExpr
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               64.00  2.0% MemoryContextAlloc
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               61.00  1.9% GetSnapshotData
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               60.00  1.9% _int_malloc                 /lib64/libc-2.12.so
               58.00  1.9% nocache_index_getattr
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               49.00  1.6% PinBuffer
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres
               49.00  1.6% cmp_abs_common
/post/PostgreSQL2/postgresql-9.2rc1/bin/postgres



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Newbie-performance-problem-semop-taking-most-of-time-tp5724566p5725062.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: Newbie performance problem - semop taking most of time ?

From
Tom Lane
Date:
"mal.oracledba" <mal.oracledba@gmail.com> writes:
> A snapshot of 'perf' output for one of the users below. It doesnt change
> much from that profile.

You might want to look into whether you could use int or bigint instead
of numeric for your indexed columns.  That trace is showing 4.5% of
runtime directly blamable on cmp_numerics plus its subroutine
cmp_abs_common, and I'll bet that a noticeable fraction of the
AllocSetAlloc, memcpy, and malloc traffic is attributable to numeric
operations too.  You could probably not expect to get much more than 5%
savings, but still, if you don't need fractions then that's 5% that's
just being wasted.

The bigger picture here though is a lot of context swaps, and I wonder
how much of that is blamable on your having activated commit_delay
(especially with such silly settings as you chose ... more is not
better there).

The fact that XLogInsert shows up high on the profile might indicate
that contention for WALInsertLock is a factor, though it's hardly proof
of that.  If that's the main problem there's not too much that can be
done about it in the near term.  (There's some work going on to reduce
contention for that lock, but it's not done yet, let alone in 9.2.)
In a real application you could possibly reduce the problem by
rearranging operations, but that would be cheating of course for a
benchmark.

            regards, tom lane