Re: [Testperf-general] Re: First set of OSDL Shared Mem scalability results, some wierdness ... - Mailing list pgsql-performance

Josh Berkus <josh@agliodbs.com> writes:
> First off, two test runs with OProfile are available at:
> http://khack.osdl.org/stp/298124/
> http://khack.osdl.org/stp/298121/

Hmm.  The stuff at roughly 1% and above in the first of these is:

Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        app name                 symbol name
8522858  19.7539  vmlinux                  default_idle
3510225   8.1359  vmlinux                  recalc_sigpending_tsk
1874601   4.3449  vmlinux                  .text.lock.signal
1653816   3.8331  postgres                 SearchCatCache
1080908   2.5053  postgres                 AllocSetAlloc
920369    2.1332  postgres                 AtEOXact_Buffers
806218    1.8686  postgres                 OpernameGetCandidates
803125    1.8614  postgres                 StrategyDirtyBufferList
746123    1.7293  vmlinux                  __copy_from_user_ll
651978    1.5111  vmlinux                  __copy_to_user_ll
640511    1.4845  postgres                 XLogInsert
630797    1.4620  vmlinux                  rm_from_queue
607833    1.4088  vmlinux                  next_thread
436682    1.0121  postgres                 LWLockAcquire
419672    0.9727  postgres                 yyparse

In the second test AtEOXact_Buffers is much lower (down around 0.57
percent) but the other suspects are similar.  Since the only difference
in parameters is shared_buffers (36000 vs 9000), it does look like we
are approaching the point where AtEOXact_Buffers is a problem, but so
far it's only a 2% drag.

I suspect the reason recalc_sigpending_tsk is so high is that the
original coding of PG_TRY involved saving and restoring the signal mask,
which led to a whole lot of sigsetmask-type kernel calls.  Is this test
with beta3, or something older?

Another interesting item here is the costs of __copy_from_user_ll/
__copy_to_user_ll:

36000 buffers:
746123    1.7293  vmlinux                  __copy_from_user_ll
651978    1.5111  vmlinux                  __copy_to_user_ll

9000 buffers:
866414    2.0810  vmlinux                  __copy_from_user_ll
852620    2.0479  vmlinux                  __copy_to_user_ll

Presumably the higher costs for 9000 buffers reflect an increased amount
of shuffling of data between kernel and user space.  So 36000 is not
enough to make the working set totally memory-resident, but even if we
drove this cost to zero we'd only be buying a couple percent.

            regards, tom lane
