Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4) - Mailing list pgsql-bugs

From: Andres Freund
Subject: Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)
Msg-id: 20150708132212.GM10242@alap3.anarazel.de
In response to: Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)  (Andres Freund <andres@anarazel.de>)
List: pgsql-bugs

On 2015-07-08 14:55:12 +0200, Andres Freund wrote:
> comparing stats between the 4 and 8 client runs shows (removing boring data):

> 4 clients:
>    109,655,985,749      stalled-cycles-frontend   #   54.27% frontend cycles idle     (27.79%)
>     41,664,400,828      branches                  #  458.558 M/sec                    (33.32%)
>        374,426,805      branch-misses             #    0.90% of all branches          (33.32%)

> 8 clients:
>    247,231,674,170      stalled-cycles-frontend   #   67.04% frontend cycles idle     (27.84%)
>     54,503,992,461      branches                  #  329.991 M/sec                    (33.39%)
>        761,911,056      branch-misses             #    1.40% of all branches          (33.38%)
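
To make the comparison between the two runs easier to follow, here is a small sketch that recomputes the branch-miss rate from the raw counters quoted above and shows how much each counter grows from 4 to 8 clients. The dictionary layout and the helper names (`miss_rate`, `growth`) are made up for this illustration; only the counter values come from the perf output.

```python
# Counter values copied from the quoted perf stat output.
# The key names are illustrative, not perf's own identifiers.
counters = {
    4: {
        "stalled_cycles_frontend": 109_655_985_749,
        "branches": 41_664_400_828,
        "branch_misses": 374_426_805,
    },
    8: {
        "stalled_cycles_frontend": 247_231_674_170,
        "branches": 54_503_992_461,
        "branch_misses": 761_911_056,
    },
}


def miss_rate(run):
    """Branch misses as a percentage of all branches for one run."""
    return 100.0 * run["branch_misses"] / run["branches"]


def growth(name):
    """Factor by which a counter grows going from 4 to 8 clients."""
    return counters[8][name] / counters[4][name]


for clients, run in counters.items():
    print(f"{clients} clients: {miss_rate(run):.2f}% of branches missed")

for name in ("branches", "branch_misses", "stalled_cycles_frontend"):
    print(f"{name}: x{growth(name):.2f}")
```

The recomputed rates agree with the percentages perf printed. The growth factors show that branch misses and frontend stalls roughly double while the number of executed branches grows by about a third.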

Looking at stalled-cycles-frontend and branch-misses shows:

-e branch-misses:
4:
+    7.34%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    6.33%  postgres         postgres                       [.] standard_ExecutorRun
+    5.49%  postgres         postgres                       [.] SPI_push
+    4.24%  postgres         libc-2.19.so                   [.] __memcpy_sse2_unaligned
+    3.43%  postgres         postgres                       [.] ReleaseCachedPlan
+    2.72%  postgres         plpgsql.so                     [.] exec_stmt
+    2.07%  postgres         postgres                       [.] SPI_pop
+    1.94%  postgres         postgres                       [.] ExecLimit
+    1.94%  postgres         libc-2.19.so                   [.] __strcpy_sse2_unaligned
+    1.59%  postgres         libc-2.19.so                   [.] _int_malloc
+    1.46%  postgres         postgres                       [.] LWLockRelease
+    1.33%  postgres         postgres                       [.] AllocSetAlloc
+    1.17%  postgres         libc-2.19.so                   [.] _int_free
+    1.08%  postgres         libc-2.19.so                   [.] memset
+    1.00%  postgres         postgres                       [.] hash_search_with_hash_value
+    0.99%  postgres         postgres                       [.] GetSnapshotData

8:

+   10.66%  pgbench          libpq.so.5.9                   [.] PQsocket
+    8.40%  pgbench          libpthread-2.19.so             [.] __libc_recv
+    5.03%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    3.00%  postgres         postgres                       [.] AllocSetAlloc
+    2.86%  postgres         postgres                       [.] standard_ExecutorRun
+    2.54%  postgres         plpgsql.so                     [.] exec_stmt
+    2.45%  postgres         libc-2.19.so                   [.] __memcpy_sse2_unaligned
+    2.27%  postgres         postgres                       [.] GetSnapshotData
+    2.22%  postgres         postgres                       [.] ReleaseCachedPlan
+    2.14%  postgres         postgres                       [.] OverrideSearchPathMatchesCurrent
+    2.06%  postgres         postgres                       [.] SPI_push
+    2.02%  postgres         postgres                       [.] SPI_pop
+    1.92%  postgres         libc-2.19.so                   [.] _int_malloc
+    1.86%  postgres         postgres                       [.] ExecLimit
+    1.77%  postgres         postgres                       [.] SPI_connect
+    1.74%  postgres         libc-2.19.so                   [.] __strcpy_sse2_unaligned

It's interesting to see that the limited capacity of the branch
prediction buffers leads to previously perfectly predicted functions
like PQsocket regularly causing branch misses now. It's also
interesting to see how GetSnapshotData() rises in comparison (0.99%
with 4 clients, 2.27% with 8).

OverrideSearchPathMatchesCurrent() is also curious, but perhaps not so
surprising considering it's chasing down a linked list, which is almost
impossible for the branch predictor to get right.
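
For readers who want to see the shape of the control flow in question, here is a minimal sketch of a lockstep comparison of two singly linked lists. It is not PostgreSQL's implementation of OverrideSearchPathMatchesCurrent(); the `Cell` class and the `from_values` and `lists_match` helpers are hypothetical. In Python the interpreter overhead dominates, so this only illustrates the data-dependent branches, not the hardware effect itself.

```python
class Cell:
    """One cell of a singly linked list (hypothetical stand-in for a list node)."""

    def __init__(self, value, next_cell=None):
        self.value = value
        self.next = next_cell


def from_values(values):
    """Build a linked list from a Python sequence and return its head."""
    head = None
    for value in reversed(values):
        head = Cell(value, head)
    return head


def lists_match(a, b):
    """Walk both lists in lockstep and report whether they are identical.

    Every iteration has to load the next cell before it can decide whether
    to continue, and both the loop exit and the mismatch check depend on
    the data just loaded.
    """
    while a is not None and b is not None:
        if a.value != b.value:
            return False
        a, b = a.next, b.next
    # Both lists must end at the same time to count as a match.
    return a is None and b is None


# Example values are arbitrary integers standing in for schema identifiers.
print(lists_match(from_values([11, 2200]), from_values([11, 2200])))
print(lists_match(from_values([11, 2200]), from_values([11])))
```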


-e stalled-cycles-frontend:
4:
+   21.50%  swapper          [kernel.vmlinux]               [k] intel_idle
+    4.21%  pgbench          libpthread-2.19.so             [.] __libc_recv
+    2.68%  postgres         postgres                       [.] LWLockAcquire
+    2.35%  pgbench          [kernel.vmlinux]               [k] fput
+    2.34%  pgbench          [kernel.vmlinux]               [k] unix_stream_recvmsg
+    2.03%  pgbench          [kernel.vmlinux]               [k] system_call
+    2.02%  pgbench          pgbench                        [.] threadRun
+    1.82%  pgbench          [kernel.vmlinux]               [k] system_call_after_swapgs
+    1.73%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    1.56%  postgres         postgres                       [.] LWLockRelease
+    1.33%  pgbench          [kernel.vmlinux]               [k] sys_recvfrom
+    1.24%  postgres         libc-2.19.so                   [.] _int_malloc
+    1.16%  pgbench          [vdso]                         [.] 0x00000000000008c9
+    1.06%  pgbench          [kernel.vmlinux]               [k] __fget
+    1.04%  postgres         libc-2.19.so                   [.] _int_free
+    1.02%  pgbench          libpthread-2.19.so             [.] __pthread_enable_asynccancel

8:

+    8.41%  swapper          [kernel.vmlinux]               [k] intel_idle
+    4.35%  pgbench          pgbench                        [.] threadRun
+    3.56%  pgbench          libpthread-2.19.so             [.] __libc_recv
+    2.58%  pgbench          [kernel.vmlinux]               [k] unix_stream_recvmsg
+    1.99%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    1.98%  postgres         postgres                       [.] AllocSetAlloc
+    1.94%  pgbench          [kernel.vmlinux]               [k] sys_recvfrom
+    1.72%  postgres         postgres                       [.] LWLockAcquire
+    1.66%  pgbench          pgbench                        [.] doCustom
+    1.59%  pgbench          [kernel.vmlinux]               [k] system_call_after_swapgs
+    1.58%  pgbench          [kernel.vmlinux]               [k] system_call
+    1.51%  pgbench          [vdso]                         [.] __vdso_gettimeofday
+    1.50%  pgbench          [kernel.vmlinux]               [k] __fget
+    1.21%  postgres         libc-2.19.so                   [.] memset
+    1.11%  postgres         libc-2.19.so                   [.] _int_malloc
+    1.08%  postgres         postgres                       [.] GetSnapshotData

intel_idle is what a CPU executes when it is idle, so it's not
surprising that it shows up prominently, especially when not all cores
are busy. It's interesting to see how the locking functions are less
prominent in the -c8 case, and how the overhead of allocation and
plpgsql_exec_function rises.
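
A quick way to line up two such profiles is to parse the report rows and print the per-symbol shares side by side. The sketch below does that for a few rows copied from the stalled-cycles-frontend listings above; the regular expression and the helper names (`parse_report`, `compare`) are assumptions made for this example and only cover the row format shown in this mail.

```python
import re

# One row as quoted in this mail:
# "+  <pct>%  <command>  <shared object>  [<k or .>] <symbol>"
ROW = re.compile(r"^\+\s+([\d.]+)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(\S+)$")


def parse_report(text):
    """Map symbol name -> overhead percentage for every row that parses."""
    shares = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            shares[match.group(5)] = float(match.group(1))
    return shares


# A few rows copied from the 4-client and 8-client listings.
FOUR = """
+    2.68%  postgres         postgres                       [.] LWLockAcquire
+    1.73%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    1.56%  postgres         postgres                       [.] LWLockRelease
+    1.24%  postgres         libc-2.19.so                   [.] _int_malloc
"""
EIGHT = """
+    1.99%  postgres         plpgsql.so                     [.] plpgsql_exec_function
+    1.98%  postgres         postgres                       [.] AllocSetAlloc
+    1.72%  postgres         postgres                       [.] LWLockAcquire
+    1.11%  postgres         libc-2.19.so                   [.] _int_malloc
"""


def compare(before, after):
    """Return {symbol: (before, after)}; None means 'not in this excerpt'."""
    symbols = sorted(set(before) | set(after))
    return {sym: (before.get(sym), after.get(sym)) for sym in symbols}


four, eight = parse_report(FOUR), parse_report(EIGHT)
for sym, (b, a) in compare(four, eight).items():
    print(f"{sym:24s} {b} -> {a}")
```

A symbol that is missing from one excerpt did not necessarily drop to zero; it may just be below the cutoff of the listing. The shares are also relative to all samples of the respective run, so a shift can partly reflect changes elsewhere in the profile, such as the smaller intel_idle share with 8 clients.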
