On Wed, Aug 17, 2016 at 4:28 PM, Andres Freund <andres@anarazel.de> wrote:
> Could you also provide a strace -ttt -T -c and a cpu cycles flamegraph?
Here is the output from that strace invocation, plus a -p (to attach
to the relevant backend):
strace: -t has no effect with -c
strace: -T has no effect with -c
strace: Process 27986 attached
strace: Process 27986 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 55.75    0.629331       17981        35        16 unlink
 17.49    0.197422           0   2505449           write
 11.69    0.132000       11000        12           fsync
  8.13    0.091799           0   2078837           read
  5.32    0.060000       12000         5           ftruncate
  0.98    0.011011          24       460           brk
  0.64    0.007218        1805         4           munmap
  0.00    0.000050           0      6382           lseek
  0.00    0.000000           0        58         5 open
  0.00    0.000000           0        58           close
  0.00    0.000000           0        14           stat
  0.00    0.000000           0         4           mmap
  0.00    0.000000           0         2           rt_sigprocmask
  0.00    0.000000           0        12         6 rt_sigreturn
  0.00    0.000000           0         1           select
  0.00    0.000000           0        16           sendto
  0.00    0.000000           0         2         1 recvfrom
  0.00    0.000000           0        16           kill
  0.00    0.000000           0        19           semop
  0.00    0.000000           0        63           getrusage
  0.00    0.000000           0         5           epoll_create
  0.00    0.000000           0         9         4 epoll_wait
  0.00    0.000000           0        10           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    1.128831               4591473        32 total
This doesn't seem that interesting to me, but I'm not sure what you're
looking for.
I've also attached a cycles flamegraph.
trace_sort indicated that the tuplesort CLUSTER takes just under 3
minutes (this includes writing out the new heap, of course).
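To put the strace summary in perspective, here is a rough sketch (Python,
not part of the original measurement) that recomputes the per-call cost
and compares total syscall time against the sort's runtime. The 180
second figure is an assumption standing in for "just under 3 minutes",
and strace -c reports system time rather than wall-clock time by
default, so the comparison is only approximate.

```python
# Figures copied from the strace -c summary above: (seconds, calls).
syscalls = {
    "unlink": (0.629331, 35),
    "write": (0.197422, 2505449),
    "fsync": (0.132000, 12),
    "read": (0.091799, 2078837),
    "ftruncate": (0.060000, 5),
}

TOTAL_SYSCALL_SECONDS = 1.128831   # "total" row of the summary
SORT_SECONDS_UPPER_BOUND = 180.0   # assumption: "just under 3 minutes"


def usecs_per_call(seconds, calls):
    """Average microseconds per call, without strace's integer rounding."""
    return seconds * 1e6 / calls


def share_of_runtime(syscall_seconds, runtime_seconds):
    """Fraction of the runtime accounted for by the counted syscall time."""
    return syscall_seconds / runtime_seconds


if __name__ == "__main__":
    for name, (seconds, calls) in syscalls.items():
        print(f"{name:10s} {usecs_per_call(seconds, calls):12.3f} usecs/call")
    share = share_of_runtime(TOTAL_SYSCALL_SECONDS, SORT_SECONDS_UPPER_BOUND)
    print(f"counted syscall time / assumed runtime: {share:.2%}")
```

Under those assumptions, the read and write calls each average well
under a microsecond (which is why strace prints 0 for them), and the
counted syscall time is a small fraction of the overall sort.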
--
Peter Geoghegan