Re: stats collector process high CPU utilization - Mailing list pgsql-performance

From Benjamin Minshall
Subject Re: stats collector process high CPU utilization
Date
Msg-id 45CB9C4D.30902@intellicon.biz
Whole thread Raw
In response to Re: stats collector process high CPU utilization  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: stats collector process high CPU utilization
List pgsql-performance
Tom Lane wrote:
> Benjamin Minshall <minshall@intellicon.biz> writes:
>> Tom Lane wrote:
>>> Can you gather some info about what it's doing?
>>> strace'ing the stats collector might prove interesting, also if you have
>>> built it with --enable-debug then oprofile results would be helpful.
>
>> I will gather some strace info later today when I have a chance to
>> shutdown the server.
>
> I don't see why you'd need to shut anything down.  Just run
>     strace -p stats-process-ID
> for a few seconds or minutes (enough to gather maybe a few thousand
> lines of output).
>

Seems the problem may be related to a huge global/pgstat.stat file.
Under 8.1.5 it was about 1 MB; now it's 90 MB in 8.2.3.

I ran strace for 60 seconds:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  95.71    1.119004       48652        23           rename
   4.29    0.050128           0    508599           write
   0.00    0.000019           0       249        22 poll
   0.00    0.000000           0        23           open
   0.00    0.000000           0        23           close
   0.00    0.000000           0        34           getppid
   0.00    0.000000           0        23           munmap
   0.00    0.000000           0        23           setitimer
   0.00    0.000000           0        23        22 sigreturn
   0.00    0.000000           0        23           mmap2
   0.00    0.000000           0        23           fstat64
   0.00    0.000000           0       216           recv
------ ----------- ----------- --------- --------- ----------------
100.00    1.169151                509282        44 total

I attached an excerpt of the full strace with the many thousands of
write calls filtered.

--
Benjamin Minshall <minshall@intellicon.biz>
Senior Developer -- Intellicon, Inc.
http://www.intellicon.biz
.
.
.
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0T\274\355"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0/O\227\27\230"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3363) = 3363
close(3)                                = 0
munmap(0xacdfc000, 4096)                = 0
rename("global/pgstat.tmp", "global/pgstat.stat") = 0
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0\0\0\301"..., 1000, 0) = 976
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 500}}, NULL) = 0
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0\0\0\322"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0H\3\0\0:\204\30\0\f\0\0\0\1\0\0\0\0\0\0\0\301\204"..., 1000, 0) = 840
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0\0\0\301"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0\0\000"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\240\0\0\0:\204\30\0\2\0\0\0\0\0\0\0\0\0\0\0\303"..., 1000, 0) = 160
poll([{fd=5, events=POLLIN|POLLERR}], 1, 2000) = -1 EINTR (Interrupted system call)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
getppid()                               = 22447
open("global/pgstat.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
fstat64(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xacdfc000
write(3, "\226\274\245\1D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(3, "Z\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
.
.
.

Attachment

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: stats collector process high CPU utilization
Next
From: Tom Lane
Date:
Subject: Re: stats collector process high CPU utilization