On Fri, Mar 11, 2011 at 11:13:43AM -0500, Noah Misch wrote:
> gdb -ex=bt /path/to/bin/postgres $pid </dev/null
hi
So, let me recap where things stand.
I wrote a script that, every 15 seconds, checks the system for Pg backends in
"PARSE" state. If there are more than 100 of them, the script randomly chooses
10 of them and runs "gdb -batch -quiet -ex=bt /usr/bin/postgres PID" on
them.
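
A minimal sketch of such a sampling loop follows. It is not the actual
script: it assumes the backend state is visible as a process title ending in
"PARSE" in "ps -eo pid,args" output, and the names parsing_pids,
choose_targets, backtrace and run_forever are hypothetical. Only the
15-second interval, the thresholds and the gdb command line come from the
description above.

```python
import random
import subprocess
import time

THRESHOLD = 100    # sample only when more than this many backends are parsing
SAMPLE_SIZE = 10   # number of backends to trace per round
INTERVAL = 15      # seconds between checks
POSTGRES_BIN = "/usr/bin/postgres"


def parsing_pids(ps_output):
    """Return PIDs of postgres backends whose process title ends in PARSE.

    Assumption: input is the text produced by `ps -eo pid,args`.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # header line or something unexpected
        pid, title = fields
        if title.startswith("postgres:") and title.rstrip().endswith("PARSE"):
            pids.append(int(pid))
    return pids


def choose_targets(pids, rng=random):
    """Pick SAMPLE_SIZE random PIDs, but only when above THRESHOLD."""
    if len(pids) <= THRESHOLD:
        return []
    return rng.sample(pids, SAMPLE_SIZE)


def backtrace(pid):
    """Run the gdb command quoted above and return its output."""
    result = subprocess.run(
        ["gdb", "-batch", "-quiet", "-ex=bt", POSTGRES_BIN, str(pid)],
        stdin=subprocess.DEVNULL, capture_output=True, text=True,
    )
    return result.stdout


def run_forever():
    """Check every INTERVAL seconds and log backtraces during spikes."""
    while True:
        ps = subprocess.run(["ps", "-eo", "pid,args"],
                            capture_output=True, text=True).stdout
        for pid in choose_targets(parsing_pids(ps)):
            print(backtrace(pid))
        time.sleep(INTERVAL)
```

Calling run_forever() starts the loop; attaching gdb needs enough privileges
to ptrace the backends, and each attach briefly stops the traced process.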
Over the weekend I got 2125 such stack traces logged, but only 60 of them
were taken during the huge, unexpected spikes with over 400 parsing backends
(this db server is quite busy).
These 60 were summarized, and output is available here:
http://www.depesz.com/various/locks.summary.txt
As you can see, in 48 cases the backend process was in semop(), which relates
directly to my previous findings with ps/wchan.
summary format is:
11 0x00000031884d4665 in recv () from /lib64/libc.so.6
#0 0x00000031884d4665 in recv () from /lib64/libc.so.6
#1 0x00000000005366d7 in secure_read ()
#2 0x000000000053d204 in ?? ()
#3 0x000000000053d607 in pq_getbyte ()
#4 0x00000000005afa6d in PostgresMain ()
#5 0x00000000005857b4 in ?? ()
#6 0x000000000058643a in PostmasterMain ()
#7 0x000000000053efde in main ()
means that there were 11 stack traces exactly like the one shown next to number "11".
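
A summary in this format can be produced by grouping identical traces and
counting them. The sketch below is only an illustration, not the code that
generated the file; the function name summarize is hypothetical, and it
assumes each trace is the raw text of one gdb run, with frame lines starting
with "#".

```python
from collections import Counter


def summarize(traces):
    """Group identical backtraces and prefix each group with its count."""
    counts = Counter()
    for text in traces:
        # Keep only frame lines ("#0 ...", "#1 ...") and normalize whitespace,
        # so that gdb chatter around the trace does not split groups.
        frames = [" ".join(line.split()) for line in text.splitlines()
                  if line.strip().startswith("#")]
        if frames:
            counts["\n".join(frames)] += 1

    blocks = []
    for trace, n in counts.most_common():
        # Header line: count, then frame #0 without the "#0" marker.
        top = trace.splitlines()[0].split(None, 1)[1]
        blocks.append(f"{n} {top}\n{trace}")
    return "\n\n".join(blocks)
```

Traces only match when every frame, including addresses, is identical, which
holds here because all backends run the same binary.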
Any ideas based on the stack traces in the file? (The file itself is 20kB, so I
didn't want to put it in the email.)
Best regards,
depesz
--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/