Thread: pl/pgsql function spikes CPU 100%
I've got a client that has a function in a db which had been humming along
quite nicely on 2xOpteron 275, PG 8.1.5, 8GB of RAM. Now suddenly many of the
functions in the DB if called will spike the CPU to 100%. These are functions
that used to finish in 7ms, now run for 20-40 mins. Interestingly, when you
strace the backend, it doesn't appear to be doing too much...here's some
sample output:

select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(3997755, 0x7fbfffd170, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0

Any chance we've stumbled into some corner case bug? We actually moved the DB
to a different server thinking perhaps we had gotten to the limit of slow
hardware, but in fact it happens on the other server as well.

I don't see any ungranted locks in pg_locks, nor are there any other non idle
queries this time of the night.

I'll see if I can share the function code tomorrow when people are awake again
in case we have something especially silly in there.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
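For reference, the lock and activity checks described above come down to a couple of catalog queries. A minimal sketch against the 8.1-era catalogs follows; column names such as procpid and current_query changed in later releases, and current_query is only populated if stats_command_string is enabled.

-- Any lock requests that have not been granted?
SELECT locktype, relation::regclass AS relation, transactionid, pid, mode
  FROM pg_locks
 WHERE NOT granted;

-- Any backends doing something besides sitting idle?
SELECT procpid, usename, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query NOT LIKE '<IDLE>%'
 ORDER BY query_start;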
Are the stat collector and autovacuum processes in good shape?
--
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
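A quick smoke test for the stats collector, assuming the standard per-table statistics views (8.1 column names), is to watch whether the counters move at all while the application is writing:

-- If the stats collector is healthy, these counters should keep climbing
-- as the application inserts and updates rows.
SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd, n_tup_del
  FROM pg_stat_user_tables
 ORDER BY n_tup_upd DESC
 LIMIT 10;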
On 3/16/07, Jeff Frost <jeff@frostconsultingllc.com> wrote:
I've got a client that has a function in a db which had been humming along
quite nicely on 2xOpteron 275, PG 8.1.5, 8GB of RAM. Now suddenly many of the
functions in the DB if called will spike the CPU to 100%. These are functions
that used to finish in 7ms, now run for 20-40 mins. Interestingly, when you
strace the backend, it doesn't appear to be doing too much...here's some
sample output:
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(3997755, 0x7fbfffd170, 1) = 0
semop(3932217, 0x7fbfffd150, 1) = 0
Any chance we've stumbled into some corner case bug? We actually moved the DB
to a different server thinking perhaps we had gotten to the limit of slow
hardware, but in fact it happens on the other server as well.
I don't see any ungranted locks in pg_locks, nor are there any other non idle
queries this time of the night.
I'll see if I can share the function code tomorrow when people are awake again
in case we have something especially silly in there.
--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
Everything else except the one postmaster process hums along just fine, i.e.
nothing else appears to take much in the way of system resources at all.
Autovac is set with the 8.2.x default settings. Oh, and the data was ANALYZE'd
after it got moved to the new server.

Here's the code in case we have something especially silly going on:

CREATE OR REPLACE FUNCTION populate_page_view_indices() RETURNS integer AS
$BODY$
DECLARE
    v_page_view_row RECORD;
    v_cindex INTEGER;
    v_tindex INTEGER;
    v_visit_id BIGINT;
    v_get BOOLEAN;
    v_row_count INTEGER;
BEGIN
    RAISE NOTICE 'Populating page_view indices ...';

    UPDATE visit SET status = 'H'
    FROM (SELECT visit_id
            FROM page_view p, visit v
           WHERE p.visit_id = v.id AND v.status = 'N'
           GROUP BY visit_id
          HAVING max(p.stamp) < now() - INTERVAL '1 hour') AS ready
    WHERE visit.id = ready.visit_id AND visit.status = 'N';

    v_cindex := -1;
    v_tindex := -1;
    v_visit_id := -1;

    FOR v_page_view_row IN
        SELECT p.*
          FROM page_view p, visit v
         WHERE p.visit_id = v.id AND v.status = 'H'
         ORDER BY visit_id, p.stamp
    LOOP
        v_get := (v_page_view_row.method = 'GET');

        IF (v_visit_id != v_page_view_row.visit_id) THEN
            v_visit_id := v_page_view_row.visit_id;
            v_tindex := 1;
            v_cindex := 1;
        ELSE
            v_tindex := v_tindex + 1;
            IF v_get THEN
                v_cindex := v_cindex + 1;
            END IF;
        END IF;

        UPDATE page_view
           SET tindex = v_tindex,
               cindex = CASE WHEN v_get THEN v_cindex ELSE -1 END
         WHERE id = v_page_view_row.id;
    END LOOP;

    UPDATE visit SET status = 'I' WHERE status = 'H';
    GET DIAGNOSTICS v_row_count = ROW_COUNT;

    RAISE NOTICE 'Done populating page_view indices ...';
    RETURN v_row_count;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

ALTER FUNCTION populate_page_view_indices() OWNER TO postgres;

And the other that seems to tickle the problem:

CREATE OR REPLACE FUNCTION populate_page_view_clickstreams() RETURNS integer AS
$BODY$
DECLARE
    v_row_count INTEGER;
BEGIN
    RAISE NOTICE 'Populating page_view clickstreams ...';

    UPDATE visit SET status = 'K' WHERE status = 'I';

    INSERT INTO tlink (id, from_id, to_id)
    SELECT nextval('hibernate_sequence'), f.id AS from_id, t.id AS to_id
      FROM page_view f, page_view t, visit v
     WHERE f.visit_id = t.visit_id
       AND f.visit_id = v.id
       AND v.status = 'K'
       AND f.tindex = t.tindex - 1
     ORDER BY f.visit_id, f.tindex;

    INSERT INTO clink (id, from_id, to_id)
    SELECT nextval('hibernate_sequence'), f.id AS from_id, t.id AS to_id
      FROM page_view f, page_view t, visit v
     WHERE f.visit_id = t.visit_id
       AND f.visit_id = v.id
       AND v.status = 'K'
       AND f.cindex = t.cindex - 1
     ORDER BY f.visit_id, f.cindex;

    UPDATE visit SET status = 'L' WHERE status = 'K';
    GET DIAGNOSTICS v_row_count = ROW_COUNT;

    RAISE NOTICE 'Done populating page_view clickstreams ...';
    RETURN v_row_count;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

ALTER FUNCTION populate_page_view_clickstreams() OWNER TO postgres;

On Fri, 16 Mar 2007, Shoaib Mir wrote:

> Are the stat collector and autovacuum processes in good shape?
>
> --
> Shoaib Mir
> EnterpriseDB (www.enterprisedb.com)

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
Jeff Frost <jeff@frostconsultingllc.com> writes:
> ... Interestingly, when you
> strace the backend, it doesn't appear to be doing too much...here's some
> sample output:

> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> semop(3932217, 0x7fbfffd150, 1) = 0
> semop(3932217, 0x7fbfffd150, 1) = 0
> semop(3932217, 0x7fbfffd150, 1) = 0
> semop(3932217, 0x7fbfffd150, 1) = 0
> semop(3932217, 0x7fbfffd150, 1) = 0
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> semop(3997755, 0x7fbfffd170, 1) = 0
> semop(3932217, 0x7fbfffd150, 1) = 0

This looks suspiciously like the sort of trace we saw in the various
"context swap storm" threads. The test cases for those generally
involved really tight indexscan loops, ie, the backends were spending
all their time trying to access shared buffers. If you haven't changed
the function or the data, then I concur with the nearby worry about
autovacuuming (large buildup of dead tuples could result in this
symptom). Or maybe you've got an old open transaction that is blocking
cleanup?

                        regards, tom lane
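A rough check for the old-open-transaction theory is sketched below; it assumes stats_command_string is enabled and 8.1 column names, and it only catches sessions sitting idle inside a transaction, not every backend holding an old snapshot.

-- Sessions with a transaction open but nothing running; a long-lived
-- entry here keeps VACUUM from reclaiming dead tuples.
SELECT procpid, usename, backend_start, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query LIKE '<IDLE> in transaction%'
 ORDER BY query_start;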
On Fri, 16 Mar 2007, Tom Lane wrote:

> This looks suspiciously like the sort of trace we saw in the various
> "context swap storm" threads. The test cases for those generally
> involved really tight indexscan loops, ie, the backends were spending
> all their time trying to access shared buffers. If you haven't changed
> the function or the data, then I concur with the nearby worry about
> autovacuuming (large buildup of dead tuples could result in this
> symptom). Or maybe you've got an old open transaction that is blocking
> cleanup?

Tom,

I doubt it's a problem with autovacuum as the data in this server was just
loaded an hour before the strace above was taken, so there should have been
almost no dead tuples, especially since these tables are nearly write once.
I.e. they get written to once, then the populate function updates them, then
months later they get archived off.

There did not appear to be high context switch activity nor any IO wait to
mention during the time I was watching the postmaster. If it's worth
mentioning, it's running CentOS 4.4 with the kernel-2.6.9-34.EL kernel.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
On Fri, 16 Mar 2007, Jeff Frost wrote:

> I doubt it's a problem with autovacuum as the data in this server was just
> loaded an hour before the strace above was taken, so there should have been
> almost no dead tuples, especially since these tables are nearly write once.
> I.e. they get written to once, then the populate function updates them, then
> months later they get archived off.
>
> There did not appear to be high context switch activity nor any IO wait to
> mention during the time I was watching the postmaster. If it's worth
> mentioning, it's running CentOS 4.4 with the kernel-2.6.9-34.EL kernel.

Oh, and I meant to mention that this query was the only thing in
pg_stat_activity during the painful time it was running, and there were no
ungranted locks in pg_locks.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
On Fri, 16 Mar 2007, Jeff Frost wrote:

> Oh, and I meant to mention that this query was the only thing in
> pg_stat_activity during the painful time it was running, and there were no
> ungranted locks in pg_locks.

Well, finally got this system upgraded to 8.1.8, but unfortunately, that did
not resolve this. Is there any reasonable way to see where this function is
spending its time? The on-disk size of the tables in question is about 500MB
each, including the indexes. The machine in question has 8GB of RAM and the
total DB working set size is about 7GB.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
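Short of a profiler, one low-tech option is to bracket the individual statements inside the function with RAISE NOTICE and timeofday(). A minimal sketch follows; the function name and the single UPDATE shown are placeholders, not the real code.

CREATE OR REPLACE FUNCTION populate_page_view_indices_timed() RETURNS integer AS
$BODY$
DECLARE
    v_row_count INTEGER;
BEGIN
    -- timeofday() returns the current wall-clock time as text on 8.1
    RAISE NOTICE 'initial UPDATE started at  %', timeofday();

    UPDATE visit SET status = 'H' WHERE status = 'N';  -- placeholder statement
    GET DIAGNOSTICS v_row_count = ROW_COUNT;

    RAISE NOTICE 'initial UPDATE finished at %', timeofday();
    RETURN v_row_count;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;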
Jeff Frost <jeff@frostconsultingllc.com> writes:
> Well, finally got this system upgraded to 8.1.8, but unfortunately, that did
> not resolve this. Is there any reasonable way to see where this function is
> spending its time?

I've grown enamored of oprofile lately --- if you're on a recent Linux
system, try that. Solaris' DTrace is said to have equivalent
capability. On other systems you could consider plain ol' gprof.

Any of these will require a certain amount of man-page-reading :-(

                        regards, tom lane
On Tue, 24 Apr 2007, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> Well, finally got this system upgraded to 8.1.8, but unfortunately, that did
>> not resolve this. Is there any reasonable way to see where this function is
>> spending its time?
>
> I've grown enamored of oprofile lately --- if you're on a recent Linux
> system, try that. Solaris' DTrace is said to have equivalent
> capability. On other systems you could consider plain ol' gprof.
>
> Any of these will require a certain amount of man-page-reading :-(

Got my oprofile man page reading done. Here's the general opreport:

CPU: P4 / Xeon with 2 hyper-threads, speed 3196.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory) count 100000

GLOBAL_POWER_E...|
  samples|      %|
------------------
 26902279 96.3837 postgres
   612276  2.1936 vmlinux
   172733  0.6189 oprofiled
    94914  0.3401 libc-2.3.4.so
    54608  0.1956 oprofile
    22887  0.0820 opreport
    10811  0.0387 jbd
     7792  0.0279 ext3
     7673  0.0275 ld-2.3.4.so
     6143  0.0220 libstdc++.so.6.0.3
     4336  0.0155 e1000
     3253  0.0117 ip_conntrack
     2305  0.0083 ip_tables
     1957  0.0070 libcrypto.so.0.9.7a
     1275  0.0046 hald
     1066  0.0038 uhci_hcd
      859  0.0031 iptable_nat

And the postgres one:

CPU: P4 / Xeon with 2 hyper-threads, speed 3196.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory) count 100000

samples   %        symbol name
3894022   12.6488  LWLockAcquire
3535497   11.4842  slot_deform_tuple
3030280    9.8431  LWLockRelease
2279699    7.4050  HeapTupleSatisfiesSnapshot
1782097    5.7887  ExecMakeFunctionResultNoSets
1209262    3.9280  ExecEvalScalarVar
1128461    3.6655  heap_release_fetch
 916447    2.9768  LockBuffer
 898437    2.9183  _bt_checkkeys
 887904    2.8841  btgettuple
 728377    2.3660  slot_getattr
 648121    2.1053  hash_search
 608352    1.9761  _bt_restscan
 574699    1.8668  index_getnext
 570999    1.8547  _bt_step
 517225    1.6801  FunctionCall2
 501845    1.6301  _bt_next
 452142    1.4687  IndexNext
 426455    1.3852  PinBuffer

Is any of that helpful? This is the 2.6.9-34.ELsmp CentOS kernel.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
Jeff Frost <jeff@frostconsultingllc.com> writes:
> Got my oprofile man page reading done. Here's the general opreport:
> ...
> And the postgres one:

> samples   %        symbol name
> 3894022   12.6488  LWLockAcquire
> 3535497   11.4842  slot_deform_tuple
> 3030280    9.8431  LWLockRelease
> 2279699    7.4050  HeapTupleSatisfiesSnapshot
> 1782097    5.7887  ExecMakeFunctionResultNoSets
> 1209262    3.9280  ExecEvalScalarVar
> 1128461    3.6655  heap_release_fetch
>  916447    2.9768  LockBuffer
>  898437    2.9183  _bt_checkkeys
>  887904    2.8841  btgettuple
>  728377    2.3660  slot_getattr
>  648121    2.1053  hash_search
>  608352    1.9761  _bt_restscan
>  574699    1.8668  index_getnext
>  570999    1.8547  _bt_step
>  517225    1.6801  FunctionCall2
>  501845    1.6301  _bt_next
>  452142    1.4687  IndexNext
>  426455    1.3852  PinBuffer

It seems to basically be spending its time scanning tuples and applying
some function or other (probably a WHERE clause). The btree references
suggest that the scan is an indexscan and not a brute-force seqscan,
but still I wonder if the problem isn't that the planner has switched
to some plan much less efficient than the one it was using before.
oprofile is too low-level to give us much info about that,
unfortunately.

Probably your next step should be to investigate the plans being used
for the queries in the slow functions. See the archives for hints, eg
http://archives.postgresql.org/pgsql-performance/2006-06/msg00301.php

                        regards, tom lane
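Since the statements inside these functions use only literal constants, their plans can be inspected with plain EXPLAIN ANALYZE. A sketch follows; the id value is hypothetical, EXPLAIN ANALYZE really executes the UPDATE (so wrap it in BEGIN/ROLLBACK), and rows with status = 'H' only exist while the function is mid-run, so the row counts may not match what the function sees.

-- Plan for the FOR loop's driving query:
EXPLAIN ANALYZE
SELECT p.*
  FROM page_view p, visit v
 WHERE p.visit_id = v.id AND v.status = 'H'
 ORDER BY visit_id, p.stamp;

-- Plan for the per-row UPDATE issued inside the loop
-- (the id literal is a placeholder; roll back to discard the change):
BEGIN;
EXPLAIN ANALYZE
UPDATE page_view
   SET tindex = 1, cindex = 1
 WHERE id = 12345;
ROLLBACK;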