Re: autovacuum locks - Mailing list pgsql-hackers
From: Gregg Jaskiewicz
Subject: Re: autovacuum locks
Msg-id: CAJY59_gizufaLQRUXh8z4SUYzBrDVs5Cn3O51spF-GXcyK1wnA@mail.gmail.com
In response to: Re: autovacuum locks (Alvaro Herrera <alvherre@commandprompt.com>)
Responses: Re: autovacuum locks
List: pgsql-hackers
On 2 March 2012 11:03, Alvaro Herrera <alvherre@commandprompt.com> wrote:
>
> Excerpts from Gregg Jaskiewicz's message of vie mar 02 07:44:07 -0300 2012:
>> Folks,
>>
>> I've got a system here (8.3.7) that is locked up. A few queries are
>> waiting on locks acquired by autovacuum on a table or two.
>> But it looks like autovacuum is also waiting for some semaphore:
>>
>> #0  0x00f07410 in __kernel_vsyscall ()
>> #1  0x00252d2b in semop () from /lib/libc.so.6
>> #2  0x081ca1ce in PGSemaphoreLock ()
>> #3  0x081f590e in ProcWaitForSignal ()
>> #4  0x081e8dea in LockBufferForCleanup ()
>> #5  0x08160e68 in ?? ()
>> #6  0x08161482 in lazy_vacuum_rel ()
>> #7  0x081603b5 in ?? ()
>> #8  0x081605c4 in vacuum ()
>> #9  0x081cd154 in ?? ()
>> #10 0x081cd7f1 in ?? ()
>> #11 0x081cd911 in StartAutoVacWorker ()
>> #12 0x081d6d47 in ?? ()
>> #13 <signal handler called>
>> #14 0x00f07410 in __kernel_vsyscall ()
>> #15 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
>> #16 0x081d3964 in ?? ()
>> #17 0x081d5105 in PostmasterMain ()
>> #18 0x0818be60 in main ()
>>
>> There are still other transactions going on and working fine, but two
>> tables seem to be cut off by autovacuum, and other queries are stuck
>> waiting for autovacuum to finish.
>
> This vacuum is waiting for some other process to release the buffer
> lock.  These locks are usually very short-lived, but maybe you have some
> other backend doing something weird that won't release it.  Do you have
> idle processes with transactions open, and if you have, what are they
> doing?  Do you have prepared transactions open?  (see the
> pg_prepared_xacts view)

That view is empty, but I know that the queries waiting for locks to be
released are in the middle of a two-phase commit transaction.

It could be that the system is doing something but is just so slow that
those locks are starved, and no longer so short-lived :)

Is there any particular info I could gather to make it easier for you
guys to look at?
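For anyone replaying these checks, Alvaro's suggestions map to plain
catalog queries. A minimal sketch, assuming 8.3-era column names
(procpid and current_query; later releases renamed them to pid and query):

    -- Prepared transactions left behind by two-phase commit
    -- (pg_prepared_xacts was empty in this case).
    SELECT gid, prepared, owner, database FROM pg_prepared_xacts;

    -- Backends idling inside an open transaction; 8.3 reports these
    -- with current_query = '<IDLE> in transaction'.
    SELECT procpid, usename, xact_start, current_query
      FROM pg_stat_activity
     WHERE current_query = '<IDLE> in transaction'
     ORDER BY xact_start;

Either an idle-in-transaction backend or a long-lived prepared transaction
can keep a buffer pinned indefinitely, which is exactly the condition
LockBufferForCleanup has to wait out.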
Here's a backtrace from one of the autovacuums (this time with
line-number info):

#0  0x00f07410 in __kernel_vsyscall ()
#1  0x00252d2b in semop () from /lib/libc.so.6
#2  0x081ca1ce in PGSemaphoreLock (sema=0xb792bdf8, interruptOK=1 '\001') at pg_sema.c:420
#3  0x081f590e in ProcWaitForSignal () at proc.c:1269
#4  0x081e8dea in LockBufferForCleanup (buffer=24516) at bufmgr.c:2304
#5  0x08160e68 in lazy_vacuum_heap (onerel=0x87573c0, vacrelstats=0x86a4768) at vacuumlazy.c:655
#6  0x08161482 in lazy_scan_heap (onerel=0x87573c0, vacstmt=0xbfbcfb50, bstrategy=0x86fbe98) at vacuumlazy.c:591
#7  lazy_vacuum_rel (onerel=0x87573c0, vacstmt=0xbfbcfb50, bstrategy=0x86fbe98) at vacuumlazy.c:191
#8  0x081603b5 in vacuum_rel (relid=<value optimized out>, vacstmt=0xbfbcfb50, expected_relkind=114 'r', for_wraparound=0 '\000') at vacuum.c:1126
#9  0x081605c4 in vacuum (vacstmt=0xbfbcfb50, relids=0x86fd310, bstrategy=0x86fbe98, for_wraparound=0 '\000', isTopLevel=1 '\001') at vacuum.c:427
#10 0x081cd154 in do_autovacuum () at autovacuum.c:2638
#11 0x081cd7f1 in AutoVacWorkerMain (argc=<value optimized out>, argv=<value optimized out>) at autovacuum.c:1613
#12 0x081cd911 in StartAutoVacWorker () at autovacuum.c:1416
#13 0x081d6d47 in StartAutovacuumWorker (postgres_signal_arg=10) at postmaster.c:4109
#14 sigusr1_handler (postgres_signal_arg=10) at postmaster.c:3845
#15 <signal handler called>
#16 0x00f07410 in __kernel_vsyscall ()
#17 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
#18 0x081d3964 in ServerLoop () at postmaster.c:1234
#19 0x081d5105 in PostmasterMain (argc=5, argv=0x8641828) at postmaster.c:1029
#20 0x0818be60 in main (argc=5, argv=0x0) at main.c:188

And here's the connection that is waiting for locks to be released (the
locked-up connection). That query has been hanging there since ~18:00 BST:

#0  0x00f07410 in __kernel_vsyscall ()
#1  0x00252d2b in semop () from /lib/libc.so.6
#2  0x081ca1ce in PGSemaphoreLock (sema=0xb7929618, interruptOK=1 '\001') at pg_sema.c:420
#3  0x081f5c53 in ProcSleep (locallock=0x86578f0, lockMethodTable=0x834e0f4) at proc.c:880
#4  0x081f4a5d in WaitOnLock (locallock=0x86578f0, owner=0x88676d8) at lock.c:1147
#5  0x081f50af in LockAcquire (locktag=0xbfbcf2b8, lockmode=7, sessionLock=0 '\000', dontWait=0 '\000') at lock.c:792
#6  0x081f27f9 in LockTuple (relation=0xa72f53a8, tid=0xbfbcf334, lockmode=7) at lmgr.c:378
#7  0x0809b19c in heap_update (relation=0xa72f53a8, otid=0xbfbcf3da, newtup=0x86d3be0, ctid=0xbfbcf3d4, update_xmax=0xbfbcf3e0, cid=2, crosscheck=0x0, wait=1 '\001') at heapam.c:2376
#8  0x08167f8b in ExecutePlan (queryDesc=0x870d010, direction=ForwardScanDirection, count=0) at execMain.c:1880
#9  ExecutorRun (queryDesc=0x870d010, direction=ForwardScanDirection, count=0) at execMain.c:270
#10 0x0818051e in _SPI_execute_plan (plan=0x87b8948, Values=0x8e960f8, Nulls=0x8e960e8 " ", snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:1845
#11 0x08180796 in SPI_execute_plan (plan=0x87b8948, Values=0x8e960f8, Nulls=0x8e960e8 " ", read_only=0 '\000', tcount=0) at spi.c:374
#12 0x0012d488 in exec_stmt_execsql (estate=0xbfbcf7bc, stmt=0x87a2630) at pl_exec.c:2546
#13 0x0012f0cc in exec_stmt (estate=0xbfbcf7bc, stmts=<value optimized out>) at pl_exec.c:1260
#14 exec_stmts (estate=0xbfbcf7bc, stmts=<value optimized out>) at pl_exec.c:1171
#15 0x00130d19 in exec_stmt_block (estate=0xbfbcf7bc, block=0x87a2778) at pl_exec.c:1114
#16 0x001310e7 in plpgsql_exec_trigger (func=0x8844ae0, trigdata=0xbfbcfb78) at pl_exec.c:636
#17 0x00127082 in plpgsql_call_handler (fcinfo=0xbfbcf934) at pl_handler.c:92
#18 0x0814a63a in ExecCallTriggerFunc (trigdata=0xbfbcfb78, tgindx=8, finfo=0x12800b, instr=0x0, per_tuple_context=0x87f5fd8) at trigger.c:1577
#19 0x0814b69b in ExecBRUpdateTriggers (estate=0x885bd28, relinfo=0x885bdb0, tupleid=0xbfbcfc2a, newtuple=0x8899310) at trigger.c:2000
#20 0x0816820e in ExecutePlan (queryDesc=0x8ab81e8, direction=ForwardScanDirection, count=0) at execMain.c:1835
#21 ExecutorRun (queryDesc=0x8ab81e8, direction=ForwardScanDirection, count=0) at execMain.c:270
#22 0x08203054 in ProcessQuery (plan=0x8ab7f18, params=<value optimized out>, dest=0x8ab81a0, completionTag=0xbfbcfcc8 "") at pquery.c:179
#23 0x08203251 in PortalRunMulti (portal=0x86eb0b0, isTopLevel=<value optimized out>, dest=0x8ab81a0, altdest=0x83bf95c, completionTag=0xbfbcfcc8 "") at pquery.c:1242
#24 0x0820345d in FillPortalStore (portal=0x86eb0b0, isTopLevel=1 '\001') at pquery.c:1045
#25 0x08203a31 in PortalRun (portal=0x86eb0b0, count=2147483647, isTopLevel=1 '\001', dest=0x86a16d0, altdest=0x86a16d0, completionTag=0xbfbcffd8 "") at pquery.c:792
#26 0x08200154 in exec_execute_message (argc=4, argv=<value optimized out>, username=0x8644890 "foo") at postgres.c:1942
#27 PostgresMain (argc=4, argv=<value optimized out>, username=0x8644890 "foo") at postgres.c:3688
#28 0x081d4228 in BackendStartup () at postmaster.c:3207
#29 ServerLoop () at postmaster.c:1274
#30 0x081d5105 in PostmasterMain (argc=5, argv=0x8641828) at postmaster.c:1029
#31 0x0818be60 in main (argc=5, argv=0x4489ffff) at main.c:188

--
GJ
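The two traces show two different kinds of wait. The client backend is
queued on a heavyweight tuple lock (LockTuple, reached from heap_update),
which is visible in pg_locks; the autovacuum worker is waiting in
LockBufferForCleanup on a buffer pin, which never appears there. A rough
sketch for mapping the heavyweight waiter to its holder on 8.3, where
pg_blocking_pids() (added in 9.6) is not available:

    -- Pair each ungranted lock request with a granted lock on the same
    -- object. Buffer-pin waits (the autovacuum side) will not show up here.
    SELECT waiter.pid  AS waiting_pid,
           holder.pid  AS holding_pid,
           waiter.locktype,
           waiter.relation::regclass AS relation,
           holder.mode AS held_mode,
           waiter.mode AS wanted_mode
      FROM pg_locks waiter
      JOIN pg_locks holder
        ON holder.granted
       AND NOT waiter.granted
       AND holder.pid <> waiter.pid
       AND holder.locktype = waiter.locktype
       AND holder.database      IS NOT DISTINCT FROM waiter.database
       AND holder.relation      IS NOT DISTINCT FROM waiter.relation
       AND holder.page          IS NOT DISTINCT FROM waiter.page
       AND holder.tuple         IS NOT DISTINCT FROM waiter.tuple
       AND holder.virtualxid    IS NOT DISTINCT FROM waiter.virtualxid
       AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid;

If the holding_pid turns out to be one of the two-phase-commit sessions
mentioned above, that would explain both the tuple-lock queue and the
starved buffer pin.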