BUG #11444: autovacuum stuck for 5 days and waits on a lock - Mailing list pgsql-bugs
From | alexk@hintbits.com |
---|---|
Subject | BUG #11444: autovacuum stuck for 5 days and waits on a lock |
Date | |
Msg-id | 20140917160216.2486.59930@wrigleys.postgresql.org Whole thread Raw |
Responses |
Re: BUG #11444: autovacuum stuck for 5 days and waits on a
lock
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 11444 Logged by: Alexey Klyukin Email address: alexk@hintbits.com PostgreSQL version: 9.3.5 Operating system: Ubuntu 12.04 3.2.0-52-generic #78-Ubuntu SMP x64 Description: Greetings, We've got a problem with autovacuum process running for already 5 days, without being blocked by any particular process. The table in question is around 500MB. Usually we don't see vacuums running there for more than couple of hours. Other autovacuum processes (on different tables) start and finish without issues. The vacuum and autovacuum settings are left at default values. We don't have debug information compiled on this server, but here's what we have after the gdb bt output, attaching to the autovacuum process. #0 0x00007f423cbf34f7 in semop () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007f423e8581f8 in PGSemaphoreLock () #2 0x00007f423e889a45 in LockBufferForCleanup () #3 0x00007f423e6d9cd6 in ?? () #4 0x00007f423e6da08f in ?? () #5 0x00007f423e6daaf6 in btbulkdelete () #6 0x00007f423e982c7c in FunctionCall4Coll () #7 0x00007f423e6d4a93 in index_bulk_delete () #8 0x00007f423e7c8322 in ?? () #9 0x00007f423e7c937d in lazy_vacuum_rel () #10 0x00007f423e7c6a26 in ?? () #11 0x00007f423e7c78cc in vacuum () #12 0x00007f423e85ac02 in ?? () #13 0x00007f423e85b107 in ?? () #14 0x00007f423e85b1c2 in StartAutoVacWorker () #15 0x00007f423e8683f8 in ?? () #16 <signal handler called> #17 0x00007f423cbea743 in select () from /lib/x86_64-linux-gnu/libc.so.6 #18 0x00007f423e867626 in PostmasterMain () #19 0x00007f423e6a3205 in main () (gdb) The strace -p $pid produced the following output at one time: semop(2523174, {{12, -1, 0}}, 1 semop(2523174, {{12, -1, 0}}, 1 ) = 0 lseek(21, 0, SEEK_END) = 32276480 select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout) semop(2523174, {{12, -1, 0}}, 1 Most of the time it is stuck with semop. The file in the lseek argument is an index on the table autovacuum is trying to vacuum. I've tried running pgbreecheck on this index, but it did not show any issues with it: NOTICE: fast root: block 412 at level 2 NOTICE: setting page parents NOTICE: done setting parents NOTICE: Checking forward scan of level 0, starting at block 1 NOTICE: done checking level 0 NOTICE: checking visited nodes on level 0, starting at block 1 NOTICE: done checking visited nodes on level 0 NOTICE: Checking forward scan of level 1, starting at block 3 NOTICE: done checking level 1 NOTICE: checking visited nodes on level 1, starting at block 3 NOTICE: done checking visited nodes on level 1 NOTICE: Checking forward scan of level 2, starting at block 412 NOTICE: done checking level 2 NOTICE: checking visited nodes on level 2, starting at block 412 NOTICE: done checking visited nodes on level 2 started checking at root 412 There are also UPDATE statements constantly running on this table, in the overlapping manner, so at a single moment there is at least one update running on it. We are investigating why is it done this way, but can it be a reason behind this strange vacuum behavior?
pgsql-bugs by date: