Hi,
On 2014-09-17 16:02:16 +0000, alexk@hintbits.com wrote:
> We've got a problem with autovacuum process running for already 5 days,
> without being blocked by any particular process.
> The table in question is around 500MB. Usually we don't see vacuums running
> there for more than couple of hours.
> Other autovacuum processes (on different tables) start and finish without
> issues.
>
> The vacuum and autovacuum settings are left at default values. We don't have
> debug information compiled on this server, but here's what we have after the
> gdb bt output, attaching to the autovacuum process.
> #0 0x00007f423cbf34f7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007f423e8581f8 in PGSemaphoreLock ()
> #2 0x00007f423e889a45 in LockBufferForCleanup ()
> #3 0x00007f423e6d9cd6 in ?? ()
> #4 0x00007f423e6da08f in ?? ()
> #5 0x00007f423e6daaf6 in btbulkdelete ()
What is happening here is that vacuum (of a btree index) is waiting for a so called
'cleanup' lock on a page. That is it waits for the page not being pinned
by any other backend.
> semop(2523174, {{12, -1, 0}}, 1
> semop(2523174, {{12, -1, 0}}, 1
>
>
> ) = 0
> lseek(21, 0, SEEK_END) = 32276480
> select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
> semop(2523174, {{12, -1, 0}}, 1
>
> Most of the time it is stuck with semop.
That indicates it's waiting most of the time. But it does make progress.
> There are also UPDATE statements constantly running on this table, in the
> overlapping manner, so at a single moment there is at least one update
> running on it. We are investigating why is it done this way, but can it be a
> reason behind this strange vacuum behavior?
Which quite possibly is caused by this.
I've recently commented on -hackers that this is a very hard to debug
behaviour and should be made more visible, but IIRC we didn't come to a
conclusion...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services