Hung Vacuum in 8.3 - Mailing list pgsql-bugs

From Mark Kirkwood
Subject Hung Vacuum in 8.3
Date
Msg-id 4D6330E9.6030809@catalyst.net.nz
Whole thread Raw
Responses Re: Hung Vacuum in 8.3
List pgsql-bugs
This is 8.3.14 on Debian Lenny x86-64.

I'm seeing a hung vacuum:

postgres=# select procpid, query_start,waiting, current_query from
pg_stat_activity where current_query like '%VACUUM%';
procpid |          query_start          | waiting
|
current_query

--------+-------------------------------+---------+---------------------------------------------------------------------------------------------
7347    | 2011-02-22 06:02:02.400247+13 | f       | VACUUM ANALYZE;

at approx 1300 (so it has been sitting there for approx 7 hours, normal
database vacuum time is 10 minutes). Now according to pg_stat_activity
and pg_locks it is *not* waiting for a lock, but no vacuuming appears to
be going on. strace says:

$ strace -p 7347

Process 7347 attached - interrupt to quit
semop(33456157, 0x7ffff512bad0, 1

Ok, so we are waiting on a semaphore - hmm, why is it not showing up as
waiting on a lock of some kind?

Of interest is this:

postgres=# select procpid, query_start, current_query from
pg_stat_activity order by query_start limit 1;
  procpid |          query_start
|                                                   current_query

---------+-------------------------------+-------------------------------------------------------------------------------------
    25953 | 2011-02-22 04:24:07.417138+13 | SELECT n.node_id, n.node_ --
long query, joining several large tables - text snipped

So this guy had been running from before the vacuum started, so probably
vacuum is wanting to do lazy_truncate_heap() on one of the tables in the
join (no gdb on this box unfortunately). I am however still puzzled
about why no locks are being waited on.

I have canceled the vacuum, but any suggestions for getting more diag
info for next time?

regards

Mark

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #5798: Some weird error with pl/pgsql procedure
Next
From: "Nacho Mezzadra"
Date:
Subject: BUG #5896: When server cannot be started, first it says that it couldn't be started and then Server Started