> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of
> Steve Crawford
> Sent: Monday, July 26, 2004 1:23 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] vacuumdb hanging database cluster
>
>
> When I run:
> vacuumdb --full --all --analyze --quiet
> on my database cluster it will complete in < 2 minutes (this cluster
> is a few million total rows and ~2GB).
>
> After testing, I set this up as an off-hours cron job and it worked
> fine for several days then hung the whole database. After my pager
> pulled me from bed I found the vacuumdb process still running but the
> vacuum process on the first database (alphabetically) was showing it
> was waiting:
> postgres: vacuumdb --full --all --analyze --quiet
> postgres: postgres firstdb [local] VACUUM waiting
>
> A couple hundred processes were showing as "startup waiting" and one
> was "idle in transaction". The process in the "VACUUM waiting" state
> was the only one connected to that database - all other connections
> were to other databases.
>
> CPU and disk utilization were essentially zero. Suspecting a lock
> problem I attempted to use a pre-existing connection to view pg_locks
> but it would not respond.
>
> I killed the vacuum process and all the processes in the "waiting"
> states cleared within a second or two and system returned to normal.
> The pg_locks query also returned but showed no useful info.
>
> I tracked down the process that was "idle in transaction" and
> it was a
> pg_dump process running on another machine. This process does a
> periodic dump of one very small table and should complete in a
> fraction of a second but was still waiting since the previous day -
> apparently without deleterious effects.
>
> There was no useful info in the log.
>
> I've stopped running the vacuum full job via cron till I can
> trust it.
> Any ideas on how to track/prevent this behavior? Server is version
> 7.4.1 and my web searches have proved futile.
I have seen problematic behavior when one vacuum starts after another is
already running.
It might be a good idea to semaphore vacuum operations.
But my experience is with an older version of PostgreSQL, so the
problems you are seeing might be totally unrelated.