Re: [ADMIN] VACUUM ANALYZE Issues - Mailing list pgsql-admin

From Andres Freund
Subject Re: [ADMIN] VACUUM ANALYZE Issues
Date
Msg-id 20170605224725.2jim73rpppgyoooh@alap3.anarazel.de
Whole thread Raw
In response to [ADMIN] VACUUM ANALYZE Issues  (Harold Falkmeyer <hfalkmeyer@gmail.com>)
Responses Re: [ADMIN] VACUUM ANALYZE Issues  (Harold Falkmeyer <hfalkmeyer@gmail.com>)
List pgsql-admin
Hi,

On 2017-06-05 14:20:12 -0700, Harold Falkmeyer wrote:
> My firm manages a somewhat large PostgreSQL database (400'ish tables, 5
> TiB'ish on disk, 200'ish GiB pg_dumps, etc.).  We're presently running
> PostgreSQL 8.4.22 on Linux 2.6.32 with Slony-I 2.0.8 for replication.

First off, you should seriously consider upgrading. The situation has
gotten considerably better since 8.4, particularly in 9.6.  Also the 8.4
branch hasn't received bugfixed (including security fixes) for nearly
three years.

> Over the past several weeks, our weekly VACUUM ANALYZE job has become
> disruptive to other queries, seemingly spinning on one particular table
> (one of our largest in rows, columns, indexes, foreign keys, and general
> use), ultimately requiring that the job be terminated.  With the number of
> INSERTs, UPDATEs, and DELETEs going through our system, it's imperative
> that VACUUM ANALYZEs be completed periodically.  In the past, and given our
> usage patterns, we've opted for weekly VACUUM ANALYZEs in lieu of
> autovacuum, which is presently disabled.

That sounds like it might seriously exascerbate the problem - weekly is
really infrequent for both vacuum and analyze.


> After about 60-90 minutes working on the aforementioned table, and with the
> VACUUM ANALYZE at the top of our longest running queries list, other
> operations attempting to read from or write to on that table either
> complete extremely slowly or are awaiting locks held by the VACUUM
> ANALYZE.  Sampling during the most recent disruptive period, the VACUUM
> ANALYZE pid held 20 locks: 17 RowExclusiveLocks, one AccessExclusiveLock,
> and one ShareUpdateExclusive, all on the aforementioned table, and one
> ExclusiveLock for the containing transaction.

An ExclusiveLock on the transaction is completely normal, and only used
for other sessions to be able to wait for the completion of the the
holder's transaction.

It does not entirely sound like you're suffering from locking problems.
Are there any locks that are not granted (e.g. SELECT * FROM pg_locks
WHERE NOT granted;) during that period?   If so, what commands are
using these locks, you should be able to do normal table manipulations
while vacuum is in progress.


> After sending SIGTERMs to
> all other pids enumerated in pg_locks as holding or wanting locks on that
> table, the VACUUM ANALYZE continued without apparent progress.  At the
> time, the containing process was consuming a rough average of about 6% of
> one CPU and 4.9% memory (no swapping), was holding 641 descriptors, was in
> a seeming loop of lseek, read, memcpy, and memcmp, with occasional semop
> calls.  During this process, we attempted to SIGTERM all other connections
> to ensure no stale transactions, locks, etc. were preventing the VACUUM
> ANALYZE from finishing.  Overall system load and I/O activity was quite low.

> Ultimately, we've felt forced to terminate the VACUUM ANALYZEs.  After
> sending it a SIGTERM and after about 30 seconds of seeing no change, we
> attempted a service postgresql-8.4 stop.  After 30 additional seconds, we
> sent a SIGQUIT to the VACUUM ANALYZE process (which we always try to
> avoid), which terminated it and, in turn, PostgreSQL stopped.  Thereafter,
> the restarts seemed quick and uneventful, with cleanup happening without
> reported exception.  Only once in the last month or so have we seen a
> VACUUM ANALYZE on this table complete.

Well, that'll make the problem worse and worse.  I suspect your setting
might not be appropriately aggressive to be able to collect the
accumulated cruft.  What's your maintenance_work_mem setting, and what
are your vacuum_cost_delay/vacuum_cost_limit settings?

Greetings,

Andres Freund


pgsql-admin by date:

Previous
From: Harold Falkmeyer
Date:
Subject: [ADMIN] VACUUM ANALYZE Issues
Next
From: Sergey Burladyan
Date:
Subject: [ADMIN] Questions about upgrade standby with rsync