Speeding up an in-progress wraparound-preventing vacuum - Mailing list pgsql-general

From Vincent de Phily
Subject Speeding up an in-progress wraparound-preventing vacuum
Date
Msg-id 16509821.RXMXR21guh@moltowork
Whole thread Raw
Responses Re: Speeding up an in-progress wraparound-preventing vacuum
List pgsql-general
Hi List,

I have a "autovacuum: VACUUM ANALYZE public.some_big_table (to prevent
wraparound)" that has been running for over 13 days. The process is consuming
IO so I'm confident it isn't stuck, but it's still taking surprisingly long.
PG 9.1.13 on Debian.

The actual table is 584G on a SAN, plus 324G of indexes on local disk. The
system was IO-starved until about 5 days ago, after offloading half the work
to a different server and waiting for the purging of old data (went from
keeping 4 weeks to 2) to catch up (so probably 2/3rd of the table is currently
bloat, which I'd like to get back asap). Currently about 80% of the IO is
devoted to the vacuum process (on average throughout the day, as extrapolated
from atop output).

I've tried raising autovacuum_vacuum_cost_limit from 500 to 5000, but
apparently the already-running vacuum process didn't pick up the change (I did
send a sighup, and new vacuum processes do run faster). I tried running a
manual vacuum on that table (to benefit from the new settings and from the
more aggressive behavior of manual vacuums), but it's apparently waiting for
the wraparound vacuum to finish first.

My currrent settings:
 autovacuum                      | on        | default
 autovacuum_analyze_scale_factor | 0.1       | default
 autovacuum_analyze_threshold    | 10000     | configuration file
 autovacuum_freeze_max_age       | 200000000 | default
 autovacuum_max_workers          | 3         | default
 autovacuum_naptime              | 60        | default
 autovacuum_vacuum_cost_delay    | 10        | configuration file
 autovacuum_vacuum_cost_limit    | 5000      | configuration file
 autovacuum_vacuum_scale_factor  | 0.002     | configuration file
 autovacuum_vacuum_threshold     | 10000     | configuration file
 maintenance_work_mem            | 524288    | configuration file
The vacuum_scale_factor is tweaked to trigger once a day for most tables, and
should trigger every 3 days on the problem table.



My first question is: is there a way to speedup the currently-running vacuum
process ? I guess killing it to let it be replaced by a more agressively-
configured one would work, but I'd hate to lose 2 weeks of processing (is
there a way to estimate how much more vacuuming work remains to be done ?),
and I'm being a bit more cautious with the wraparound-preventing kind.

Second question is: how come we reached the wraparound threshold on this table
at all ? We've only been keeping 28 days of data in this table, doing
incremental deletes every day (there are no updates at all). I find it very
unlikely that we'd go through 2M transactions in that timespan (that'd need
890 transactions per second, we're well below 100). The pg_class.relfozenxid
on that table is at 680M, while most other tables are around 860M. Could it be
that routine vacuums haven't been able to update the relfrozenxid in a long
long time, or am I missing something else ?



Thanks.

--
Vincent de Phily



pgsql-general by date:

Previous
From: frank
Date:
Subject: Re: Will modifications to unlogged tables also be flused to disk?
Next
From: Adrian Klaver
Date:
Subject: Re: Fwd: Fwd: Problem with pg_dump and decimal mark