Re: 12 hour table vacuums - Mailing list pgsql-performance

From Gregory Stark
Subject Re: 12 hour table vacuums
Date
Msg-id 87y7dtokfu.fsf@oxford.xeocode.com
Whole thread Raw
In response to 12 hour table vacuums  (Ron St-Pierre <ron.pgsql@shaw.ca>)
Responses Re: 12 hour table vacuums
List pgsql-performance
"Ron St-Pierre" <ron.pgsql@shaw.ca> writes:

> We vacuum only a few of our tables nightly, this one is the last one because it
> takes longer to run. I'll probably re-index it soon, but I would appreciate any
> advice on how to speed up the vacuum process (and the db in general).
...
> vacuum_cost_delay = 200

Well speeding up vacuum isn't really useful in itself. In fact you have vacuum
configured to run quite slowly by having vacuum_cost_delay set so high. You
have it set to sleep 200ms every few pages. If you lower that it'll run faster
but take more bandwidth away from the foreground tasks.

> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes.

For what it's worth NUMERIC columns take more space than you might expect.
Figure a minimum of 12 bytes your rows are at about 1.5k each even if the
non-numeric columns aren't large themselves. What are the other columns?

> We were running autovacuum but it interfered with the updates to we shut it
> off.

Was it just the I/O bandwidth? I'm surprised as your vacuum_cost_delay is
quite high. Manual vacuum doesn't do anything differently from autovacuum,
neither should interfere directly with updates except by taking away
I/O bandwidth.

> We vacuum this table nightly, and it currently takes about 12 hours to
> vacuum it. Not much else is running during this period, nothing that should
> affect the table.

Is this time increasing over time? If once a day isn't enough then you may be
accumulating more and more dead space over time. In which case you may be
better off running it during prime time with a large vacuum_cost_delay (like
the 200 you have configured) rather than trying to get to run fast enough to
fit in the off-peak period.

> deadlock_timeout = 10000

I would not suggest having this quite this high. Raising it from the default
is fine but having a value larger than your patience is likely to give you the
false impression that something is hung if you should ever get a deadlock.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

pgsql-performance by date:

Previous
From: Ron St-Pierre
Date:
Subject: Re: 12 hour table vacuums
Next
From: Bill Moran
Date:
Subject: Re: 12 hour table vacuums