Re: First steps with 8.3 and autovacuum launcher - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: First steps with 8.3 and autovacuum launcher
Date
Msg-id 470CA6C7.7070504@enterprisedb.com
In response to Re: First steps with 8.3 and autovacuum launcher  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: First steps with 8.3 and autovacuum launcher
List pgsql-hackers
Simon Riggs wrote:
> My thoughts are that it doesn't need to. Typically we create objects and
> then fill them. It isn't that frequent that we would load data, then
> delete or update more than 20% of it, then attempt other DDL.

One scenario that comes to mind is a table that's used in OLTP fashion
during the day, but is taken offline for data loading at night. To
speed up the data loading, indexes are dropped before the load and
recreated afterwards.

Even if there are no dead rows in a table, autovacuum will still kick in
to freeze it at some point.
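
To make that concrete, here is a rough Python sketch of the two conditions
under which autovacuum picks a table: the dead-row threshold and the
anti-wraparound age check. The function name is hypothetical, and the default
values are assumptions modelled on the autovacuum_vacuum_threshold,
autovacuum_vacuum_scale_factor and autovacuum_freeze_max_age settings. It is a
simplified model, not the launcher's actual code.

```python
def needs_autovacuum(reltuples, dead_tuples, xid_age,
                     base_threshold=50, scale_factor=0.2,
                     freeze_max_age=200_000_000):
    """Return (vacuum_needed, reason) for one table (simplified model).

    All defaults are illustrative assumptions, not values read from a server.
    """
    # Anti-wraparound check: fires even when the table has no dead rows.
    if xid_age > freeze_max_age:
        return True, "freeze"
    # Ordinary check: dead rows exceed a base threshold plus a fraction
    # of the table's row count.
    if dead_tuples > base_threshold + scale_factor * reltuples:
        return True, "dead rows"
    return False, "none"
```

A freshly loaded table with zero dead rows still returns "freeze" once its
xid age passes the limit, which is the case where a DDL statement could run
into an autovacuum nobody expected.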

> If a COPY fails it will create dead rows, which should be cleared up by
> an autoVACUUM. If a COPY fails, the user knows to run a VACUUM or a
> re-TRUNCATE before re-attempting a modified COPY. So there is potential
> for more than one VACUUM to be attempted in that case.

I wish the user didn't have to know to do that.

> So there could be an argument for TRUNCATE causing a cancellation of a
> VACUUM, but I don't see the use case for other DDL. Maybe it would be
> easier to make all conflicting lock requestors cancel VACUUM.

Any VACUUM, or just autovacuum?

The only danger I can see is that the autovacuum is always killed and
never gets to finish, leading to degraded performance at first and, in
the extreme, to a shutdown to prevent xid wraparound. Doesn't seem likely
under normal circumstances, though. A scenario that comes to mind is
having very lazy autovacuum settings, so that vacuum of the table takes
longer than 24h, and a daily cron job to run REINDEX.
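
A back-of-the-envelope sketch of that scenario in Python, assuming a canceled
vacuum loses all its progress and restarts right after each REINDEX. The
function name, the xid consumption rate and the limit constant are
illustrative assumptions (the limit is taken as roughly one million xids short
of 2^31), not measurements.

```python
# Assumed xid age at which the server stops accepting new transactions.
WRAPAROUND_LIMIT = 2**31 - 1_000_000

def days_until_shutdown(vacuum_hours, xids_per_day,
                        cancel_interval_hours=24, start_age=0):
    """Days until the table's xid age reaches the limit, or None if the
    vacuum can finish between two cancellations (simplified model)."""
    if xids_per_day <= 0:
        raise ValueError("xids_per_day must be positive")
    if vacuum_hours <= cancel_interval_hours:
        return None  # vacuum completes before the next REINDEX cancels it
    age, days = start_age, 0
    # Vacuum never completes, so the xid age only ever grows.
    while age < WRAPAROUND_LIMIT:
        age += xids_per_day
        days += 1
    return days
```

With a 30-hour vacuum and ten million xids consumed per day, the model gives
a bit over 200 days before the extreme case is reached, so the failure would
be slow to appear but hard to diagnose.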

The "priority inheritance" scheme I proposed earlier would work well
with that: instead of killing the autovacuum, set cost delay to zero to
let it finish and get out of the way ASAP. It has its own set of problems,
though. An innocent-looking DROP INDEX would cause the autovacuum to go
full steam ahead, hurting performance for others.
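
To illustrate the trade-off, here is a toy Python model of cost-based vacuum
delay in which the delay drops to zero once a conflicting lock requester shows
up. Everything in it is an assumption for illustration: the function name, the
boost_at_page parameter, the per-page cost and the timing figures. It is not
how the backend implements throttling.

```python
def vacuum_duration_ms(pages, cost_per_page=10, cost_limit=200,
                       cost_delay_ms=20, work_ms_per_page=1,
                       boost_at_page=None):
    """Simulated wall-clock time for a vacuum over `pages` pages.

    If boost_at_page is set, a waiter is assumed to appear at that page
    and the cost delay is zero from then on ("priority inheritance").
    """
    elapsed = 0
    balance = 0
    delay = cost_delay_ms
    for page in range(pages):
        if boost_at_page is not None and page >= boost_at_page:
            delay = 0  # waiter present: stop napping, run at full speed
        elapsed += work_ms_per_page
        balance += cost_per_page
        if balance >= cost_limit:
            elapsed += delay  # nap once the accumulated cost hits the limit
            balance = 0
    return elapsed

# Throttled throughout, boosted halfway, boosted from the start.
print(vacuum_duration_ms(1000))
print(vacuum_duration_ms(1000, boost_at_page=500))
print(vacuum_duration_ms(1000, boost_at_page=0))
```

The boosted run finishes sooner, but all the I/O that the naps were spreading
out is now issued back to back, which is exactly the cost to other sessions
mentioned above.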

> I think it would be helpful if user-initiated VACUUMs waited behind
> another VACUUM that was already in progress on the table and then
> returned immediately as successful when the first VACUUM finishes. That
> would seem better than queuing up behind the first VACUUM and then
> repeating the process. 

I don't think that's a good idea. The second VACUUM wouldn't be a no-op;
it would clean up any dead rows accumulated during the first VACUUM.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

