Re: Minimally avoiding Transaction Wraparound in VLDBs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Minimally avoiding Transaction Wraparound in VLDBs
Date
Msg-id 5588.1125530675@sss.pgh.pa.us
Whole thread Raw
In response to Minimally avoiding Transaction Wraparound in VLDBs  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Minimally avoiding Transaction Wraparound in VLDBs
Re: Minimally avoiding Transaction Wraparound in VLDBs
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> A new command is proposed - VACUUM MINIMAL. The *sole* purpose of this
> command is to do the absolute minimum required to avoid transaction id
> wraparound. (Better names welcome....)

I do not see the point.  If you only need to run it every billion
transactions, saving a few cycles seems a bit pointless.

> This does the same thing as VACUUM except it:

> 1. does not VACUUM any table younger than 4 billion XIDs old

So?  Your large tables are likely to be long-lived, so this isn't
actually going to save a thing in a DB that's been around long enough
to have an impending wrap problem.

> 2. does not VACUUM indexes
> RATIONALE: Indexes don't contain XIDs so don't *need* to be vacuumed.

Nonstarter.  If you remove any tuples, you *must* vacuum indexes to
remove the corresponding index entries.  If you don't remove any tuples,
you don't scan the indexes anyway IIRC.

> By taking those two steps, VACUUM MINIMAL will execute fairly quickly
> even on large Data Warehouses.

I don't think this'll save a thing.

> This command only makes sense when you *know* that tables don't need
> vacuuming. That is the case when:
> - you're using autovacuum, since all tables have dead-tuples removed
> whenever this is required - and so indexes will have been covered also

If you're using autovacuum then the problem is already taken care of.
It will be taken care of better in 8.2, if we add per-table tracking
of XID wraparound risk, but it's handled now.  The only way that this
proposal makes any sense is if you are trying not to vacuum at all, ever.

> - you are using table partitioning and the data retention period of your
> data is less than 4 billion transactions.

Again, per-table tracking of wraparound horizon would take care of this
case, more elegantly and more safely.

> The limit is set at 4 billion because with this command we are trying to
> avoid doing work as long as possible.

You do realize that 2 billion is already the wrap horizon, and you can't
wait that long if you're doing this on a routine basis rather than
immediately-when-needed?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Indexing dead tuples
Next
From: Simon Riggs
Date:
Subject: Re: Indexing dead tuples