Re: Minimally avoiding Transaction Wraparound in VLDBs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Minimally avoiding Transaction Wraparound in VLDBs
Date
Msg-id 4809.1125597794@sss.pgh.pa.us
Whole thread Raw
In response to Re: Minimally avoiding Transaction Wraparound in VLDBs  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> My first proposal is to add an extra parameter onto the
> index_bulk_delete() call - ntuples. If ntuples == 0 then btbulkdelete()
> will avoid scanning and return immediately. If a scan occurs, then we
> keep track of how many tuples have been marked deleted and stop the scan
> when we have reached this number.

This seems reasonable.  I think you overstate the value of an early
stop, but it can't hurt given the minimal cost of making the check.

> 1) avoid the scan when there are no dead tuples, and ignore the
> possibility that a VACUUM might be doing a follow-up scan to remove
> previously deleted tuples.

I think this is a pretty bad idea.

Your analysis assumes that the two scans are equally expensive,
which is about as far from true as is possible.  In the first place,
the first scan is in index order while the second is in physical order
(hence can benefit from readahead).  In the second place, the first
scan iterates over each item on each page, where the second does not.
So what you're proposing is to install an unavoidable disk space bloat
mechanism in order to save only a small part of the cost of VACUUM.

(What I'd really like to see is a way to do the first scan in physical
order, but we'd need a different locking approach AFAICS --- see the
notes in nbtree/README.)

Keeping track of whether there's any work pending for the second pass
doesn't seem too unreasonable to me.  It was left undone in the original
coding on the KISS principle, but it could certainly be done.  I'd still
like to see some demonstration first that this part of index vacuuming
is a significant cost compared to the heap and first-pass index costs.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: upgrade path / versioning roles
Next
From: Andrew - Supernews
Date:
Subject: Re: TODO item: set proper permissions on non-system schemas