On Wed, 2005-08-31 at 19:06 -0400, Tom Lane wrote:
> Andrew - Supernews <andrew+nonews@supernews.com> writes:
> > On 2005-08-31, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> During CREATE INDEX we include all tuples, even if they are already dead
> >> when we build an index.
> >>
> >> What purpose does this serve?
> >>
> >> A pre-existing transaction can't see the index,
>
> > Yes, it can; the catalog is read in SnapshotNow rather than in the query
> > snapshot.
Thanks Andrew, didn't see your post to me. I suspected that was the
case, but wasn't sure why... though Tom explains this.
> In fact, it had better be able to, since once the CREATE INDEX commits,
> pre-existing xacts are responsible to insert index entries for anything
> they insert into the table.
So would it be possible to have CREATE INDEX call GetOldestXmin, just as
VACUUM does, so it can work out which rows to ignore? The overhead of
that is fairly low and could actually speed up many index builds by
reducing the number of rows needing to be sorted/manipulated. (The call
to GetOldestXmin would only scan procs for the current databaseid).
Perhaps this could apply only for larger tables, where the sort cost is
likely to be pretty high? That way having the CREATE INDEX ignore dead
tuples would always be cheaper than doing a VACUUM + CREATE INDEX. Why
do two scans when we can do one?
Best Regards, Simon Riggs