CLUSTER versus broken HOT chains - Mailing list pgsql-hackers

From Tom Lane
Subject CLUSTER versus broken HOT chains
Date
Msg-id 6331.1303235482@sss.pgh.pa.us
Whole thread Raw
Responses Re: CLUSTER versus broken HOT chains
List pgsql-hackers
I believe I've worked out what's going on in bug #5985.  The example
script contains an UPDATE on a table, then a creation of an index,
then a CLUSTER on that index, all within one transaction.  If the
UPDATE does any HOT updates, then the index is going to be marked
with indcheckxmin horizon equal to current transaction, because of
our inadequate detection of whether HOT chains are really broken
with respect to a new index.  Then when CLUSTER tries to invoke the
planner to see whether the index should be preferred over a
seqscan-and-sort, plancat.c decides that the index isn't usable yet,
so it doesn't add the index to the relation's IndexOptInfo list,
causing the reported failure "index nnn does not belong to table mmm"
in plan_cluster_use_sort.  This failure is new in 9.1 because we did
not try to use the planner in this way in previous versions, but just
always did an indexscan.

Now, over in cluster.c we find the following interesting bit of
commentary:
   /*    * Disallow if index is left over from a failed CREATE INDEX CONCURRENTLY;    * it might well not contain
entriesfor every heap row, or might not even    * be internally consistent.  (But note that we don't check
indcheckxmin;   * the worst consequence of following broken HOT chains would be that we    * might put recently-dead
tuplesout-of-order in the new table, and there    * is little harm in that.)    */   if
(!OldIndex->rd_index->indisvalid)      ereport(ERROR,               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
   errmsg("cannot cluster on invalid index \"%s\"",                       RelationGetRelationName(OldIndex))));
 

So this leads me to a few thoughts:

1. Now that we have the seqscan-and-sort code path, it'd be possible to
support CLUSTER on a not-indisvalid index, at least when it's a btree
index.  We just have to force it into the seqscan-and-sort code path.

2. We could deal with a not-usable-because-of-indcheckxmin-horizon
index by forcing an indexscan, which is alleged to be safe by the
above comment, or (if it's btree) by forcing a seqscan-and-sort.
The problem is that we won't know which way is cheaper.  I suspect
however that the seqscan way is usually cheaper and we wouldn't lose
much by forcing that whenever we can.

3. Or we could kluge up the planner so it doesn't ignore "unusable"
indexes when invoked for this purpose.  That seems fairly messy though.

Thoughts?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Aidan Van Dyk
Date:
Subject: Re: pgbench \for or similar loop
Next
From: David Fetter
Date:
Subject: Re: pgbench \for or similar loop