Home > mailing lists

CLUSTER versus broken HOT chains - Mailing list pgsql-hackers

From	Tom Lane
Subject	CLUSTER versus broken HOT chains
Date	April 19, 2011 14:51:31
Msg-id	6331.1303235482@sss.pgh.pa.us Whole thread Raw
Responses	Re: CLUSTER versus broken HOT chains
List	pgsql-hackers

Tree view

I believe I've worked out what's going on in bug #5985. The example
script contains an UPDATE on a table, then a creation of an index,
then a CLUSTER on that index, all within one transaction. If the
UPDATE does any HOT updates, then the index is going to be marked
with indcheckxmin horizon equal to current transaction, because of
our inadequate detection of whether HOT chains are really broken
with respect to a new index. Then when CLUSTER tries to invoke the
planner to see whether the index should be preferred over a
seqscan-and-sort, plancat.c decides that the index isn't usable yet,
so it doesn't add the index to the relation's IndexOptInfo list,
causing the reported failure "index nnn does not belong to table mmm"
in plan_cluster_use_sort. This failure is new in 9.1 because we did
not try to use the planner in this way in previous versions, but just
always did an indexscan.

Now, over in cluster.c we find the following interesting bit of
commentary:
/* * Disallow if index is left over from a failed CREATE INDEX CONCURRENTLY; * it might well not contain
entriesfor every heap row, or might not even * be internally consistent. (But note that we don't check
indcheckxmin; * the worst consequence of following broken HOT chains would be that we * might put recently-dead
tuplesout-of-order in the new table, and there * is little harm in that.) */ if
(!OldIndex->rd_index->indisvalid) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot cluster on invalid index \"%s\"", RelationGetRelationName(OldIndex))));

So this leads me to a few thoughts:

1. Now that we have the seqscan-and-sort code path, it'd be possible to
support CLUSTER on a not-indisvalid index, at least when it's a btree
index. We just have to force it into the seqscan-and-sort code path.

2. We could deal with a not-usable-because-of-indcheckxmin-horizon
index by forcing an indexscan, which is alleged to be safe by the
above comment, or (if it's btree) by forcing a seqscan-and-sort.
The problem is that we won't know which way is cheaper. I suspect
however that the seqscan way is usually cheaper and we wouldn't lose
much by forcing that whenever we can.

3. Or we could kluge up the planner so it doesn't ignore "unusable"
indexes when invoked for this purpose. That seems fairly messy though.

Thoughts?
regards, tom lane

pgsql-hackers by date:

From: Aidan Van Dyk
Date: 19 April 2011, 14:33:39
Subject: Re: pgbench \for or similar loop

From: David Fetter
Date: 19 April 2011, 14:53:24
Subject: Re: pgbench \for or similar loop

CLUSTER versus broken HOT chains - Mailing list pgsql-hackers

Previous

Next