On Mon, Jun 15, 2020 at 09:33:05PM +0800, 李杰(慎追) wrote:
> This is a good idea.
> We should maintain the consistency of the entire partition table.
> However, this is not a small change in the code.
> May be we need to make a new design for DefineIndex function....
Indeed. I have looked at the patch set a bit and here is the related
bit for CIC in 0001, meaning that you handle the basic index creation
for a partition tree within one transaction for each:
+ /*
+ * CIC needs to mark a partitioned table as VALID, which itself
+ * requires setting READY, which is unset for CIC (even though
+ * it's meaningless for an index without storage).
+ */
+ if (concurrent)
+ {
+ PopActiveSnapshot();
+ CommitTransactionCommand();
+ StartTransactionCommand();
+ index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+
+ CommandCounterIncrement();
+ index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
}
I am afraid that this makes the error handling more complicated, with
risks of having inconsistent partition trees. That's the point you
raised. This one is going to need more thoughts.
> But most importantly, if we use three steps to implement CIC,
> we will spend more time than ordinary tables, especially in high
> concurrency cases. To wait for one of partitions which the users to
> use frequently, the creation of other partition indexes is delayed.
> Is it worth doing this?
CIC is an operation that exists while allowing read and writes to
still happen in parallel, so that's fine IMO if it takes time. Now it
is true that it may not scale well so we should be careful with the
approach taken. Also, I think that the case of REINDEX would require
less work overall because we already have some code in place to gather
multiple indexes from one or more relations and work on these
consistently, all at once.
--
Michael