Re: 回复:回复:how to createindex concurrently on partitioned table - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: 回复:回复:how to createindex concurrently on partitioned table
Date
Msg-id 20200616010221.GA2353@paquier.xyz
Whole thread Raw
In response to 回复:回复:how to create index concurrently on partitioned table  ("李杰(慎追)" <adger.lj@alibaba-inc.com>)
Responses 回复:回复:回复:how to create index concurrently on partitioned table
Re: 回复:how to create index concurrently on partitioned table
List pgsql-hackers
On Mon, Jun 15, 2020 at 09:33:05PM +0800, 李杰(慎追) wrote:
> This is a good idea.
> We should maintain the consistency of the entire partition table.
> However, this is not a small change in the code.
> May be we need to make a new design for DefineIndex function....

Indeed.  I have looked at the patch set a bit and here is the related
bit for CIC in 0001, meaning that you handle the basic index creation
for a partition tree within one transaction for each:
+       /*
+        * CIC needs to mark a partitioned table as VALID, which itself
+        * requires setting READY, which is unset for CIC (even though
+        * it's meaningless for an index without storage).
+        */
+       if (concurrent)
+       {
+           PopActiveSnapshot();
+           CommitTransactionCommand();
+           StartTransactionCommand();
+           index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+
+           CommandCounterIncrement();
+           index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
        }

I am afraid that this makes the error handling more complicated, with
risks of having inconsistent partition trees.  That's the point you
raised.  This one is going to need more thoughts.

> But most importantly, if we use three steps to implement CIC,
> we will spend more time than ordinary tables, especially in high
> concurrency cases.  To wait for one of partitions which the users to
> use  frequently, the creation of other partition indexes is delayed.
> Is it worth doing this?

CIC is an operation that exists while allowing read and writes to
still happen in parallel, so that's fine IMO if it takes time.  Now it
is true that it may not scale well so we should be careful with the
approach taken.  Also, I think that the case of REINDEX would require
less work overall because we already have some code in place to gather
multiple indexes from one or more relations and work on these
consistently, all at once.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [bug?] Is the format of tables in the documentation broken in PG 13?
Next
From: Andres Freund
Date:
Subject: Re: global barrier & atomics in signal handlers (Re: Atomicoperations within spinlocks)