Re: 回复:how to create index concurrently on partitioned table - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: 回复:how to create index concurrently on partitioned table
Date
Msg-id 20200924081103.GS28585@paquier.xyz
Whole thread Raw
In response to Re: 回复:how to create index concurrently on partitioned table  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: 回复:how to create index concurrently on partitioned table  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Mon, Sep 14, 2020 at 09:31:03AM -0500, Justin Pryzby wrote:
> Anyway, for now this is rebased on 83158f74d.

I have not thought yet about all the details of CIC and DIC and what
you said upthread, but I have gone through CLUSTER for now, as a
start.  So, the goal of the patch, as I would define it, is to give a
way to CLUSTER to work on a partitioned table using a given
partitioned index.  Hence, we would perform CLUSTER automatically from
the top-most parent for any partitions that have an index on the same
partition tree as the partitioned index defined in USING, switching
indisclustered automatically depending on the index used.

+CREATE TABLE clstrpart3 PARTITION OF clstrpart DEFAULT PARTITION BY RANGE(a);
+CREATE TABLE clstrpart33 PARTITION OF clstrpart3 DEFAULT;
 CREATE INDEX clstrpart_idx ON clstrpart (a);
-ALTER TABLE clstrpart CLUSTER ON clstrpart_idx
So..  For any testing of partitioned trees, we should be careful to
check if relfilenodes have been changed or not as part of an
operation, to see if the operation has actually done something.

From what I can see, attempting to use a CLUSTER on a top-most
partitioned table fails to work on child tables, but isn't the goal of
the patch to make sure that if we attempt to do the operation on a
partitioned table using a partitioned index, then the operation should
be done as well on the partitions with the partition index part of the
same partition tree as the parent partitioned index?  If using CLUSTER
on a new partitioned index with USING, it seems to me that we may want
to check that indisclustered is correctly set for all the partitions
concerned in the regression tests.  It would be good also to check if
we have a partition index tree that maps partially with a partition
table tree (aka no all table partitions have a partition index), where
these don't get clustered because there is no index to work on.

+   MemoryContext old_context = MemoryContextSwitchTo(cluster_context);
+   inhoids = find_all_inheritors(indexOid, NoLock, NULL);
+   MemoryContextSwitchTo(old_context);
Er, isn't that incorrect?  I would have assumed that what should be
saved in the context of cluster is the list of RelToCluster items.
And we should not do any lookup of the partitions in a different
context, because this may do allocations of things we don't really
need to keep around for the context dedicated to CLUSTER.  Isn't
NoLock unsafe here, even knowing that an exclusive lock is taken on
the parent?  It seems to me that at least schema changes should be
prevented with a ShareLock in the first transaction building the list
of elements to cluster.

+       /*
+        * We have a full list of direct and indirect children, so skip
+        * partitioned tables and just handle their children.
+        */
+       if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE)
+           continue;
It would be better to use RELKIND_HAS_STORAGE here.

All this stuff needs to be documented clearly.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: A little enhancement for hashdisk testset
Next
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: Transactions involving multiple postgres foreign servers, take 2