Re: tablecmds: reject CLUSTER ON for partitioned tables earlier - Mailing list pgsql-hackers
| From | Chao Li |
|---|---|
| Subject | Re: tablecmds: reject CLUSTER ON for partitioned tables earlier |
| Date | |
| Msg-id | A1847A7B-7CB8-4A03-8CEA-FF7D3991DEDA@gmail.com Whole thread Raw |
| In response to | Re: tablecmds: reject CLUSTER ON for partitioned tables earlier (Michael Paquier <michael@paquier.xyz>) |
| List | pgsql-hackers |
> On Jan 27, 2026, at 15:48, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jan 27, 2026 at 07:13:04AM +0800, Chao Li wrote: >> I added two new test cases in 0002 that trigger the check. >> >> BTW, this is the CF entry: >> https://commitfest.postgresql.org/patch/6415/. You may mark yourself >> as a reviewer, and once you consider the patch is ready to go, would >> you mind change the status to Ready For Committer? > > There is more to this set of changes than it looks at first sight. > > Hence, one question about 0001: can the previous error path in > mark_index_clustered() be reached through a different mean than ALTER > TABLE? If yes, we should have a test for it. If no, it could be > switched to an elog(ERROR) or an assertion. The code paths leading to > the previous error should be analyzed further. Okay, I spent time today investigating mark_index_clustered() today. First, I reset the source tree to 05fb5d6 where the partitioned table check was added to mark_index_clustered(). The commitsubject is "Ignore partitioned indexes where appropriate”. It added the check in 3 functions: * cluster() * mark_index_clustered() * get_relation_info() - not in scope of this discussion At this commit, ALTER TABLE … CLUSTER ON / SET WITHOUT CLUSTER code patch could reach mark_index_clustered(). Other thanthat, mark_index_clustered() was only called by rebuild_relation() when the parameter indexOid is valid; rebuild_relation()was only called by cluster_rel(); and cluster_rel() was called by vacuum_rel() and cluster(). * For the cluster() code patch: because of the check added to cluster() by this commit, partitioned table would return early,thus mark_index_clustered() was actually not called. * For the vacuum_rel() code path: there was already a check for partitioned table to return early, thus cluster_rel() won’tbe called against partitioned tables, so that mark_index_clustered() could not be called either. So, looks like the check added to mark_index_clustered() was only for the ALTER TABLE code path. Then, switching the source tree back to this patch. Now, for the ALTER TABLE code path, 0001 ensures partitioned table won’treach mark_index_clustered(). Other than ALTER TABLE, mark_index_clustered() is only called by rebuild_relation(); rebuild_relation() is only called bycluster_rel(); cluster_rel() is called by vacuum_rel() and cluster(). So, the call paths are the same as commit 05fb5d6. For the cluster() code path, I traced this scenario: ``` evantest=# create table p (id int) partition by range (id); CREATE TABLE evantest=# create table p1 partition of p for values from (0) to (10); CREATE TABLE evantest=# create index p_idx on p (id); CREATE INDEX evantest=# cluster p using p_idx; CLUSTER ``` The code ran into cluster(). Now, cluster() is much complicated than it was in commit 05fb5d6. For a partitioned table, ititerates all leaf partitions to call cluster_rel(): ``` /* * Either we're processing a partitioned table, or we were not given any * table name at all. In either case, obtain a list of relations to * process. * * In the former case, an index name must have been given, so we don't * need to recheck its "indisclustered" bit, but we have to check that it * is an index that we can cluster on. In the latter case, we set the * option bit to have indisclustered verified. * * Rechecking the relation itself is necessary here in all cases. */ params.options |= CLUOPT_RECHECK; if (rel != NULL) { Assert(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE); check_index_is_clusterable(rel, indexOid, AccessShareLock); rtcs = get_tables_to_cluster_partitioned(cluster_context, indexOid); /* close relation, releasing lock on parent table */ table_close(rel, AccessExclusiveLock); } …. cluster_multiple_rels(rtcs, ¶ms); // cluster_rel() is called here ``` So, partitioned table should never reach mark_index_clustered() from the cluster() code patch. For the vacuum_rel() code patch, same as before, partitioned table will return early, cluster_rel() won’t be called at all: ``` /* * Silently ignore partitioned tables as there is no work to be done. The * useful work is on their child partitions, which have been queued up for * us separately. */ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) { relation_close(rel, lmode); PopActiveSnapshot(); CommitTransactionCommand(); /* It's OK to proceed with ANALYZE on this table */ return true; } ``` In summary, with 0001, the partitioned table check in mark_index_clustered() is no longer needed. But as mark_index_clustered()is an extern-ed function, it might have future callers, I think we can change ereport(ERROR) to anAssert(). I will include the change in next revision. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
pgsql-hackers by date: