Re: tablecmds: reject CLUSTER ON for partitioned tables earlier - Mailing list pgsql-hackers

From Chao Li
Subject Re: tablecmds: reject CLUSTER ON for partitioned tables earlier
Date
Msg-id A1847A7B-7CB8-4A03-8CEA-FF7D3991DEDA@gmail.com
Whole thread Raw
In response to Re: tablecmds: reject CLUSTER ON for partitioned tables earlier  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers

> On Jan 27, 2026, at 15:48, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jan 27, 2026 at 07:13:04AM +0800, Chao Li wrote:
>> I added two new test cases in 0002 that trigger the check.
>>
>> BTW, this is the CF entry:
>> https://commitfest.postgresql.org/patch/6415/. You may mark yourself
>> as a reviewer, and once you consider the patch is ready to go, would
>> you mind change the status to Ready For Committer?
>
> There is more to this set of changes than it looks at first sight.
>
> Hence, one question about 0001: can the previous error path in
> mark_index_clustered() be reached through a different mean than ALTER
> TABLE?  If yes, we should have a test for it.  If no, it could be
> switched to an elog(ERROR) or an assertion.  The code paths leading to
> the previous error should be analyzed further.

Okay, I spent time today investigating mark_index_clustered() today.

First, I reset the source tree to 05fb5d6 where the partitioned table check was added to mark_index_clustered(). The
commitsubject is "Ignore partitioned indexes where appropriate”. It added the check in 3 functions: 

 * cluster()
 * mark_index_clustered()
 * get_relation_info() - not in scope of this discussion

At this commit, ALTER TABLE … CLUSTER ON / SET WITHOUT CLUSTER code patch could reach mark_index_clustered(). Other
thanthat, mark_index_clustered() was only called by rebuild_relation() when the parameter indexOid is valid;
rebuild_relation()was only called by cluster_rel(); and cluster_rel() was called by vacuum_rel() and cluster().  

 * For the cluster() code patch: because of the check added to cluster() by this commit, partitioned table would return
early,thus mark_index_clustered() was actually not called. 
 * For the vacuum_rel() code path: there was already a check for partitioned table to return early, thus cluster_rel()
won’tbe called against partitioned tables, so that mark_index_clustered() could not be called either. 

So, looks like the check added to mark_index_clustered() was only for the ALTER TABLE code path.

Then, switching the source tree back to this patch. Now, for the ALTER TABLE code path, 0001 ensures partitioned table
won’treach mark_index_clustered(). 

Other than ALTER TABLE, mark_index_clustered() is only called by rebuild_relation(); rebuild_relation() is only called
bycluster_rel(); cluster_rel() is called by vacuum_rel() and cluster(). So, the call paths are the same as commit
05fb5d6.

For the cluster() code path, I traced this scenario:
```
evantest=# create table p (id int) partition by range (id);
CREATE TABLE
evantest=# create table p1 partition of p for values from (0) to (10);
CREATE TABLE
evantest=# create index p_idx on p (id);
CREATE INDEX
evantest=# cluster p using p_idx;
CLUSTER
```
The code ran into cluster(). Now, cluster() is much complicated than it was in commit 05fb5d6. For a partitioned table,
ititerates all leaf partitions to call cluster_rel(): 
```
    /*
     * Either we're processing a partitioned table, or we were not given any
     * table name at all.  In either case, obtain a list of relations to
     * process.
     *
     * In the former case, an index name must have been given, so we don't
     * need to recheck its "indisclustered" bit, but we have to check that it
     * is an index that we can cluster on.  In the latter case, we set the
     * option bit to have indisclustered verified.
     *
     * Rechecking the relation itself is necessary here in all cases.
     */
    params.options |= CLUOPT_RECHECK;
    if (rel != NULL)
    {
        Assert(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
        check_index_is_clusterable(rel, indexOid, AccessShareLock);
        rtcs = get_tables_to_cluster_partitioned(cluster_context, indexOid);

        /* close relation, releasing lock on parent table */
        table_close(rel, AccessExclusiveLock);
    }
        ….
        cluster_multiple_rels(rtcs, ¶ms); // cluster_rel() is called here
```
So, partitioned table should never reach mark_index_clustered() from the cluster() code patch.

For the vacuum_rel() code patch, same as before, partitioned table will return early, cluster_rel() won’t be called at
all:
```
    /*
     * Silently ignore partitioned tables as there is no work to be done.  The
     * useful work is on their child partitions, which have been queued up for
     * us separately.
     */
    if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
    {
        relation_close(rel, lmode);
        PopActiveSnapshot();
        CommitTransactionCommand();
        /* It's OK to proceed with ANALYZE on this table */
        return true;
    }
```

In summary, with 0001, the partitioned table check in mark_index_clustered() is no longer needed. But as
mark_index_clustered()is an extern-ed function, it might have future callers, I think we can change ereport(ERROR) to
anAssert(). I will include the change in next revision. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Corey Huinker
Date:
Subject: Re: Extended Statistics set/restore/clear functions.
Next
From: Tender Wang
Date:
Subject: Re: Optimize IS DISTINCT FROM with non-nullable inputs