Hi,
On 2022-06-15 17:21:56 -0700, Mark Dilger wrote:
> While developing various Table Access Methods, I have wanted a callback for
> determining if CLUSTER (and VACUUM FULL) should be run against a table
> backed by a given TAM. The current API contains a callback for doing the
> guts of the cluster, but by that time, it's a bit too late to cleanly back
> out. For single relation cluster commands, raising an error from that
> callback is probably not too bad. For multi-relation cluster commands, that
> aborts the clustering of other yet to be processed relations, which doesn't
> seem acceptable.
Why not? What else do you want to do in that case? Silently ignoring
non-clusterable tables doesn't seem right either. What's the use-case for
swallowing the error?
> I tried fixing this with a ProcessUtility_hook, but that
> fires before the multi-relation cluster command has compiled the list of
> relations to cluster, meaning the ProcessUtility_hook doesn't have access to
> the necessary information. (It can be hacked to compile the list of
> relations itself, but that duplicates both code and effort, with the usual
> risks that the code will get out of sync.)
>
> For my personal development, I have declared a new hook, bool
> (*relation_supports_cluster) (Relation rel). It gets called from
> commands/cluster.c in both the single-relation and multi-relation code
> paths, with warning or debug log messages output for relations that decline
> to be clustered, respectively.
Do you actually need to dynamically decide whether CLUSTER is supported?
Otherwise we could just make the existing cluster callback optional and error
out if a table is clustered that doesn't have the callback.
> Before posting a patch, do people think this sounds useful? Would you like
> the hook function signature to differ in some way? Is a simple "yes this
> relation may be clustered" vs. "no this relation may not be clustered"
> interface overly simplistic?
It seems overly complicated, if anything ;)
Greetings,
Andres Freund