On Fri, Dec 29, 2023 at 04:15:35PM +0300, Maxim Orlov wrote:
> Recently, one of our customers came to us with the question: why does the
> reindex utility not support multiple jobs for indices (-i opt)?
> And, of course, it is because we cannot control the concurrent processing
> of multiple indexes on the same relation. This was
> discussed somewhere in [0], I believe. So, the customer has to write a shell
> script to do his business and so on.

Yep, that should be the correct thread. As far as I recall, one major
reason was code simplicity because dealing with parallel jobs at table
level is a no-brainer on the client side (see 0003): we know that
relations with physical storage will never interact with each other.

> But it does not seem that complicated to split indices by parent
> tables and reindex them in multiple jobs. Or am I missing something?
> PFA patch implementing this.

+ appendPQExpBufferStr(&catalog_query,
+ "WITH idx as (\n"
+ " SELECT c.relname, ns.nspname\n"
+ " FROM pg_catalog.pg_class c,\n"
+ " pg_catalog.pg_namespace ns\n"
+ " WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid AND\n"
+ " c.oid OPERATOR(pg_catalog.=) ANY(ARRAY['\n");

The problem may actually be trickier than that, no? Could there be
other factors to take into account when classifying the indexes, like
their sizes (typically, we'd want to process the biggest ones first, I
guess)?
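
To make the ordering question concrete, here is a minimal standalone
sketch, not taken from the patch. It groups indexes by parent table, so
that one job handles all the indexes of a table, and orders the groups
biggest first. IndexItem, TableGroup, build_groups() and the use of a
page count as the size metric (something like pg_class.relpages) are
all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one index, its parent table, and its size. */
typedef struct IndexItem
{
	const char *index_name;
	const char *table_name;
	long		size_pages;		/* assumed metric, e.g. relpages */
} IndexItem;

/* Hypothetical unit of work: all the indexes of one table. */
typedef struct TableGroup
{
	const char *table_name;
	long		total_pages;	/* sum of the sizes of its indexes */
	int			nindexes;
} TableGroup;

/* Sort groups by total size, biggest first; break ties by table name. */
static int
group_cmp_desc(const void *a, const void *b)
{
	const TableGroup *ga = (const TableGroup *) a;
	const TableGroup *gb = (const TableGroup *) b;

	if (ga->total_pages != gb->total_pages)
		return (ga->total_pages < gb->total_pages) ? 1 : -1;
	return strcmp(ga->table_name, gb->table_name);
}

/*
 * Fill "groups" (caller-allocated, room for nitems entries) with one
 * entry per distinct parent table, sorted biggest first.  Returns the
 * number of groups.  The linear lookup is O(n^2), which is good enough
 * for a sketch.
 */
static int
build_groups(const IndexItem *items, int nitems, TableGroup *groups)
{
	int			ngroups = 0;

	for (int i = 0; i < nitems; i++)
	{
		int			g;

		/* Look for an existing group for this parent table. */
		for (g = 0; g < ngroups; g++)
		{
			if (strcmp(groups[g].table_name, items[i].table_name) == 0)
				break;
		}

		/* None found: start a new group at the end of the array. */
		if (g == ngroups)
		{
			groups[g].table_name = items[i].table_name;
			groups[g].total_pages = 0;
			groups[g].nindexes = 0;
			ngroups++;
		}

		groups[g].total_pages += items[i].size_pages;
		groups[g].nindexes++;
	}

	qsort(groups, ngroups, sizeof(TableGroup), group_cmp_desc);
	return ngroups;
}
```

With jobs picking groups from the front of that array, the largest
tables would start first while all the indexes of a given table stay
within a single job. Whether the summed index size is the right sort
key is exactly the open question here.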
--
Michael