Re: [Proposal] vacuumdb --schema only - Mailing list pgsql-hackers
From | Nathan Bossart |
---|---|
Subject | Re: [Proposal] vacuumdb --schema only |
Date | |
Msg-id | 20220330212258.GA2254@nathanxps13 Whole thread Raw |
In response to | Re: [Proposal] vacuumdb --schema only (Robert Treat <rob@xzilla.net>) |
Responses |
Re: [Proposal] vacuumdb --schema only
Re: [Proposal] vacuumdb --schema only |
List | pgsql-hackers |
I took a look at the v4 patch. 'git-apply' complains about whitespace errors: 0001-vacuumdb-schema-only-v4.patch:17: tab in indent. <arg choice="plain"> 0001-vacuumdb-schema-only-v4.patch:18: tab in indent. <arg choice="opt"> 0001-vacuumdb-schema-only-v4.patch:19: tab in indent. <group choice="plain"> 0001-vacuumdb-schema-only-v4.patch:20: tab in indent. <arg choice="plain"><option>-n</option></arg> 0001-vacuumdb-schema-only-v4.patch:21: tab in indent. <arg choice="plain"><option>--schema</option></arg> warning: squelched 13 whitespace errors warning: 18 lines add whitespace errors. + printf(_(" -N, --exclude-schema=PATTERN do NOT vacuum tables in the specified schema(s)\n")); I'm personally -1 for the --exclude-schema option. I don't see any existing "exclude" options in vacuumdb, and the uses for such an option seem rather limited. If we can point to specific use-cases for this option, I might be willing to change my vote. + <para> + To clean all tables in the <literal>Foo</literal> and <literal>bar</literal> schemas + only in a database named <literal>xyzzy</literal>: +<screen> +<prompt>$ </prompt><userinput>vacuumdb --schema='"Foo"' --schema='bar' xyzzy</userinput> +</screen></para> nitpicks: I think the phrasing should be "To only clean tables in the...". Also, is there any reason to use a schema name with a capital letter as an example? IMO that just adds unnecessary complexity to the example. +$node->issues_sql_like( + [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ], + qr/VACUUM "Foo".*/, + 'vacuumdb --schema schema only'); IIUC there should only be one table in the schema. Can we avoid matching "*" and check for the exact command instead? I think there should be a few more test cases. For example, we should test using -n and -N at the same time, and we should test what happens when those options are used for missing schemas. + /* + * When filtering on schema name, filter by table is not allowed. + * The schema name can already be set to a fqdn table name. + */ + if (tbl_count && (schemas.head != NULL)) + { + pg_log_error("cannot vacuum all tables in schema(s) and specific table(s) at the same time"); + exit(1); + } I think there might be some useful refactoring we can do that would simplify adding similar options in the future. Specifically, can we have a global variable that stores the type of vacuumdb command (e.g., all, tables, or schemas)? If so, perhaps the tables list could be renamed and reused for schemas (and any other objects that need listing in the future). + if (schemas != NULL && schemas->head != NULL) + { + appendPQExpBufferStr(&catalog_query, + " AND c.relnamespace"); + if (schema_is_exclude == TRI_YES) + appendPQExpBufferStr(&catalog_query, + " OPERATOR(pg_catalog.!=) ALL (ARRAY["); + else if (schema_is_exclude == TRI_NO) + appendPQExpBufferStr(&catalog_query, + " OPERATOR(pg_catalog.=) ANY (ARRAY["); + + for (cell = schemas->head; cell != NULL; cell = cell->next) + { + appendStringLiteralConn(&catalog_query, cell->val, conn); + + if (cell->next != NULL) + appendPQExpBufferStr(&catalog_query, ", "); + } + + /* Finish formatting schema filter */ + appendPQExpBufferStr(&catalog_query, "]::pg_catalog.regnamespace[])\n"); + } IMO we should use a CTE for specified schemas like we do for the specified tables. I wonder if we could even have a mostly-shared CTE code path for all vacuumdb commands with a list of names. - /* - * If no tables were listed, filter for the relevant relation types. If - * tables were given via --table, don't bother filtering by relation type. - * Instead, let the server decide whether a given relation can be - * processed in which case the user will know about it. - */ - if (!tables_listed) + else { + /* + * If no tables were listed, filter for the relevant relation types. If + * tables were given via --table, don't bother filtering by relation type. + * Instead, let the server decide whether a given relation can be + * processed in which case the user will know about it. + */ nitpick: This change seems unnecessary. I noticed upthread that there was some discussion around adding a way to specify a schema in VACUUM and ANALYZE commands. I think this patch is useful even if such an option is eventually added, as we'll still want vacuumdb to obtain the full list of tables to process so that it can effectively parallelize. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: