Re: [Proposal] vacuumdb --schema only - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: [Proposal] vacuumdb --schema only
Date
Msg-id 20220418215615.GA2300265@nathanxps13
Whole thread Raw
In response to Re: [Proposal] vacuumdb --schema only  (Gilles Darold <gilles@migops.com>)
List pgsql-hackers
On Thu, Apr 14, 2022 at 10:27:46PM +0200, Gilles Darold wrote:
> Attached v8 of the patch that tries to address the remarks above, fixes
> patch apply failure to master and replace calls to pg_log_error+exit
> with pg_fatal.

Thanks for the new patch.

> +enum trivalue schema_is_exclude = TRI_DEFAULT;
> +
> +/*
> + * The kind of object use in the command line filter.
> + *   OBJFILTER_NONE: no filter used
> + *   OBJFILTER_SCHEMA: -n | --schema or -N | --exclude-schema
> + *   OBJFILTER_TABLE: -t | --table
> + */
> +enum VacObjectFilter
> +{
> +    OBJFILTER_NONE,
> +    OBJFILTER_TABLE,
> +    OBJFILTER_SCHEMA
> +};
> +
> +enum VacObjectFilter objfilter = OBJFILTER_NONE;

I still think we ought to remove schema_is_exclude in favor of adding
OBJFILTER_SCHEMA_EXCLUDE to the enum.  I think that would simplify some of
the error handling and improve readability.  IMO we should add
OBJFILTER_ALL, too.

> -                    simple_string_list_append(&tables, optarg);
> +                    /* When filtering on schema name, filter by table is not allowed. */
> +                    if (schema_is_exclude != TRI_DEFAULT)
> +                        pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
> +                    simple_string_list_append(&objects, optarg);
> +                    objfilter = OBJFILTER_TABLE;
>                      tbl_count++;
>                      break;
>                  }
> @@ -202,6 +224,32 @@ main(int argc, char *argv[])
>                                        &vacopts.parallel_workers))
>                      exit(1);
>                  break;
> +            case 'n':            /* include schema(s) */
> +                {
> +                    /* When filtering on schema name, filter by table is not allowed. */
> +                    if (tbl_count)
> +                        pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
> +
> +                    if (schema_is_exclude == TRI_YES)
> +                        pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same
time");
> +                    simple_string_list_append(&objects, optarg);
> +                    objfilter = OBJFILTER_SCHEMA;
> +                    schema_is_exclude = TRI_NO;
> +                    break;
> +                }
> +            case 'N':            /* exclude schema(s) */
> +                {
> +                    /* When filtering on schema name, filter by table is not allowed. */
> +                    if (tbl_count)
> +                        pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
> +                    if (schema_is_exclude == TRI_NO)
> +                        pg_fatal("cannot vacuum all tables in schema(s) and exclude specific schema(s) at the same
time");
> +
> +                    simple_string_list_append(&objects, optarg);
> +                    objfilter = OBJFILTER_SCHEMA;
> +                    schema_is_exclude = TRI_YES;
> +                    break;

I was expecting these to check objfilter.  For example:

    case 'N':
        {
            if (objfilter == OBJFILTER_TABLE)
                pg_fatal("...");
            else if (objfilter == OBJFILTER_SCHEMA)
                pg_fatal("...");
            else if (objfilter == OBJFILTER_ALL)
                pg_fatal("...");

            simple_string_list_append(&objects, optarg);
            objfilter = OBJFILTER_SCHEMA_EXCLUDE;
            break;
        }

Another possible improvement could be to move the pg_fatal() calls to a
helper function that generates the message based on the current objfilter
setting and the current option.  I'm envisioning something like
check_objfilter(VacObjFilter curr_objfilter, VacObjFilter curr_option).

> +    /*
> +     * When filtering on schema name, filter by table is not allowed.
> +     * The schema name can already be set to a fqdn table name.
> +     */
> +    if (tbl_count && objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
> +        pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");

Isn't this redundant with the error in the option handling?

> -        if (tables.head != NULL)
> +        if (objfilter == OBJFILTER_SCHEMA && objects.head != NULL)
> +        {
> +            if (schema_is_exclude == TRI_YES)
> +                pg_fatal("cannot exclude from vacuum specific schema(s) in all databases");
> +            else if (schema_is_exclude == TRI_NO)
> +                pg_fatal("cannot vacuum specific schema(s) in all databases");
> +        }
> +
> +        if (objfilter == OBJFILTER_TABLE && objects.head != NULL)
>              pg_fatal("cannot vacuum specific table(s) in all databases");

I think we could move all these into check_objfilter(), too.

nitpick: Why do we need to check that objects.head is not NULL?  Isn't the
objfilter check enough?

>      /*
> -     * If no tables were listed, filter for the relevant relation types.  If
> -     * tables were given via --table, don't bother filtering by relation type.
> -     * Instead, let the server decide whether a given relation can be
> -     * processed in which case the user will know about it.
> +     * If no tables were listed or that a filter by schema is used, filter
> +     * for the relevant relation types.  If tables were given via --table,
> +     * don't bother filtering by relation type.  Instead, let the server
> +     * decide whether a given relation can be processed in which case the
> +     * user will know about it.  If there is a filter by schema the use of
> +     * --table is not possible so we have to filter by relation type too.
>       */
> -    if (!tables_listed)
> +    if (!objects_listed || objfilter == OBJFILTER_SCHEMA)

Do we need to check for objects_listed here?  IIUC we can just check for
objfilter != OBJFILTER_TABLE.

Unless I'm missing something, schema_is_exclude appears to only be used for
error checking and doesn't impact the generated catalog query.  It looks
like the relevant logic disappeared after v4 of the patch.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
Next
From: Nathan Bossart
Date:
Subject: Re: make MaxBackends available in _PG_init