Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands - Mailing list pgsql-hackers

From Bossart, Nathan
Subject Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Date
Msg-id 8CF4C58F-B58B-41EB-814D-08744F95BA6A@amazon.com
Whole thread Raw
In response to Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands  ("Bossart, Nathan" <bossartn@amazon.com>)
Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 5/16/17, 11:21 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
>> I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:
>>
>>     /*
>>      * Since we don't take a lock here, the relation might be gone, or the
>>      * RangeVar might no longer refer to the OID we look up here.  In the
>>      * former case, VACUUM will do nothing; in the latter case, it will
>>      * process the OID we looked up here, rather than the new one. Neither
>>      * is ideal, but there's little practical alternative, since we're
>>      * going to commit this transaction and begin a new one between now
>>      * and then.
>>      */
>>      relid = RangeVarGetRelid(vacrel, NoLock, false);
>>
>> With the patch applied, I believe this statement still holds true.  So if the relation disappears before we get to
vacuum_rel(…),we will simply skip it and move on to the next one.  The vacuum_rel(…) code provides a WARNING in many
cases(e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it
disappearsbefore the call to vacuum_rel(…).
 
>
> Yes, that's the bits using try_relation_open() which returns NULL if
> the relation is gone. I don't think that we want VACUUM to be noisy
> about that when running it on a database.

Agreed.

>> If we added a WARNING to the effect of “skipping vacuum of <table_name> — relation no longer exists” for this case,
Ithink what you are suggesting would be satisfied.
 
>
> We would do no favor by reporting nothing to the user. Without any
> information, the user triggering a manual vacuum may believe that the
> relation has been vacuumed as it was listed but that would not be the
> case.

Agreed.  Unfortunately, I think this is already possible when you specify a table to be vacuumed.

>> However, ANALYZE has a slight caveat.  While analyze_rel(…) silently skips the relation if it no longer exists like
vacuum_rel(…)does, we do not pre-validate the columns list at all.  So, in an ANALYZE statement with multiple tables
andcolumns specified, it’ll only fail once we get to the undefined column.  To fix this, we could add a check for the
columnlists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the
meantime.
>>
>> Does this seem like a sane approach?
>>
>>   1. Emit WARNING when skipping if relation disappears before we get to it.
>>   2. Early in vacuum(…), check that the specified columns exist.
>
> And issue an ERROR, right?

Correct.  This means that both the relations and columns specified would cause an ERROR if they do not exist and a
WARNINGif they disappear before we can actually process them.
 

> +           RelationAndColumns *relinfo = (RelationAndColumns *)
> lfirst(relation);
> +           int per_table_opts = options | relinfo->options;  /*
> VACOPT_ANALYZE may be set per-table */
> +           ListCell *oid;
> I have just noticed this bit in your patch. So you can basically do
> something like that:
> VACUUM (ANALYZE) foo, (FULL) bar;
> Do we really want to go down to this level of control? I would keep
> things a bit more restricted as users may be confused by the different
> lock levels involved by each operation, and make use of the same
> options for all the relations listed. Opinions from others is welcome.

I agree with you here, too.  I stopped short of allowing customers to explicitly provide per-table options, so the
exampleyou provided wouldn’t work here.  This is more applicable for something like the following:
 
       VACUUM (FREEZE, VERBOSE) foo, bar (a);

In this case, the FREEZE and VERBOSE options are used for both tables.  However, we have a column list specified for
‘bar’,and the ANALYZE option is implied when we specify a column list.  So when we process ‘bar’, we need to apply the
ANALYZEoption, but we do not need it for ‘foo’.  For now, that is all that this per-table options variable is used
for.

Nathan


pgsql-hackers by date:

Previous
From: Sokolov Yura
Date:
Subject: Re: [HACKERS] Small improvement to compactify_tuples
Next
From: Heikki Linnakangas
Date:
Subject: [HACKERS] Pulling up more complicated subqueries