Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands - Mailing list pgsql-hackers

From Bossart, Nathan
Subject Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Date
Msg-id 3805C3A5-9C3A-4581-BCB9-2C7305095E3E@amazon.com
Whole thread Raw
In response to Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
I’ve attached a v3 patch that addresses your feedback:

Changes:
  - removed extra square brackets in documentation changes
  - removed unnecessary documentation changes for parameter list
  - eliminated one of the new node types
  - renamed ‘rel’ argument to ‘relations’ in vacuum(…)
  - moved relations list to vacuum memory context in vacuum(…)
  - minor addition to VACUUM regression test
  - rebased with master

On 5/15/17, 11:00 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Hm. If multiple tables are specified and that some of them take a long
> time, it could be possible that an error still happens if the
> definition of one of those tables changes while VACUUM is in the
> middle of running. And this makes moot the error checks that happened
> at first step. So it seems to me that we definitely should have a
> WARNING if multiple tables are defined anyway, and that to avoid code
> duplication we may want to just do those checks once, before
> processing one of the listed tables. It is true that is would be easy
> to miss a WARNING in the VERBOSE logs, but issuing an ERROR would
> really be frustrating in the middle of a nightly run of VACUUM.

I think this issue already exists, as this comment in get_rel_oids(…) seems to indicate:

    /*
     * Since we don't take a lock here, the relation might be gone, or the
     * RangeVar might no longer refer to the OID we look up here.  In the
     * former case, VACUUM will do nothing; in the latter case, it will
     * process the OID we looked up here, rather than the new one. Neither
     * is ideal, but there's little practical alternative, since we're
     * going to commit this transaction and begin a new one between now
     * and then.
     */
     relid = RangeVarGetRelid(vacrel, NoLock, false);

With the patch applied, I believe this statement still holds true.  So if the relation disappears before we get to
vacuum_rel(…),we will simply skip it and move on to the next one.  The vacuum_rel(…) code provides a WARNING in many
cases(e.g. the user does not have privileges to VACUUM the table), but we seem to silently skip the table when it
disappearsbefore the call to vacuum_rel(…).  If we added a WARNING to the effect of “skipping vacuum of <table_name> —
relationno longer exists” for this case, I think what you are suggesting would be satisfied.
 

However, ANALYZE has a slight caveat.  While analyze_rel(…) silently skips the relation if it no longer exists like
vacuum_rel(…)does, we do not pre-validate the columns list at all.  So, in an ANALYZE statement with multiple tables
andcolumns specified, it’ll only fail once we get to the undefined column.  To fix this, we could add a check for the
columnlists near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip any columns that vanish in the
meantime.

Does this seem like a sane approach?

  1. Emit WARNING when skipping if relation disappears before we get to it.
  2. Early in vacuum(…), check that the specified columns exist.
  3. Emit WARNING and skip any specified columns that vanish before processing.

Nathan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] PG10 pgindent run
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] PG10 pgindent run