Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Date
Msg-id 407579.1672258403@sss.pgh.pa.us
Whole thread Raw
Responses Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)  (Nathan Bossart <nathandbossart@gmail.com>)
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)  (Nathan Bossart <nathandbossart@gmail.com>)
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)  (Justin Pryzby <pryzby@telsasoft.com>)
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
[ redirecting to -hackers because patch attached ]

Peter Geoghegan <pg@bowt.ie> writes:
> On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That is a really good point.  How about teaching VACUUM to track
>> the oldest original relfrozenxid and relminmxid among the table(s)
>> it processed, and skip vac_update_datfrozenxid unless at least one
>> of those matches the database's values?  For extra credit, also
>> skip if we didn't successfully advance the source rel's value.

> Hmm. I think that that would probably work.

I poked into that idea some more and concluded that getting VACUUM to
manage it behind the user's back is not going to work very reliably.
The key problem is explained by this existing comment in autovacuum.c:

     * Even if we didn't vacuum anything, it may still be important to do
     * this, because one indirect effect of vac_update_datfrozenxid() is to
     * update ShmemVariableCache->xidVacLimit.  That might need to be done
     * even if we haven't vacuumed anything, because relations with older
     * relfrozenxid values or other databases with older datfrozenxid values
     * might have been dropped, allowing xidVacLimit to advance.

That is, if the table that's holding back datfrozenxid gets dropped
between VACUUM runs, VACUUM would never think that it might have
advanced the global minimum.

I'm forced to the conclusion that we have to expose some VACUUM
options if we want this to work well.  Attached is a draft patch
that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options
(name bikeshedding welcome) and teaches vacuumdb to use them.

Light testing says that this is a win: even on the regression
database, which isn't all that big, I see a drop in vacuumdb's
runtime from ~260 ms to ~175 ms.  Of course this is a case where
VACUUM doesn't really have anything to do, so it's a best-case
scenario ... but still, I was expecting the effect to be barely
above noise with this many tables, yet it's a good bit more.

            regards, tom lane

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index e14ead8826..443928f286 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -36,6 +36,8 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL <replaceable class="parameter">integer</replaceable>
+    SKIP_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
+    ONLY_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]

 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>

@@ -295,6 +297,40 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>

+   <varlistentry>
+    <term><literal>SKIP_DATABASE_STATS</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should skip updating the
+      database-wide statistics about oldest unfrozen XIDs.  Normally
+      <command>VACUUM</command> will update these statistics once at the
+      end of the command.  However, this can take awhile in a database
+      with a very large number of tables, and it will accomplish nothing
+      unless the table that had contained the oldest unfrozen XID was
+      among those vacuumed.  Moreover, if multiple <command>VACUUM</command>
+      commands are issued in parallel, only one of them can update the
+      database-wide statistics at a time.  Therefore, if an application
+      intends to issue a series of many <command>VACUUM</command>
+      commands, it can be helpful to set this option in all but the last
+      such command; or set it in all the commands and separately
+      issue <literal>VACUUM (ONLY_DATABASE_STATS)</literal> afterwards.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>ONLY_DATABASE_STATS</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should do nothing except
+      update the database-wide statistics about oldest unfrozen XIDs.
+      When this option is specified,
+      the <replaceable class="parameter">table_and_columns</replaceable>
+      list must be empty, and no other option may be enabled.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ba965b8c7b..055c1a146b 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -114,6 +114,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
     bool        full = false;
     bool        disable_page_skipping = false;
     bool        process_toast = true;
+    bool        skip_database_stats = false;
+    bool        only_database_stats = false;
     ListCell   *lc;

     /* index_cleanup and truncate values unspecified for now */
@@ -200,6 +202,10 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
                     params.nworkers = nworkers;
             }
         }
+        else if (strcmp(opt->defname, "skip_database_stats") == 0)
+            skip_database_stats = defGetBoolean(opt);
+        else if (strcmp(opt->defname, "only_database_stats") == 0)
+            only_database_stats = defGetBoolean(opt);
         else
             ereport(ERROR,
                     (errcode(ERRCODE_SYNTAX_ERROR),
@@ -216,7 +222,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
         (freeze ? VACOPT_FREEZE : 0) |
         (full ? VACOPT_FULL : 0) |
         (disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
-        (process_toast ? VACOPT_PROCESS_TOAST : 0);
+        (process_toast ? VACOPT_PROCESS_TOAST : 0) |
+        (skip_database_stats ? VACOPT_SKIP_DATABASE_STATS : 0) |
+        (only_database_stats ? VACOPT_ONLY_DATABASE_STATS : 0);

     /* sanity checks on options */
     Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -349,6 +357,23 @@ vacuum(List *relations, VacuumParams *params,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("PROCESS_TOAST required with VACUUM FULL")));

+    /* sanity check for ONLY_DATABASE_STATS */
+    if (params->options & VACOPT_ONLY_DATABASE_STATS)
+    {
+        Assert(params->options & VACOPT_VACUUM);
+        if (relations != NIL)
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("ONLY_DATABASE_STATS cannot be specified with a list of tables")));
+        /* don't require people to turn off PROCESS_TOAST explicitly */
+        if (params->options & ~(VACOPT_VACUUM |
+                                VACOPT_PROCESS_TOAST |
+                                VACOPT_ONLY_DATABASE_STATS))
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("ONLY_DATABASE_STATS cannot be specified with other VACUUM options")));
+    }
+
     /*
      * Create special memory context for cross-transaction storage.
      *
@@ -376,7 +401,12 @@ vacuum(List *relations, VacuumParams *params,
      * Build list of relation(s) to process, putting any new data in
      * vac_context for safekeeping.
      */
-    if (relations != NIL)
+    if (params->options & VACOPT_ONLY_DATABASE_STATS)
+    {
+        /* We don't process any tables in this case */
+        Assert(relations == NIL);
+    }
+    else if (relations != NIL)
     {
         List       *newrels = NIL;
         ListCell   *lc;
@@ -528,11 +558,11 @@ vacuum(List *relations, VacuumParams *params,
         StartTransactionCommand();
     }

-    if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
+    if ((params->options & VACOPT_VACUUM) &&
+        !(params->options & VACOPT_SKIP_DATABASE_STATS))
     {
         /*
          * Update pg_database.datfrozenxid, and truncate pg_xact if possible.
-         * (autovacuum.c does this for itself.)
          */
         vac_update_datfrozenxid();
     }
@@ -560,13 +590,14 @@ vacuum_is_permitted_for_relation(Oid relid, Form_pg_class reltuple,

     Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);

-    /*
+    /*----------
      * A role has privileges to vacuum or analyze the relation if any of the
      * following are true:
      *   - the role is a superuser
      *   - the role owns the relation
      *   - the role owns the current database and the relation is not shared
      *   - the role has been granted the MAINTAIN privilege on the relation
+     *----------
      */
     if (object_ownercheck(RelationRelationId, relid, GetUserId()) ||
         (object_ownercheck(DatabaseRelationId, MyDatabaseId, GetUserId()) && !reltuple->relisshared) ||
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0746d80224..b8472b4c21 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2854,8 +2854,13 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
         tab->at_relid = relid;
         tab->at_sharedrel = classForm->relisshared;

-        /* Note that this skips toast relations */
-        tab->at_params.options = (dovacuum ? VACOPT_VACUUM : 0) |
+        /*
+         * Select VACUUM options.  Note we don't say VACOPT_PROCESS_TOAST, so
+         * that vacuum() skips toast relations.  Also note we tell vacuum() to
+         * skip vac_update_datfrozenxid(); we'll do that separately.
+         */
+        tab->at_params.options =
+            (dovacuum ? (VACOPT_VACUUM | VACOPT_SKIP_DATABASE_STATS) : 0) |
             (doanalyze ? VACOPT_ANALYZE : 0) |
             (!wraparound ? VACOPT_SKIP_LOCKED : 0);

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2a3921937c..84e25e1552 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -4552,8 +4552,9 @@ psql_completion(const char *text, int start, int end)
             COMPLETE_WITH("FULL", "FREEZE", "ANALYZE", "VERBOSE",
                           "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
                           "INDEX_CLEANUP", "PROCESS_TOAST",
-                          "TRUNCATE", "PARALLEL");
-        else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|PROCESS_TOAST|TRUNCATE"))
+                          "TRUNCATE", "PARALLEL", "SKIP_DATABASE_STATS",
+                          "ONLY_DATABASE_STATS");
+        else if
(TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|PROCESS_TOAST|TRUNCATE|SKIP_DATABASE_STATS|ONLY_DATABASE_STATS"))
             COMPLETE_WITH("ON", "OFF");
         else if (TailMatches("INDEX_CLEANUP"))
             COMPLETE_WITH("AUTO", "ON", "OFF");
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index e5343774fe..6154bb46b7 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -22,15 +22,15 @@ $node->issues_sql_like(
     'SQL VACUUM run');
 $node->issues_sql_like(
     [ 'vacuumdb', '-f', 'postgres' ],
-    qr/statement: VACUUM \(FULL\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, FULL\).*;/,
     'vacuumdb -f');
 $node->issues_sql_like(
     [ 'vacuumdb', '-F', 'postgres' ],
-    qr/statement: VACUUM \(FREEZE\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, FREEZE\).*;/,
     'vacuumdb -F');
 $node->issues_sql_like(
     [ 'vacuumdb', '-zj2', 'postgres' ],
-    qr/statement: VACUUM \(ANALYZE\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, ANALYZE\).*;/,
     'vacuumdb -zj2');
 $node->issues_sql_like(
     [ 'vacuumdb', '-Z', 'postgres' ],
@@ -38,11 +38,11 @@ $node->issues_sql_like(
     'vacuumdb -Z');
 $node->issues_sql_like(
     [ 'vacuumdb', '--disable-page-skipping', 'postgres' ],
-    qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING\).*;/,
+    qr/statement: VACUUM \(DISABLE_PAGE_SKIPPING, SKIP_DATABASE_STATS\).*;/,
     'vacuumdb --disable-page-skipping');
 $node->issues_sql_like(
     [ 'vacuumdb', '--skip-locked', 'postgres' ],
-    qr/statement: VACUUM \(SKIP_LOCKED\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, SKIP_LOCKED\).*;/,
     'vacuumdb --skip-locked');
 $node->issues_sql_like(
     [ 'vacuumdb', '--skip-locked', '--analyze-only', 'postgres' ],
@@ -53,32 +53,32 @@ $node->command_fails(
     '--analyze-only and --disable-page-skipping specified together');
 $node->issues_sql_like(
     [ 'vacuumdb', '--no-index-cleanup', 'postgres' ],
-    qr/statement: VACUUM \(INDEX_CLEANUP FALSE\).*;/,
+    qr/statement: VACUUM \(INDEX_CLEANUP FALSE, SKIP_DATABASE_STATS\).*;/,
     'vacuumdb --no-index-cleanup');
 $node->command_fails(
     [ 'vacuumdb', '--analyze-only', '--no-index-cleanup', 'postgres' ],
     '--analyze-only and --no-index-cleanup specified together');
 $node->issues_sql_like(
     [ 'vacuumdb', '--no-truncate', 'postgres' ],
-    qr/statement: VACUUM \(TRUNCATE FALSE\).*;/,
+    qr/statement: VACUUM \(TRUNCATE FALSE, SKIP_DATABASE_STATS\).*;/,
     'vacuumdb --no-truncate');
 $node->command_fails(
     [ 'vacuumdb', '--analyze-only', '--no-truncate', 'postgres' ],
     '--analyze-only and --no-truncate specified together');
 $node->issues_sql_like(
     [ 'vacuumdb', '--no-process-toast', 'postgres' ],
-    qr/statement: VACUUM \(PROCESS_TOAST FALSE\).*;/,
+    qr/statement: VACUUM \(PROCESS_TOAST FALSE, SKIP_DATABASE_STATS\).*;/,
     'vacuumdb --no-process-toast');
 $node->command_fails(
     [ 'vacuumdb', '--analyze-only', '--no-process-toast', 'postgres' ],
     '--analyze-only and --no-process-toast specified together');
 $node->issues_sql_like(
     [ 'vacuumdb', '-P', 2, 'postgres' ],
-    qr/statement: VACUUM \(PARALLEL 2\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, PARALLEL 2\).*;/,
     'vacuumdb -P 2');
 $node->issues_sql_like(
     [ 'vacuumdb', '-P', 0, 'postgres' ],
-    qr/statement: VACUUM \(PARALLEL 0\).*;/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, PARALLEL 0\).*;/,
     'vacuumdb -P 0');
 $node->command_ok([qw(vacuumdb -Z --table=pg_am dbname=template1)],
     'vacuumdb with connection string');
@@ -119,7 +119,7 @@ $node->command_fails([ 'vacuumdb', '-P', -1, 'postgres' ],
     'negative parallel degree');
 $node->issues_sql_like(
     [ 'vacuumdb', '--analyze', '--table', 'vactable(a, b)', 'postgres' ],
-    qr/statement: VACUUM \(ANALYZE\) public.vactable\(a, b\);/,
+    qr/statement: VACUUM \(SKIP_DATABASE_STATS, ANALYZE\) public.vactable\(a, b\);/,
     'vacuumdb --analyze with complete column list');
 $node->issues_sql_like(
     [ 'vacuumdb', '--analyze-only', '--table', 'vactable(b)', 'postgres' ],
@@ -150,7 +150,7 @@ $node->issues_sql_like(
     'vacuumdb --table --min-xid-age');
 $node->issues_sql_like(
     [ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
-    qr/VACUUM "Foo".bar/,
+    qr/VACUUM \(SKIP_DATABASE_STATS\) "Foo".bar/,
     'vacuumdb --schema');
 $node->issues_sql_like(
     [ 'vacuumdb', '--exclude-schema', '"Foo"', 'postgres' ],
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 272e37d290..698ca8791a 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -44,6 +44,7 @@ typedef struct vacuumingOptions
     bool        force_index_cleanup;
     bool        do_truncate;
     bool        process_toast;
+    bool        skip_database_stats;
 } vacuumingOptions;

 /* object filter options */
@@ -533,6 +534,9 @@ vacuum_one_database(ConnParams *cparams,
         pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
                  "--parallel", "13");

+    /* skip_database_stats is used automatically if server supports it */
+    vacopts->skip_database_stats = (PQserverVersion(conn) >= 160000);
+
     if (!quiet)
     {
         if (stage != ANALYZE_NO_STAGE)
@@ -792,6 +796,12 @@ vacuum_one_database(ConnParams *cparams,
     if (!ParallelSlotsWaitCompletion(sa))
         failed = true;

+    /* If we used SKIP_DATABASE_STATS, mop up with ONLY_DATABASE_STATS */
+    if (vacopts->skip_database_stats && stage == ANALYZE_NO_STAGE && !failed)
+    {
+        executeCommand(conn, "VACUUM (ONLY_DATABASE_STATS);", echo);
+    }
+
 finish:
     ParallelSlotsTerminate(sa);
     pg_free(sa);
@@ -957,6 +967,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
                 appendPQExpBuffer(sql, "%sPROCESS_TOAST FALSE", sep);
                 sep = comma;
             }
+            if (vacopts->skip_database_stats)
+            {
+                /* SKIP_DATABASE_STATS is supported since v16 */
+                Assert(serverVersion >= 160000);
+                appendPQExpBuffer(sql, "%sSKIP_DATABASE_STATS", sep);
+                sep = comma;
+            }
             if (vacopts->skip_locked)
             {
                 /* SKIP_LOCKED is supported since v12 */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 2f274f2bec..7c3ca749fd 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,6 +188,8 @@ typedef struct VacAttrStats
 #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
 #define VACOPT_PROCESS_TOAST 0x40    /* process the TOAST table, if any */
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80    /* don't skip any pages */
+#define VACOPT_SKIP_DATABASE_STATS 0x100    /* skip vac_update_datfrozenxid() */
+#define VACOPT_ONLY_DATABASE_STATS 0x200    /* only vac_update_datfrozenxid() */

 /*
  * Values used by index_cleanup and truncate params.
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 0035d158b7..d860be0e20 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -282,6 +282,12 @@ ALTER TABLE vactst ALTER COLUMN t SET STORAGE EXTERNAL;
 VACUUM (PROCESS_TOAST FALSE) vactst;
 VACUUM (PROCESS_TOAST FALSE, FULL) vactst;
 ERROR:  PROCESS_TOAST required with VACUUM FULL
+-- SKIP_DATABASE_STATS option
+VACUUM (SKIP_DATABASE_STATS) vactst;
+-- ONLY_DATABASE_STATS option
+VACUUM (ONLY_DATABASE_STATS);
+VACUUM (ONLY_DATABASE_STATS) vactst;  -- error
+ERROR:  ONLY_DATABASE_STATS cannot be specified with a list of tables
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 9faa8a34a6..9da8f3e830 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -237,6 +237,13 @@ ALTER TABLE vactst ALTER COLUMN t SET STORAGE EXTERNAL;
 VACUUM (PROCESS_TOAST FALSE) vactst;
 VACUUM (PROCESS_TOAST FALSE, FULL) vactst;

+-- SKIP_DATABASE_STATS option
+VACUUM (SKIP_DATABASE_STATS) vactst;
+
+-- ONLY_DATABASE_STATS option
+VACUUM (ONLY_DATABASE_STATS);
+VACUUM (ONLY_DATABASE_STATS) vactst;  -- error
+
 DROP TABLE vaccluster;
 DROP TABLE vactst;
 DROP TABLE vacparted;

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: perl 5.36, C99, -Wdeclaration-after-statement -Wshadow=compatible-local
Next
From: Nathan Bossart
Date:
Subject: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)