Re: doc fixes: vacuum_cleanup_index_scale_factor - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: doc fixes: vacuum_cleanup_index_scale_factor
Date
Msg-id 20180502154339.GC9585@telsasoft.com
In response to Re: doc fixes: vacuum_cleanup_index_scale_factor  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: doc fixes: vacuum_cleanup_index_scale_factor  (Robert Haas <robertmhaas@gmail.com>)
Re: doc fixes: vacuum_cleanup_index_scale_factor  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Wed, May 02, 2018 at 10:54:31AM -0400, Robert Haas wrote:
> On Tue, May 1, 2018 at 10:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > -        When no tuples were deleted from the heap, B-tree indexes might still
> > -        be scanned during <command>VACUUM</command> cleanup stage by two
> > -        reasons.  The first reason is that B-tree index contains deleted pages
> > -        which can be recycled during cleanup.  The second reason is that B-tree
> > -        index statistics is stalled.  The criterion of stalled index statistics
> > -        is number of inserted tuples since previous statistics collection
> > -        is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
> > -        fraction of total number of heap tuples.
> > +        When no tuples were deleted from the heap, B-tree indexes are still
> > +        scanned during <command>VACUUM</command> cleanup stage unless
> > +        two conditions are met.  First, if a B-tree index contains no deleted pages
> > +        which can be recycled during cleanup.  Second, if B-tree
> > +        index statistics are not stale.  Index statistics are considered stale unless
> > +        <varname>vacuum_cleanup_index_scale_factor</varname> is non-negative, and the
> > +        number of inserted tuples since the previous statistics collection is
> > +        less than that fraction of the total number of heap tuples.
> > +        The default is -1, meaning index scan during cleanup is not skipped.
> 
> I agree that this documentation needs to be rewritten but your rewrite
> doesn't strike me as very good English either.

2nd attempt


index eabe2a9..e305de9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,14 +1893,16 @@ include_dir 'conf.d'
       </term>
       <listitem>
        <para>
-        When no tuples were deleted from the heap, B-tree indexes might still
-        be scanned during <command>VACUUM</command> cleanup stage by two
-        reasons.  The first reason is that B-tree index contains deleted pages
-        which can be recycled during cleanup.  The second reason is that B-tree
-        index statistics is stalled.  The criterion of stalled index statistics
-        is number of inserted tuples since previous statistics collection
-        is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
-        fraction of total number of heap tuples.
+        When no tuples were deleted from the heap, B-tree indexes are still
+        scanned during the <command>VACUUM</command> cleanup stage unless two
+        conditions are met: the index contains no deleted pages that can be
+        recycled during cleanup, and the index statistics are not stale.
+        Index statistics are considered stale unless
+        <varname>vacuum_cleanup_index_scale_factor</varname>
+        is set to a non-negative value, and the number of inserted tuples since
+        the previous statistics collection is less than that fraction of the
+        total number of heap tuples.  The default is -1, which means index
+        scans during <command>VACUUM</command> cleanup are not skipped.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e..22b4a75 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
     if (metad->btm_version < BTREE_VERSION)
         _bt_upgrademetapage(metapg);
 
-    /* update cleanup-related infromation */
+    /* update cleanup-related information */
     metad->btm_oldest_btpo_xact = oldestBtpoXact;
     metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
     MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e5dce00..4e86280 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,10 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
         float8        cleanup_scale_factor;
 
         /*
-         * If table receives large enough amount of insertions and no cleanup
-         * was performed, then index might appear to have stalled statistics.
-         * In order to evade that, we perform cleanup when table receives
-         * vacuum_cleanup_index_scale_factor fractions of insertions.
+         * If table receives enough insertions and no cleanup
+         * was performed, then index would appear to have stale statistics.
+         * If scale factor is set, we avoid that by performing cleanup if the number of added tuples
+         * exceeds vacuum_cleanup_index_scale_factor fraction of original tuple count.
          */
         relopts = (StdRdOptions *) info->index->rd_options;
         cleanup_scale_factor = (relopts &&
@@ -870,8 +870,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
                      &oldestBtpoXact);
 
         /*
-         * Update cleanup-related information in metapage. These information
-         * is used only for cleanup but keeping up them to date can avoid
+         * Update cleanup-related information in metapage. This information
+         * is used only for cleanup but keeping it up to date can avoid
          * unnecessary cleanup even after bulkdelete.
          */
         _bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +899,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
      * If btbulkdelete was called, we need not do anything, just return the
      * stats from the latest btbulkdelete call.  If it wasn't called, we might
      * still need to do a pass over the index, to recycle any newly-recyclable
-     * pages and to obtain index statistics.  _bt_vacuum_needs_cleanup checks
-     * is there are newly-recyclable or stalled index statistics.
+     * pages or to obtain index statistics.  _bt_vacuum_needs_cleanup
+     * determines if either is needed.
      *
      * Since we aren't going to actually delete any leaf items, there's no
      * need to go through all the vacuum-cycle-ID pushups.
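
For reviewers who want to check the doc wording against the intended
behavior, here is a minimal standalone sketch of the decision the new
paragraph describes. It is not the code in nbtree.c: the function name and
parameters are hypothetical, the stored tuple count is modeled as a plain
double, and I am assuming that a negative stored count means "no previous
statistics" and that the fraction is taken of the tuple count at the
previous collection (as the code comment in the patch says).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the cleanup decision; all names are hypothetical.
 *
 * has_recyclable_pages: index contains deleted pages that can be recycled
 * scale_factor:         vacuum_cleanup_index_scale_factor (negative = unset)
 * prev_heap_tuples:     heap tuple count at the previous statistics
 *                       collection (assumption: negative = none recorded)
 * cur_heap_tuples:      heap tuple count now
 *
 * Returns true if the index should be scanned during VACUUM cleanup.
 */
static bool
index_needs_cleanup_scan(bool has_recyclable_pages, double scale_factor,
                         double prev_heap_tuples, double cur_heap_tuples)
{
    double      inserted;

    assert(cur_heap_tuples >= 0.0);

    /* First condition: recyclable deleted pages always force a scan. */
    if (has_recyclable_pages)
        return true;

    /* Statistics count as stale when no usable baseline exists. */
    if (scale_factor < 0.0 || prev_heap_tuples < 0.0)
        return true;

    /* Otherwise stale only if insertions exceed the configured fraction. */
    inserted = cur_heap_tuples - prev_heap_tuples;
    return inserted > scale_factor * prev_heap_tuples;
}
```

With scale_factor = 0.1 and 1000 tuples at the previous collection, this
sketch skips the scan at 1050 tuples and performs it at 1200. With the
default of -1 it never skips, which matches the last sentence of the
proposed paragraph.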

