Re: New vacuum option to do only freezing - Mailing list pgsql-hackers

From Bossart, Nathan
Subject Re: New vacuum option to do only freezing
Date
Msg-id B3FDA4F0-385B-4764-88D2-C07C010328CD@amazon.com
Whole thread Raw
In response to Re: New vacuum option to do only freezing  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: New vacuum option to do only freezing  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

On 1/8/19, 7:03 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> Attached the updated version patch incorporated all comments I got.

Thanks for the new patch!

> * Instead of freezing xmax I've changed the behaviour of the new
> option (DISABLE_INDEX_CLEANUP) so that it sets dead tuples as dead
> instead of as unused and skips both index vacuum and index cleanup.
> That is, we remove the storage of dead tuple but leave dead itemids
> for the index lookups. These are removed by the next vacuum execution
> enabling index cleanup. Currently HOT-pruning doesn't set the root of
> the chain as unused even if the whole chain is dead. Since setting
> tuples as dead removes storage the freezing xmax is no longer
> required.

I tested this option with a variety of cases (HOT, indexes, etc.), and
it seems to work well.  I haven't looked too deeply into the
implications of using LP_DEAD this way, but it seems like a reasonable
approach at first glance.

+    <varlistentry>
+    <term><literal>DISABLE_INDEX_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      <command>VACUUM</command> removes dead tuples and prunes HOT-updated
+      tuples chain for live tuples on table. If the table has any dead tuple
+      it removes them from both table and indexes for re-use. With this
+      option <command>VACUUM</command> marks tuples as dead (i.e., it doesn't
+      remove tuple storage) and disables removing dead tuples from indexes.
+      This is suitable for avoiding transaction ID wraparound but not
+      sufficient for avoiding index bloat. This option cannot be used in
+      conjunction with <literal>FULL</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>

I think we should clarify the expected behavior when this option is
used on a table with no indexes.  We probably do not want to fail, as
this could disrupt VACUUM commands that touch several tables.  Also,
we don't need to set tuples as dead instead of unused, which appears
to be what this patch is actually doing:

+                       if (nindexes > 0 && disable_index_cleanup)
+                               lazy_set_tuples_dead(onerel, blkno, buf, vacrelstats);
+                       else
+                               lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);

In this case, perhaps we should generate a log statement that notes
that DISABLE_INDEX_CLEANUP is being ignored and set
disable_index_cleanup to false during processing.

+               if (disable_index_cleanup)
+                       ereport(elevel,
+                                       (errmsg("\"%s\": marked %.0f row versions in %u pages as dead",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));
+               else
+                       ereport(elevel,
+                                       (errmsg("\"%s\": removed %.0f row versions in %u pages",
+                                                       RelationGetRelationName(onerel),
+                                                       tups_vacuumed, vacuumed_pages)));

Should the first log statement only be generated when there are also
indexes?

+static void
+lazy_set_tuples_dead(Relation onerel, BlockNumber blkno, Buffer buffer,
+                                       LVRelStats *vacrelstats)

This function looks very similar to lazy_vacuum_page().  Perhaps the
two could be combined?

Nathan


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Next
From: Andres Freund
Date:
Subject: Re: reducing the footprint of ScanKeyword (was Re: Large writablevariables)