Re: Eagerly scan all-visible pages to amortize aggressive vacuum - Mailing list pgsql-hackers

From Nazir Bilal Yavuz
Subject Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date
Msg-id CAN55FZ3ZGh=HCF-0JZ8QFhNnU+w0FUA8gJGw3YqbnOGAt9=Pew@mail.gmail.com
Whole thread Raw
In response to Re: Eagerly scan all-visible pages to amortize aggressive vacuum  (Andres Freund <andres@anarazel.de>)
Responses Re: Eagerly scan all-visible pages to amortize aggressive vacuum
List pgsql-hackers
Hi,

Thank you for working on this!

On Sat, 14 Dec 2024 at 01:53, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Nov 7, 2024 at 10:42 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
>
> Thanks for the review!
> Attached v2 should address your feedback and also fixes a few bugs with v1.

Here are couple of code comments:

=== [PATCH v2 07/10] ===

It took me a while to understand that heap_vac_scan_next_block() loops
until rel_pages. What do you think about adding
Assert(vacrel->current_block == rel_pages) before the
pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
rel_pages) and having a comment on main loop should process blocks
until rel_pages?

=== [PATCH v2 09/10] ===

+ * If there are no indexes or index scanning is disabled, phase II may be
+ * skipped. If phase I identified very few dead index entries, vacuum may skip
+ * phases II and III. Index vacuuming deletes the dead index entries from the
+ * TID store.

phase III is not mentioned in the previous comments. It could be
better to first explain what phase III is before mentioning it.

+ * After index vacuuming is complete, vacuum scans the blocks of the relation
+ * indicated by the TIDs in the TID store and reaps the dead tuples, freeing
+ * that space for future tuples. Finally, vacuum may truncate the relation if
+ * it has emptied pages at the end.
+ *
+ * After finishing all phases of work, vacuum updates relation statistics in
+ * pg_class and the cumulative statistics subsystem.

There is no clear definition of phase III here as well. I can not
understand what phase III is and which parts the vacuum may skip.

=== [PATCH v2 10/10] ===

+        /*
+         * The number of eagerly scanned blocks an eager vacuum failed to
+         * freeze (due to age) in the current eager scan region. Eager vacuums
+         * reset it to EAGER_SCAN_MAX_FAILS_PER_REGION each time they enter a
+         * new region of the relation. Aggressive vacuums also decrement this
+         * coutner but it is initialized to # blocks in the relation.
+         */

s/coutner/counter

 /* non-export function prototypes */
+
 static void lazy_scan_heap(LVRelState *vacrel);

Extra blank line.

+    if (TransactionIdIsNormal(vacrel->cutoffs.relfrozenxid) &&
+        TransactionIdPrecedesOrEquals(vacrel->cutoffs.relfrozenxid,
+                                      vacrel->cutoffs.FreezeLimit))
+        oldest_unfrozen_requires_freeze = true;
+
+    if (MultiXactIdIsValid(vacrel->cutoffs.relminmxid) &&
+        MultiXactIdPrecedesOrEquals(vacrel->cutoffs.relminmxid,
+                                    vacrel->cutoffs.MultiXactCutoff))
+        oldest_unfrozen_requires_freeze = true;

You may want to short-circuit the second if condition with
!oldest_unfrozen_requires_freeze but it may increase the complexity,
up to you.

+    vacrel->eager_pages.remaining_fails = EAGER_SCAN_MAX_FAILS_PER_REGION *
+        (1 - vacrel->next_eager_scan_region_start / EAGER_SCAN_REGION_SIZE);

This will always return EAGER_SCAN_MAX_FAILS_PER_REGION or 0 because
of the integer dividing.

+    if (aggressive)
+    {
+        vacrel->eagerness = VAC_AGGRESSIVE;
+
+        /*
+         * An aggressive vacuum must scan every all-visible page to safely
+         * advance the relfrozenxid and/or relminmxid. As such, there is no
+         * cap to the number of allowed successes or failures.
+         */
+        vacrel->eager_pages.remaining_fails = vacrel->rel_pages + 1;
+        vacrel->eager_pages.remaining_successes = vacrel->rel_pages + 1;
+        return;
+    }
...
...
+        if (was_eager_scanned)
+        {
+            if (vm_page_frozen)
+            {
+                Assert(vacrel->eager_pages.remaining_successes > 0);
+                vacrel->eager_pages.remaining_successes--;
+
+                if (vacrel->eager_pages.remaining_successes == 0)
+                {
+                    Assert(vacrel->eagerness == VAC_EAGER);

My initial thought was that since *was_eager_scanned* is true,
Assert(vacrel->eagerness == VAC_EAGER) should be under the top if
condition (I assumed that was_eager_scanned is only true for eager
vacuums, not for aggressive vacuums too) but I see your point here.
Since you set vacrel->eager_pages.remaining_successes to
vacrel->rel_pages + 1, vacrel->eager_pages.remaining_successes can not
reach 0 although all pages are processed as successful. I think
comment is needed in both places to explain why
vacrel->eager_pages.[remaining_fails | remaining_successes] is set to
vacrel->rel_pages + 1 and why vacrel->eagerness should be VAC_EAGER
when was_eager_scanned is true and
vacrel->eager_pages.remaining_successes is 0.

--
Regards,
Nazir Bilal Yavuz
Microsoft



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Crash: invalid DSA memory alloc request
Next
From: Peter Geoghegan
Date:
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?