Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAFiTN-srf+PV0aj5oUhT1Ohxw2WbQuKhcu0d-=rA6odYEN5KmQ@mail.gmail.com
Whole thread Raw
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Wed, Jan 18, 2023 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jan 17, 2023 at 10:05 AM Peter Geoghegan <pg@bowt.ie> wrote:

My final set of comments for 0002

1.
+struct vmsnapshot
+{
+    /* Target heap rel */
+    Relation    rel;
+    /* Scanning strategy used by VACUUM operation */
+    vmstrategy    strat;
+    /* Per-strategy final scanned_pages */
+    BlockNumber rel_pages;
+    BlockNumber scanned_pages_lazy;
+    BlockNumber scanned_pages_eager;

I do not understand much use of maintaining these two
'scanned_pages_lazy' and 'scanned_pages_eager' variables.  I think
just maintaining 'scanned_pages' should be sufficient.  I do not see
in patches also they are really used.  lazy_scan_strategy() is using
these variables but this is getting values of these out parameters
from visibilitymap_snap_acquire().  And visibilitymap_snap_strategy()
is also using this, but it seems there we just need the final result
of 'scanned_pages' instead of these two variables.

2.

+#define MAX_PAGES_YOUNG_TABLEAGE    0.05    /* 5% of rel_pages */
+#define MAX_PAGES_OLD_TABLEAGE        0.70    /* 70% of rel_pages */

Why is the logic behind 5% and 70% are those based on some
experiments?  Should those be tuning parameters so that with real
world use cases if we realise that it would be good if the eager scan
is getting selected more frequently or less frequently then we can
tune those parameters?

3.
+    /*
+     * VACUUM's DISABLE_PAGE_SKIPPING option overrides our decision by forcing
+     * VACUUM to scan every page (VACUUM effectively distrusts rel's VM)
+     */
+    if (force_scan_all)
+        vacrel->vmstrat = VMSNAP_SCAN_ALL;

I think this should be moved as first if case, I mean why to do all
the calculations based on the 'tableagefrac' and
'TABLEAGEFRAC_XXPOINT' if we are forced to scan them all.  I agree the
extra computation we are doing might not really matter compared to the
vacuum work we are going to perform but still seems logical to me to
do the simple check first.

4. Should we move prefetching as a separate patch, instead of merging
with the scanning strategy?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: heapgettup refactoring
Next
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum