Re: VACUUM (DISABLE_PAGE_SKIPPING on) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: VACUUM (DISABLE_PAGE_SKIPPING on)
Date
Msg-id CAD21AoAgdZ=prY1cBUgSeOLkVrL+hPiPUZZE4Y+CwthD5Gpqvw@mail.gmail.com
In response to Re: VACUUM (DISABLE_PAGE_SKIPPING on)  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: VACUUM (DISABLE_PAGE_SKIPPING on)  (Simon Riggs <simon@2ndquadrant.com>)
Re: VACUUM (DISABLE_PAGE_SKIPPING on)  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Fri, Nov 20, 2020 at 6:02 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Wed, 18 Nov 2020 at 02:04, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2020-Nov-17, Simon Riggs wrote:
> >
> > > As an additional optimization, if we do find a row that needs freezing
> > > on a data block, we should simply freeze *all* row versions on the
> > > page, not just the ones below the selected cutoff. This is justified
> > > since writing the block is the biggest cost and it doesn't make much
> > > sense to leave a few rows unfrozen on a block that we are dirtying.
> >
> > Yeah.  We've had earlier proposals to use high and low watermarks: if any
> > tuple is past the high watermark, then freeze all tuples that are past
> > the low watermark.  However this is ancient thinking (prior to
> > HEAP_XMIN_FROZEN) and we don't need the low watermark to be different
> > from zero, since the original xid is retained anyway.
> >
> > So +1 for this idea.
>
> Patch to do this attached, for discussion.

Thank you for the patch!

+                *
+                * Once we decide to dirty the data block we may as well freeze
+                * any tuples that are visible to all, since the additional
+                * cost of freezing multiple tuples is low.

I'm concerned that always freezing all tuples whenever we're going to
dirty the page would noticeably affect existing vacuum workloads. The
additional cost of freezing multiple tuples would be low, but freezing
tuples also requires writing WAL, which I guess is not negligible
overhead. In the worst case, a table has dead tuples on every page. We
already process all of those pages, but with this patch we would, in
addition, freeze all live tuples and write an XLOG_HEAP2_FREEZE_PAGE
WAL record for every page. So I think it would be better either to
freeze all tuples only if we find a tuple that needs to be frozen, or
to enable this behavior only if the new VACUUM option is specified.
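
To make the difference between the options concrete, here is a toy
model of the per-page decision. It is only a sketch and not the
vacuumlazy.c code: FreezePolicy, ToyTuple and count_frozen() are names
made up for this illustration, and xids are compared as plain integers
(real code has to handle wraparound). A nonzero return value stands
for one extra freeze WAL record for that page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policies, named for this sketch only. */
typedef enum
{
    FREEZE_CUTOFF_ONLY,         /* freeze only tuples older than the cutoff */
    FREEZE_ALL_IF_PAGE_DIRTY,   /* freeze everything once the page is dirtied */
    FREEZE_ALL_IF_ANY_NEEDED    /* freeze everything only if some tuple is
                                 * older than the cutoff */
} FreezePolicy;

/* Toy tuple: plain integer xid order, no wraparound handling. */
typedef struct
{
    uint32_t    xmin;           /* inserting transaction id */
    bool        all_visible;    /* visible to all transactions */
} ToyTuple;

/*
 * Return how many tuples on the page would be frozen under the given
 * policy. page_already_dirty models a page that vacuum modifies anyway,
 * for example because it removes dead tuples from it.
 */
size_t
count_frozen(const ToyTuple *tuples, size_t ntuples, uint32_t cutoff_xid,
             bool page_already_dirty, FreezePolicy policy)
{
    size_t      needs_freeze = 0;   /* tuples older than the cutoff */
    size_t      freezable = 0;      /* tuples that could be frozen */

    assert(tuples != NULL || ntuples == 0);

    for (size_t i = 0; i < ntuples; i++)
    {
        if (!tuples[i].all_visible)
            continue;
        freezable++;
        if (tuples[i].xmin < cutoff_xid)
            needs_freeze++;
    }

    switch (policy)
    {
        case FREEZE_CUTOFF_ONLY:
            return needs_freeze;
        case FREEZE_ALL_IF_PAGE_DIRTY:
            return (page_already_dirty || needs_freeze > 0) ? freezable : 0;
        case FREEZE_ALL_IF_ANY_NEEDED:
            return (needs_freeze > 0) ? freezable : 0;
    }
    return 0;
}
```

In this model, a page that is dirtied only by dead-tuple removal and
has no tuple older than the cutoff freezes nothing under the first and
third policies, but freezes every all-visible tuple under the second.
That is the worst case described above.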

Regards,

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/


