Re: VACUUM (DISABLE_PAGE_SKIPPING on) - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: VACUUM (DISABLE_PAGE_SKIPPING on)
Date
Msg-id CANP8+jJmjCxNLv-mNeMUibARx0WGMKN45Sh=7z6rOqx5Ca-GGA@mail.gmail.com
Whole thread Raw
In response to Re: VACUUM (DISABLE_PAGE_SKIPPING on)  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Fri, 20 Nov 2020 at 14:07, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2020-Nov-20, Masahiko Sawada wrote:
>
> > I'm concerned that always freezing all tuples when we're going to make
> > the page dirty would affect the existing vacuum workload much. The
> > additional cost of freezing multiple tuples would be low but if we
> > freeze tuples we would also need to write WAL, which is not negligible
> > overhead I guess. In the worst case, if a table has dead tuples on all
> > pages we process them, but with this patch, in addition to that, we
> > will end up not only freezing all live tuples but also writing
> > XLOG_HEAP2_FREEZE_PAGE WAL for all pages. So I think it would be
> > better either to freeze all tuples if we find a tuple that needs to be
> > frozen or to make this behavior work only if the new VACUUM option is
> > specified.
>
> There are two costs associated with this processing.  One is dirtying
> the page (which means it needs to be written down when evicted), and the
> other is to write WAL records for each change.  The cost for the latter
> is going to be the same in both cases (with this change and without)
> because the same WAL will have to be written -- the only difference is
> *when* do you pay it.  The cost of the former is quite different; with
> Simon's patch we dirty the page once, and without the patch we may dirty
> it several times before it becomes "stable" and no more writes are done
> to it.
>
> (If you have tables whose pages change all the time, there would be no
> difference with or without the patch.)
>
> Dirtying the page less times means less full-page images to WAL, too,
> which can be significant.

Summary of patch effects:

one_freeze_then_max_freeze.v5.patch
doesn't increase/decrease the number of pages that are dirtied by
freezing, but when a page will be dirtied **by any activity** it
maximises the number of rows frozen, so in some cases it may write a
WAL freeze record when it would not have done previously.

one_freeze_then_max_freeze.v6.patch
doesn't increase/decrease the number of pages that are dirtied by
freezing, but when a page will be dirtied **by freezing only** it
maximises the number of rows frozen, so the number of WAL records for
freezing is the same, but they may contain more tuples than before and
will increase the probability that the page is marked all_frozen.

So yes, as you say, the net effect will be to reduce the number of
write I/Os in subsequent vacuums required to move forward
relfrozenxid.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Add session statistics to pg_stat_database
Next
From: David Steele
Date:
Subject: Re: Online verification of checksums