Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From John Naylor
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAFBsxsEsFeev7Ri3Bo8k=M5sgtf=TLVPJSUaxhYVZL-Epq2rbw@mail.gmail.com
Whole thread Raw
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Nikita Malakhov <hukutoc@gmail.com>)
Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers

On Wed, Dec 14, 2022 at 6:07 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> At the suggestion of Jeff, I wrote a Wiki page that shows motivating
> examples for the patch series:
>
> https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples
>
> These are all cases where VACUUM currently doesn't do the right thing
> around freezing, in a way that is greatly ameliorated by the patch.
> Perhaps this will help other hackers to understand the motivation
> behind some of these mechanisms. There are plenty of details that only
> make sense in the context of a certain kind of table, with certain
> performance characteristics that the design is sensitive to, and seeks
> to take advantage of in one way or another.

Thanks for this. This is the kind of concrete, data-based evidence that I find much more convincing, or at least easy to reason about. I'd actually recommend in the future to open discussion with this kind of analysis -- even before coding, it's possible to indicate what a design is *intended* to achieve. And reviewers can likewise bring up cases of their own in a concrete fashion.

On Wed, Dec 14, 2022 at 12:16 AM Peter Geoghegan <pg@bowt.ie> wrote:

> At the very least, a given VACUUM operation has to choose its freezing
> strategy based on how it expects the table will look when it's done
> vacuuming the table, and how that will impact the next VACUUM against
> the same table. Without that, then vacuuming an append-only table will
> fall into a pattern of setting pages all-visible in one vacuum, and
> then freezing those same pages all-frozen in the very next vacuum
> because there are too many. Which makes little sense; we're far better
> off freezing the pages at the earliest opportunity instead.

That makes sense, but I wonder if we can actually be more specific: One motivating example mentioned is the append-only table. If we detected that case, which I assume we can because autovacuum_vacuum_insert_* GUCs exist, we could use that information as one way to drive eager freezing independently of size. At least in theory -- it's very possible size will be a necessary part of the decision, but it's less clear that it's as useful as a user-tunable knob.

If we then ignored the append-only case when evaluating a freezing policy, maybe other ideas will fall out. I don't have a well-thought out idea about policy or knobs, but it's worth thinking about.

Aside from that, I've only given the patches a brief reading. Having seen the VM snapshot in practice (under "Scanned pages, visibility map snapshot" in the wiki page), it's neat to see fewer pages being scanned. Prefetching not only seems superior to SKIP_PAGES_THRESHOLD, but anticipates asynchronous IO. Keeping only one VM snapshot page in memory makes perfect sense.

I do have a cosmetic, but broad-reaching, nitpick about terms regarding "skipping strategy". That's phrased as a kind of negative -- what we're *not* doing. Many times I had to pause and compute in my head what we're *doing*, i.e. the "scanning strategy". For example, I wonder if the VM strategies would be easier to read as:

VMSNAP_SKIP_ALL_VISIBLE -> VMSNAP_SCAN_LAZY
VMSNAP_SKIP_ALL_FROZEN -> VMSNAP_SCAN_EAGER
VMSNAP_SKIP_NONE -> VMSNAP_SCAN_ALL

Notice here they're listed in order of increasing eagerness. 

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: plpgsq_plugin's stmt_end() is not called when an error is caught
Next
From: Nikita Malakhov
Date:
Subject: Re: New strategies for freezing, advancing relfrozenxid early