Re: Teaching users how they can get the most out of HOT in Postgres 14 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Teaching users how they can get the most out of HOT in Postgres 14
Date
Msg-id CAH2-WzkFLZewAyiUX59NN+azw2Rf17o07t6aut9w08v_NB246g@mail.gmail.com
In response to Re: Teaching users how they can get the most out of HOT in Postgres 14  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On Sun, Jun 20, 2021 at 9:22 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> I'd want to see some evidence that the GUC is necessary.  (For that matter, why is a per relation setting necessary?)
> Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need?  I'm not asking
> whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case
> and publish quantitative results about how bad it is.

One clear argument in favor of the VACUUM option (not so much the
reloption) is that it enables certain testing scenarios.
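For a clean-slate test run, the effect could be had through VACUUM's existing parameter syntax. A sketch, assuming the Postgres 14 behavior where INDEX_CLEANUP ON forces index vacuuming (and so disables the bypass); the table name is just an example:

```sql
-- Force index vacuuming so the bypass optimization never kicks in,
-- leaving no stray LP_DEAD items behind for the test to trip over.
VACUUM (INDEX_CLEANUP ON, VERBOSE) bmsql_order_line;
```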

For example, I was recently using pg_visibility to do a low-level
analysis of how visibility map bits were getting set with a test case
that built on the BenchmarkSQL fair-use TPC-C implementation. The
optimization was something that I noticed in certain scenarios -- I
could have used the option of disabling it at the VACUUM command level,
just to get a perfectly clean slate. It caused a small fraction of the
pages in the table to not be set all-visible, which would be
inconsequential to users but was annoying in the context of this
particular test scenario.
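The kind of low-level check involved might look like the following sketch, using the pg_visibility contrib extension (the table name is illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Which heap pages are still not all-visible after a VACUUM?
SELECT blkno, all_visible, all_frozen
FROM pg_visibility_map('bmsql_order_line')
WHERE NOT all_visible;

-- Summary counts for the whole table
SELECT * FROM pg_visibility_map_summary('bmsql_order_line');
```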

The way the optimization works, it will only ever leave an affected
table in a state where the LP_DEAD items left behind are highly
unlikely to be counted by ANALYZE. They would not be counted
accurately anyway, either because they're extremely few in number or
because relatively many are concentrated in just a few heap blocks --
that's a consequence of how ANALYZE's block-based sampling works.
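A back-of-the-envelope illustration of that sampling effect, with assumed numbers: if the leftover LP_DEAD items are concentrated in 10 of a table's 10,000 heap blocks, a block sample of 100 pages will most likely contain none of them:

```sql
-- Probability that a 100-block sample misses all 10 affected blocks
-- (binomial approximation; numbers are purely illustrative)
SELECT round(pow(1 - 10.0 / 10000, 100)::numeric, 3) AS p_sample_misses_all;
-- ≈ 0.905, so ANALYZE would usually report zero such dead tuples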

In short, even if there really were a performance problem implied by
the bypass-indexes optimization, it seems unlikely that autovacuum
would run in the first place to take care of it, with or without the
optimization -- even if autovacuum_vacuum_scale_factor were set very
aggressively. VACUUM (really autovacuum) just doesn't tend to work at
that level of precision.
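To see why, recall autovacuum's trigger condition: a table is vacuumed once n_dead_tup exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. A sketch with assumed numbers:

```sql
-- Even with an aggressively low scale factor of 0.01 (default 0.2),
-- a 10-million-row table needs over 100,000 dead tuples before
-- autovacuum reacts -- far more than the bypass ever leaves behind.
SELECT 50 + 0.01 * 10000000 AS dead_tuples_needed;  -- 100050
```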

--
Peter Geoghegan


