Thread: Teaching users how they can get the most out of HOT in Postgres 14
Recent work from commit 5100010e taught VACUUM that it doesn't have to do index vacuuming in cases where there are practically zero (not necessarily exactly zero) tuples to delete from indexes. It also surfaces the information used to decide whether or not index vacuuming is skipped in the logs, via the log_autovacuum_min_duration mechanism. This log output can be used to get a sense of how effective HOT is over time.

There is one number of particular interest: the proportion of heap pages that have one or more LP_DEAD items across successive VACUUMs (expressed as a percentage of the table). The immediate reason to expose this is that it is crucial to the skipping behavior from commit 5100010e -- the threshold for skipping is 2% of all heap pages. But that's certainly not the only reason to pay attention to the percentage. It can also be used to understand HOT in general. It can be correlated with workload spikes and stressors that tend to make HOT less effective. A number of interesting workload-specific patterns seem to emerge by focusing on how this number changes/grows over time.

I think that this should be pointed out directly in the docs. What's more, it seems like a good vehicle for discussing how HOT works in general. Why did we never really get around to documenting HOT? There should at least be some coverage of how DBAs can get the most out of HOT through monitoring and through tuning -- especially by lowering heap fillfactor. It's very hard to get all UPDATEs to use HOT. It's much easier to get UPDATEs to mostly use HOT most of the time. How things change over time seems crucially important.

I'll show one realistic example, just to give some idea of what it might look like.
This is output for 3 successive autovacuums against the largest TPC-C table:

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 4668405 remain, 0 skipped due to pins, 696997 skipped frozen
tuples: 324571 removed, 186702373 remain, 333888 are dead but not yet removable, oldest xmin: 7624965
buffer usage: 3969937 hits, 3931997 misses, 1883255 dirtied
index scan bypassed: 42634 pages from table (0.91% of total) have 324364 dead item identifiers
avg read rate: 62.469 MB/s, avg write rate: 29.920 MB/s
I/O Timings: read=42359.501 write=11867.903
system usage: CPU: user: 43.62 s, system: 38.17 s, elapsed: 491.74 s
WAL usage: 4586766 records, 1850599 full page images, 8499388881 bytes

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 5976391 remain, 0 skipped due to pins, 2516643 skipped frozen
tuples: 759956 removed, 239787517 remain, 1848431 are dead but not yet removable, oldest xmin: 18489326
buffer usage: 3432019 hits, 3385757 misses, 2426571 dirtied
index scan bypassed: 103941 pages from table (1.74% of total) have 790233 dead item identifiers
avg read rate: 50.338 MB/s, avg write rate: 36.077 MB/s
I/O Timings: read=49252.721 write=17003.987
system usage: CPU: user: 45.86 s, system: 34.47 s, elapsed: 525.47 s
WAL usage: 5598040 records, 2274556 full page images, 10510281959 bytes

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 1
pages: 0 removed, 7483804 remain, 1 skipped due to pins, 4208295 skipped frozen
tuples: 972778 removed, 299295048 remain, 1970910 are dead but not yet removable, oldest xmin: 30987445
buffer usage: 3384994 hits, 4593727 misses, 2891003 dirtied
index scan needed: 174243 pages from table (2.33% of total) had 1325752 dead item identifiers removed
index "bmsql_order_line_pkey": pages: 1250660 in total, 0 newly deleted, 0 currently deleted, 0 reusable
avg read rate: 60.505 MB/s, avg write rate: 38.078 MB/s
I/O Timings: read=72881.986 write=21872.615
system usage: CPU: user: 65.24 s, system: 42.24 s, elapsed: 593.14 s
WAL usage: 6668353 records, 2684040 full page images, 12374536939 bytes

These autovacuums occur every 60-90 minutes with the workload in question (with pretty aggressive autovacuum settings). We see that HOT works rather well here -- but not so well that index vacuuming can be avoided consistently: index vacuuming finally takes place in the third autovacuum (it has "index scans: 1"). There was slow but steady growth in the percentage of LP_DEAD-containing heap pages over time here, which is common enough.

The point of HOT is not to avoid having to do index vacuuming, of course -- that has it backwards. But framing HOT as doing work in backends so autovacuum doesn't have to do similar work later on is a good mental model to encourage users to adopt. There are also significant advantages to reducing the effectiveness of HOT to this one number -- HOT must be working well if it's close to 0%, almost always below 2%, with the occasional aberration that sees it go up to maybe 5%. But if it ever goes too high (in the absence of DELETEs), you might have trouble on your hands. It might not go down again.

There are other interesting patterns from other tables within the same database -- including tables with no UPDATEs, and tables that generally cannot use HOT due to a need to modify indexed columns. The particulars with these other tables hint at problems with heap fragmentation, which users can think of as a degenerative process -- something that gets progressively worse in extreme cases (i.e. cases where it matters).

This new percentage metric isn't about HOT per se. It's actually about the broader question of how effective the system is at keeping the physical location of each logical row stable over time, for a given workload. So maybe that's what any new documentation should address. The documentation would still have plenty to say about HOT, though.
It would also have something to say about bottom-up index deletion, which can be thought of as avoiding problems when HOT doesn't or can't be applied very often. -- Peter Geoghegan
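As a rough sketch of the kind of monitoring such documentation might recommend, the share of UPDATEs that used HOT is already visible in the cumulative statistics system (the columns below are standard pg_stat_user_tables columns; what counts as an "interesting" ratio is workload-dependent, so treat the query as a starting point rather than a recommendation from this thread):

```sql
-- Per-table HOT update ratio. A persistently low hot_update_pct on an
-- UPDATE-heavy table is a hint that lowering heap fillfactor (or avoiding
-- updates of indexed columns) may make HOT more effective.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 2) AS hot_update_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;
```

Combined with the new "dead item identifiers" percentage in the autovacuum log output, this gives a before-and-after view of how well HOT keeps rows in place.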
Hi,

On 2021-04-12 16:11:59 -0700, Peter Geoghegan wrote:
> Recent work from commit 5100010e taught VACUUM that it doesn't have to
> do index vacuuming in cases where there are practically zero (not
> necessarily exactly zero) tuples to delete from indexes.

FWIW, I'd not at all be surprised if this causes some issues. Consider cases where all index lookups are via bitmap scans (which do not support killtuples) -- if value ranges are looked up often, the repeated heap fetches can absolutely kill query performance. I've definitely had to make autovacuum more aggressive for cases like this or schedule manual vacuums, and now that's silently not good enough anymore. Yes, 2% of the table isn't all that much, but often enough all the updates and lookups concentrate in one value range.

As far as I can see there's no reasonable way to disable this "optimization", which scares me.

Greetings,

Andres Freund
On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
> As far as I can see there's no reasonable way to disable this
> "optimization", which scares me.

I'm fine with adding a simple 'off' switch. What I'd like to avoid doing is making the behavior tunable, since it's likely to change in Postgres 15 and Postgres 16 anyway.

--
Peter Geoghegan
On Mon, Apr 12, 2021 at 04:35:13PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
> > As far as I can see there's no reasonable way to disable this
> > "optimization", which scares me.
>
> I'm fine with adding a simple 'off' switch. What I'd like to avoid
> doing is making the behavior tunable, since it's likely to change in
> Postgres 15 and Postgres 16 anyway.

While going through this commit a couple of days ago, I really got to wonder why you are controlling this stuff with a hardcoded value and I found that scary, while what you should be using are two GUCs with the reloptions that come with the feature (?):
- A threshold, as an integer, to define a number of pages.
- A scale factor to define a percentage of pages.

Also, I am a bit confused with the choice of BYPASS_THRESHOLD_PAGES as parameter name. For all the other parameters of autovacuum, we use "threshold" for a fixed number of items, not a percentage of a given item.

--
Michael
On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
> While going through this commit a couple of days ago, I really got to
> wonder why you are controlling this stuff with a hardcoded value and I
> found that scary, while what you should be using are two GUCs with the
> reloptions that come with the feature (?):
> - A threshold, as an integer, to define a number of pages.
> - A scale factor to define a percentage of pages.

Why?

--
Peter Geoghegan
Hi,

On 2021-04-12 16:53:47 -0700, Peter Geoghegan wrote:
> On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
> > While going through this commit a couple of days ago, I really got to
> > wonder why you are controlling this stuff with a hardcoded value and I
> > found that scary, while what you should be using are two GUCs with the
> > reloptions that come with the feature (?):
> > - A threshold, as an integer, to define a number of pages.
> > - A scale factor to define a percentage of pages.
>
> Why?

Well, one argument is that you made a fairly significant behavioural change, with hard-coded logic for when the optimization kicks in. It's not at all clear that your constants are the right ones for every workload. We'll likely only get to know whether they're right in > 1 year -- not having a real out at that point imo is somewhat scary.

That said, adding more and more reloptions has a significant cost, so I don't think it's clear cut that it's the right decision to add one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for BYPASS_THRESHOLD_PAGES?

Greetings,

Andres Freund
On Mon, Apr 12, 2021 at 5:37 PM Andres Freund <andres@anarazel.de> wrote:
> Well, one argument is that you made a fairly significant behavioural
> change, with hard-coded logic for when the optimization kicks in. It's
> not at all clear that your constants are the right ones for every
> workload.

(Apparently nobody wants to talk about HOT and the documentation.)

The BYPASS_THRESHOLD_PAGES cutoff was chosen conservatively, so that it would avoid index vacuuming in truly marginal cases -- and it tends to only delay it there. A table-level threshold is not the best way of constraining the problem. In the future, the table threshold should be treated as only one factor among several. Plus there will be more than a simple yes/no question to consider. We should eventually be able to do index vacuuming for some indexes but not others.

Bottom-up index deletion has totally changed things here, because roughly speaking it makes index bloat proportionate to the number of logical changes to indexed columns -- you could have one super-bloated index on the table, but several others that perfectly retain their original size. You still need to do heap vacuuming eventually, which necessitates vacuuming indexes too, but the right strategy is probably to vacuum much more frequently, vacuuming the bloated index each time. You only do a full round of index vacuuming when the table starts to accumulate way too many LP_DEAD items. You need a much more sophisticated model for this. It might also need to hook into autovacuum scheduling.

One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that it'll work well for some indexes but not others. To a dramatic degree, even. That said, nbtree isn't the only index AM, and it is hard to be completely sure that you've caught everything. So an off switch seems like a good idea now.

> We'll likely only get to know whether they're right in > 1 year
> - not having a real out at that point imo is somewhat scary.
>
> That said, adding more and more reloptions has a significant cost, so I
> don't think it's clear cut that it's the right decision to add
> one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for
> BYPASS_THRESHOLD_PAGES?

I think that the right way to do this is to generalize INDEX_CLEANUP to support a mode of operation that disallows vacuumlazy.c from applying this optimization, as well as any similar optimizations which will be added in the future. Even if you don't buy my argument about directly parameterizing BYPASS_THRESHOLD_PAGES undermining future work, allowing it to be set much higher than 5% - 10% would be a pretty big footgun. It might appear to help at first, but risks destabilizing things much later on.

--
Peter Geoghegan
On Mon, Apr 12, 2021 at 06:12:18PM -0700, Peter Geoghegan wrote:
> One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that
> it'll work well for some indexes but not others. To a dramatic degree,
> even.
>
> That said, nbtree isn't the only index AM, and it is hard to be
> completely sure that you've caught everything. So an off switch seems
> like a good idea now.

Whatever the solution chosen, the thing I can see we agree on here is that we need to do something, at least in the shape of an on/off switch to have an escape path in case of problems. Peter, could we get something by beta1 for that? FWIW, I would use a float GUC to control that, and not a boolean switch, but I am just one voice here, and that's not a feature I worked on.

--
Michael
On Tue, May 11, 2021 at 04:42:27PM +0900, Michael Paquier wrote:
> Whatever the solution chosen, the thing I can see we agree on here is
> that we need to do something, at least in the shape of an on/off
> switch to have an escape path in case of problems. Peter, could we
> get something by beta1 for that? FWIW, I would use a float GUC to
> control that, and not a boolean switch, but I am just one voice here,
> and that's not a feature I worked on.

So, I have been thinking more about this item, and a boolean switch still sounded weird to me, so attached is a patch to add two GUCs -- one for manual VACUUM and one for autovacuum, like any other such parameters -- to control this behavior, with a default set at 2% of the number of relation pages with dead items needed to do the index cleanup work.

Even if we switch the parameter type used here, the easiest and most consistent way to pass down the parameter is just to use VacuumParams set within ExecVacuum() and the autovacuum code path. The docs need more work, I guess.

Thoughts?

--
Michael
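For readers following along, the proposal appears to amount to something like the following postgresql.conf fragment (the GUC names are taken from the review downthread; the -1 fallback semantics are my reading of the quoted code, and this patch was not committed in this form):

```
# Sketch of the proposed settings -- names and defaults are inferred,
# not authoritative:
vacuum_cleanup_index_scale_factor = 0.02   # manual VACUUM
autovacuum_cleanup_index_scale = -1        # -1: fall back to the VACUUM setting
```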
On Thu, May 13, 2021 at 04:27:47PM +0900, Michael Paquier wrote:
> So, I have been thinking more about this item, and a boolean switch
> still sounded weird to me, so attached is a patch to have two GUCs,
> one for manual VACUUM and autovacuum like any other parameters, to
> control this behavior, with a default set at 2% of the number of
> relation pages with dead items needed to do the index cleanup work.
>
> Even if we switch the parameter type used here, the easiest and most
> consistent way to pass down the parameter is just to use VacuumParams
> set within ExecVacuum() and the autovacuum code path. The docs need
> more work, I guess.
>
> Thoughts?

> + cleanup_index_scale_factor = autovacuum_cleanup_index_scale >= 0 ?
> +     autovacuum_cleanup_index_scale : VacuumCostDelay;

CostDelay is surely not what you meant.

> + <title>Vacuum parameters for Indexes</title>
> + <para>
> +  During the execution of <xref linkend="sql-vacuum"/>
> +  and <xref linkend="sql-analyze"/>

"and analyze" is wrong?

> + This parameter can only be set in the <filename>postgresql.conf</filename>
> + file or on the server command line.

It's SIGHUP

> + This parameter can only be set in the <filename>postgresql.conf</filename>
> + file or on the server command line.

Same

+ {
+     {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, VACUUM_INDEX,
+         gettext_noop("Fraction of relation pages, with at least one dead item, required to clean up indexes."),
+         NULL
+     },
+     &VacuumCleanupIndexScale,
+     0.02, 0.0, 0.05,
+     NULL, NULL, NULL
+ },

Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0? The old GUC of the same name had max 1e10. I think a reduced range and a redefinition of the GUC would need to be called out as an incompatibility.

Also, the old GUC (removed at 9f3665fbf) had:

- {"vacuum_cleanup_index_scale_factor", PGC_USERSET, CLIENT_CONN_STATEMENT,

I think USERSET and STATEMENT were right? Alternately, what if this were in the DEVELOPER category, which makes it easier to remove in v15.

--
Justin
On Thu, May 13, 2021 at 5:06 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0?
> The old GUC of the same name had max 1e10.

It also had a completely different purpose and default.

> I think a reduced range and a redefinition of the GUC would need to be called
> out as an incompatibility.

The justification from Michael for this approach is that not having this level of control would be weird, at least to him. But that justification itself seems weird to me; why start from the premise that you need a knob (as opposed to an off switch) at all? Why not start with the way the mechanism works (or is intended to work) in practice?

Most individual tables will *never* have VACUUM apply the optimization with *any* reasonable threshold value, so we only need to consider the subset of tables/workloads where it *might* make sense to skip index vacuuming. This is more qualitative than quantitative. It makes zero sense to treat the threshold as a universal scale -- this is one reason why I don't want to expose a true tunable knob to users. Though the threshold-driven/BYPASS_THRESHOLD_PAGES design is not exactly something with stable behavior for a given table, it almost works like that in practice: tables tend to usually skip index vacuuming, or never skip it even once. There is a clear bifurcation along this line when you view how VACUUM behaves with a variety of different tables using the new autovacuum logging stuff.

Almost all of the benefit of the optimization is available with the current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have LP_DEAD items), which has less risk than a higher threshold. I don't think it matters much if we have the occasional unnecessary round of index vacuuming on account of not applying the optimization. The truly important benefit of the optimization is to not do unnecessary index vacuuming all the time in the extreme cases where it's really hard to justify. There is currently zero evidence that anything higher than 2% will ever help anybody to an appreciable degree. (There is also no evidence that the optimization will ever need to be disabled, but I accept the need to be conservative and offer an off switch -- the precautionary principle applies when talking about new harms.)

Not having to scan every index on every VACUUM, but only every 5th or so VACUUM, is a huge improvement. But going from every 5th VACUUM to every 10th VACUUM? That's at best a tiny additional improvement in exchange for what I'd guess is a roughly linear increase in risk (maybe a greater-than-linear increase, even). That's an awful deal.

--
Peter Geoghegan
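To make the scale of the threshold concrete, here is a small sketch (in Python, not the actual C code in vacuumlazy.c) of the headline 2% test as described in this thread, fed with the page counts from the three autovacuum log excerpts upthread. The real code applies further conditions before bypassing, so treat this purely as illustration:

```python
# Skip index vacuuming when fewer than 2% of heap pages contain one or
# more LP_DEAD items. Constant name mirrors the thread's discussion.
BYPASS_THRESHOLD_PAGES = 0.02  # fraction of rel_pages

def bypass_index_vacuum(rel_pages: int, lpdead_item_pages: int) -> bool:
    """True when VACUUM may skip index vacuuming entirely."""
    return lpdead_item_pages < rel_pages * BYPASS_THRESHOLD_PAGES

# Page counts from the three autovacuum log excerpts upthread:
print(bypass_index_vacuum(4668405, 42634))    # 0.91% of pages -> True
print(bypass_index_vacuum(5976391, 103941))   # 1.74% of pages -> True
print(bypass_index_vacuum(7483804, 174243))   # 2.33% of pages -> False
```

This matches the logs: the first two autovacuums report "index scan bypassed", the third reports "index scan needed".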
On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
> Almost all of the benefit of the optimization is available with the
> current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
> LP_DEAD items), which has less risk than a higher threshold. I don't
> think it matters much if we have the occasional unnecessary round of
> index vacuuming on account of not applying the optimization. The truly
> important benefit of the optimization is to not do unnecessary index
> vacuuming all the time in the extreme cases where it's really hard to
> justify. There is currently zero evidence that anything higher than 2%
> will ever help anybody to an appreciable degree. (There is also no
> evidence that the optimization will ever need to be disabled, but I
> accept the need to be conservative and offer an off switch -- the
> precautionary principle applies when talking about new harms.)
>
> Not having to scan every index on every VACUUM, but only every 5th or
> so VACUUM, is a huge improvement. But going from every 5th VACUUM to
> every 10th VACUUM? That's at best a tiny additional improvement in
> exchange for what I'd guess is a roughly linear increase in risk
> (maybe a greater-than-linear increase, even). That's an awful deal.

Perhaps that's an awful deal, but based on which facts can you really say that this new behavior of needing at least 2% of relation pages with some dead items to clean up indexes is not a worse deal in some cases? This may cause more problems for the in-core index AMs, as much as it could impact any out-of-core index AM, no? What about other values, like 1%, or even 5%? My guess is that there would be an ask to have more control on that, though that stands as my opinion.

Saying that, as long as there is a way to disable that for the users with autovacuum and manual vacuums, I'd be fine. It is worth noting that adding a GUC to control this optimization would make the code more confusing, as there is already do_index_cleanup, a vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to TRUE may cause the index cleanup to not actually kick in if the 2% bar is not reached.

--
Michael
On Thu, May 13, 2021 at 7:14 PM Michael Paquier <michael@paquier.xyz> wrote:
> Perhaps that's an awful deal, but based on which facts can you really
> say that this new behavior of needing at least 2% of relation pages
> with some dead items to clean up indexes is not a worse deal in some
> cases?

If I thought that it simply wasn't possible then I wouldn't have accepted the need to make it possible to disable. This is a cost/benefit decision problem, which must be made based on imperfect information -- there are no absolute certainties. But I'm certain about one thing: there is a large practical difference between the optimization causing terrible performance in certain scenarios and the optimization causing slightly suboptimal performance in certain scenarios. A tiny risk of the former scenario is *much* worse than a relatively large risk of the latter scenario. There needs to be a sense of proportion about risk.

> This may cause more problems for the in-core index AMs, as
> much as it could impact any out-of-core index AM, no?

I don't understand what you mean here.

> What about
> other values like 1%, or even 5%? My guess is that there would be an
> ask to have more control on that, though that stands as my opinion.

How did you arrive at that guess? Why do you believe that? This is the second time I've asked.

> Saying that, as long as there is a way to disable that for the users
> with autovacuum and manual vacuums, I'd be fine. It is worth noting
> that adding a GUC to control this optimization would make the code
> more confusing, as there is already do_index_cleanup, a
> vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to
> TRUE may cause the index cleanup to not actually kick in if the 2% bar
> is not reached.

I don't intend to add a GUC. A reloption should suffice.

Your interpretation of what specifying vacuum_index_cleanup (the VACUUM command option) represents doesn't seem particularly justified to me. To me it just means "index cleanup and vacuuming are not explicitly disabled, the default behavior". It's an option largely intended for emergencies, and largely superseded by the failsafe mechanism. This interpretation is justified by well established precedent: it has long been possible for VACUUM to skip heap page pruning and even heap page vacuuming just because a super-exclusive lock could not be acquired (though the latter case no longer happens due to the same work inside vacuumlazy.c) -- which also implies skipping some index vacuuming, without it ever being apparent to the user.

--
Peter Geoghegan
(I had missed this discussion due to the mismatched thread subject..)

On Fri, May 14, 2021 at 11:14 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
> > Almost all of the benefit of the optimization is available with the
> > current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
> > LP_DEAD items), which has less risk than a higher threshold. I don't
> > think it matters much if we have the occasional unnecessary round of
> > index vacuuming on account of not applying the optimization. The truly
> > important benefit of the optimization is to not do unnecessary index
> > vacuuming all the time in the extreme cases where it's really hard to
> > justify. There is currently zero evidence that anything higher than 2%
> > will ever help anybody to an appreciable degree. (There is also no
> > evidence that the optimization will ever need to be disabled, but I
> > accept the need to be conservative and offer an off switch -- the
> > precautionary principle applies when talking about new harms.)
> >
> > Not having to scan every index on every VACUUM, but only every 5th or
> > so VACUUM, is a huge improvement. But going from every 5th VACUUM to
> > every 10th VACUUM? That's at best a tiny additional improvement in
> > exchange for what I'd guess is a roughly linear increase in risk
> > (maybe a greater-than-linear increase, even). That's an awful deal.
>
> Perhaps that's an awful deal, but based on which facts can you really
> say that this new behavior of needing at least 2% of relation pages
> with some dead items to clean up indexes is not a worse deal in some
> cases? This may cause more problems for the in-core index AMs, as
> much as it could impact any out-of-core index AM, no? What about
> other values like 1%, or even 5%? My guess is that there would be an
> ask to have more control on that, though that stands as my opinion.

I'm concerned about how users can tune a scale-type parameter that is configurable between 0.0 and 0.05. I think that users basically don't pay attention to how many blocks are updated by UPDATE/DELETE. Unlike the old vacuum_cleanup_index_scale_factor, increasing this parameter would directly affect index bloat. If the user can accept more index bloat to speed up (auto)vacuum, they can use vacuum_index_cleanup instead.

I prefer to have an on/off switch just in case. I remember I also commented the same thing before. We’ve discussed a way to control whether or not to enable the skipping optimization by adding a new mode to the INDEX_CLEANUP option, as Peter mentioned. For example, we can make the new “auto” (or “smart”) mode the default, enabling all skipping optimizations, with “on” disabling them. Or we can add a “force” mode to disable all skipping optimizations while leaving the existing modes as they are. Anyway, I think it’s not a good idea to add a new GUC parameter that we’re not sure how to tune.

IIUC skipping index vacuuming when less than 2% of relation pages have at least one LP_DEAD item is a table optimization rather than a btree index optimization. Since we’re not likely to set many pages all-visible or collect many dead tuples in that case, we can skip index vacuuming and table vacuuming. Fortunately, this case goes well together with btree indexes’ bottom-up deletion. If this skipping behavior badly affects other index AMs, this optimization should be considered within btree indexes, although we will need a way for index AMs to weigh in on the vacuum strategy. But I guess this works well in most cases, so having an on/off switch suffices.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I prefer to have an on/off switch just in case. I remember I also
> commented the same thing before. We’ve discussed a way to control
> whether or not to enable the skipping optimization by adding a new
> mode to the INDEX_CLEANUP option, as Peter mentioned. For example, we
> can make the new “auto” (or “smart”) mode the default, enabling all
> skipping optimizations, with “on” disabling them. Or we can add a
> “force” mode to disable all skipping optimizations while leaving
> the existing modes as they are. Anyway, I think it’s not a good idea
> to add a new GUC parameter that we’re not sure how to tune.
>
> IIUC skipping index vacuuming when less than 2% of relation pages have
> at least one LP_DEAD item is a table optimization rather than a btree
> index optimization.

Right. There *is* an excellent way to tune this behavior: by adjusting heap fillfactor to make HOT more effective. That was why I started this thread!

If you leave heap fillfactor at the default of 100, and have lots of updates (that don't modify indexed columns) and no deletes, then you're almost certainly not going to have VACUUM skip indexes anyway -- in practice you're bound to exceed having 2% of pages with an LP_DEAD item before very long. Tuning heap fillfactor is practically essential to see a real benefit, regardless of the exact BYPASS_THRESHOLD_PAGES. (There may be some rare exceptions, but for the most part this mechanism helps with tables that get many updates that are expected to use HOT, and will use HOT barring a tiny number of cases where the new tuple won't quite fit, etc.)

The idea of tuning the behavior directly (e.g. with a reloption that lets the user specify a BYPASS_THRESHOLD_PAGES style threshold) is exactly backwards. The point for the user should not be to skip indexes during VACUUM. The point for the user is to get rid of lots of non-HOT updates, in order to *avoid heap fragmentation*, guided by the new autovacuum instrumentation. That also means that there will be much less index vacuuming. But that's a pretty minor side-benefit. Why should the user *expect* largely unnecessary index vacuuming to take place?

To put it another way, the index bypass mechanism added to vacuumlazy.c was not intended to add a new good behavior. It was actually intended to subtract an old bad behavior. The patch is mostly useful because it allows the user to make VACUUM *more* aggressive with freezing and VM bit setting (not less aggressive with indexes). The BYPASS_THRESHOLD_PAGES threshold of 0.02 is a little arbitrary -- but only a little.

> Since we’re not likely to set many pages
> all-visible or collect many dead tuples in that case, we can skip
> index vacuuming and table vacuuming. Fortunately, this case goes
> well together with btree indexes’ bottom-up deletion.

It's true that bottom-up index deletion provides additional insurance against problems, but I don't think that that insurance is strictly necessary. It's nice to have insurance, though.

> If this skipping
> behavior badly affects other index AMs, this optimization should be
> considered within btree indexes, although we will need a way for index
> AMs to weigh in on the vacuum strategy. But I guess this works
> well in most cases, so having an on/off switch suffices.

Right. I doubt that it will actually turn out to be necessary to have such a switch. But I try to be modest when it comes to predicting what will be important to some user workload -- it's way too complicated to have total confidence about something like that. It is a risk to be managed.

--
Peter Geoghegan
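For completeness, the fillfactor tuning discussed above looks something like this in practice (the table name is reused from the TPC-C example upthread, and 90 is only an illustrative value -- the right setting depends on row width and update rate, and this thread deliberately avoids prescribing one):

```sql
-- Leave free space on each heap page so that updated row versions can
-- stay on the same page, keeping UPDATEs eligible for HOT.
ALTER TABLE bmsql_order_line SET (fillfactor = 90);

-- Note: this only affects newly written pages; a table rewrite (e.g.
-- VACUUM FULL, or an online tool such as pg_repack) is needed for the
-- setting to apply to existing pages.
```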
On Wed, May 19, 2021 at 6:09 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > If this skipping
> > behavior badly affects other index AMs, this optimization should be
> > considered within btree indexes, although we will need a way for index
> > AMs to weigh in on the vacuum strategy. But I guess this works
> > well in most cases, so having an on/off switch suffices.
>
> Right. I doubt that it will actually turn out to be necessary to have
> such a switch. But I try to be modest when it comes to predicting what
> will be important to some user workload -- it's way too complicated to
> have total confidence about something like that. It is a risk to be
> managed.

I think the possible side effect of this hard-coded BYPASS_THRESHOLD_PAGES is that, by default, bulkdelete is not called for a long time and the index becomes bloated. IOW, we will force users to carry index bloat corresponding to 2% of table pages. The bloat could be serious depending on the index tuple size (e.g., an index including many columns). The user may have been running autovacuum aggressively on that table to prevent index bloat, but that is no longer possible and there is no choice. So I think that for those (relatively) rare use cases, it's good to provide a way to somehow control it. Fortunately, an on/off switch is likely to be useful for controlling other optimizations that could be added in the future.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I think the possible side effect of this hard-coded
> BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not
> called for a long term and the index becomes bloat.

What do you think of the approach taken in the attached POC patch?

The patch makes it possible to disable the optimization by generalizing the INDEX_CLEANUP reloption to be an enum that looks like a ternary boolean (not just a plain boolean). INDEX_CLEANUP now accepts the values 'auto', 'on', and 'off' (plus a variety of alternative spellings, the usual ones for booleans in Postgres). Now 'auto' is the default, and 'on' forces the previous behavior inside vacuumlazy.c. It does not disable the failsafe, though -- INDEX_CLEANUP remains a fairly mechanical thing.

This approach seems good to me because it remains consistent with the original purpose and design of INDEX_CLEANUP -- that was always an option that forced VACUUM to do something special with indexes. I don't see much downside to this approach, either. As things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so we don't really need to talk about wraparound emergencies in the docs for INDEX_CLEANUP anymore. This seems much more elegant than either repurposing/reviving cleanup_index_scale_factor (which makes no sense to me at all) or inventing a new reloption (which would itself be in tension with INDEX_CLEANUP).

There are some practical issues that make this patch surprisingly complicated for such a simple problem. For example, I hope that I haven't missed any subtlety in generalizing a boolean reloption like this. We've done similar things with GUCs in the past, but this may be a little different.

Another concern with this approach is what it means for the VACUUM command itself. I haven't added an 'auto' spelling that is accepted by the VACUUM command in this POC version. But do I need to at all?
Can that just be implied by not having any INDEX_CLEANUP option? And does StdRdOptions.vacuum_truncate now need to become a VacOptTernaryValue field too, for consistency with the new definition of StdRdOptions.vacuum_index_cleanup? -- Peter Geoghegan
Attachment
On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I think the possible side effect of this hard-coded > > BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not > > called for a long term and the index becomes bloat. > > What do you think of the approach taken in the attached POC patch? > > The patch makes it possible to disable the optimization by > generalizing the INDEX_CLEANUP reloption to be an enum that looks like > a trinay boolean (not just a plain boolean). INDEX_CLEANUP now accepts > the values 'auto', 'on', and 'off' (plus a variety of alternative > spellings, the usual ones for booleans in Postgres). Now 'auto' is the > default, and 'on' forces the previous behavior inside vacuumlazy.c. It > does not disable the failsafe, though -- INDEX_CLEANUP remains a > fairly mechanical thing. +1 > > This approach seems good to me because INDEX_CLEANUP remains > consistent with the original purpose and design of INDEX_CLEANUP -- > that was always an option that forced VACUUM to do something special > with indexes. I don't see much downside to this approach, either. As > things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so > we don't really need to talk about wraparound emergencies in the docs > for INDEX_CLEANUP anymore. This seems much more elegant than either > repurposing/reviving cleanup_index_scale_factor (which makes no sense > to me at all) or inventing a new reloption (which would itself be in > tension with INDEX_CLEANUP). +1 > > There are some practical issues that make this patch surprisingly > complicated for such a simple problem. For example, I hope that I > haven't missed any subtlety in generalizing a boolean reloption like > this. We've done similar things with GUCs in the past, but this may be > a little different. 
+/* values from HeapOptIndexCleanupMode */
+relopt_enum_elt_def HeapOptIndexCleanupOptValues[] =
+{
+	{"auto", VACOPT_TERNARY_DEFAULT},
+	{"on", VACOPT_TERNARY_ENABLED},
+	{"off", VACOPT_TERNARY_DISABLED},
+	{"true", VACOPT_TERNARY_ENABLED},
+	{"false", VACOPT_TERNARY_DISABLED},
+	{"1", VACOPT_TERNARY_ENABLED},
+	{"0", VACOPT_TERNARY_DISABLED},
+	{(const char *) NULL}		/* list terminator */
+};

We need to accept "yes" and "no" too? Currently, the parsing of a boolean type reloption accepts those words.

> Another concern with this approach is what it
> means for the VACUUM command itself. I haven't added an 'auto'
> spelling that is accepted by the VACUUM command in this POC version.
> But do I need to at all? Can that just be implied by not having any
> INDEX_CLEANUP option?

It seems to me that it's better to have INDEX_CLEANUP option of VACUUM command support AUTO for consistency. Do you have any concerns about supporting it?

> And does StdRdOptions.vacuum_truncate now need
> to become a VacOptTernaryValue field too, for consistency with the new
> definition of StdRdOptions.vacuum_index_cleanup?

We don't have the bypass optimization for heap truncation, unlike index vacuuming. So I think we can leave both vacuum_truncate reloption and TRUNCATE option as boolean parameters.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
> On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
>> Another concern with this approach is what it
>> means for the VACUUM command itself. I haven't added an 'auto'
>> spelling that is accepted by the VACUUM command in this POC version.
>> But do I need to at all? Can that just be implied by not having any
>> INDEX_CLEANUP option?
>
> It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> command support AUTO for consistency. Do you have any concerns about
> supporting it?

I have read through the patch, and I am surprised to see that this only makes it possible to control the optimization at the relation level. The origin of the complaints is that this index cleanup optimization has been introduced as a new rule that gets enforced at the *system* level, so I think that we should have an equivalent GUC to control the behavior for the whole system. With what you are presenting here, one could only disable the optimization for each relation, one-by-one. If this optimization proves to be a problem, it's just going to be harder for users to go through all the relations and re-tune autovacuum. Am I missing something?
--
Michael
Attachment
On Fri, Jun 4, 2021 at 3:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
> > On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
> >> Another concern with this approach is what it
> >> means for the VACUUM command itself. I haven't added an 'auto'
> >> spelling that is accepted by the VACUUM command in this POC version.
> >> But do I need to at all? Can that just be implied by not having any
> >> INDEX_CLEANUP option?
> >
> > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> > command support AUTO for consistency. Do you have any concerns about
> > supporting it?
>
> I have read through the patch, and I am surprised to see that this
> only makes possible to control the optimization at relation level.
> The origin of the complaints is that this index cleanup optimization
> has been introduced as a new rule that gets enforced at *system*
> level, so I think that we should have an equivalent with a GUC to
> control the behavior for the whole system. With what you are
> presenting here, one could only disable the optimization for each
> relation, one-by-one. If this optimization proves to be a problem,
> it's just going to be harder to users to go through all the relations
> and re-tune autovacuum. Am I missing something?

I had not thought about disabling that optimization at the system level. I think we can use this option for particular tables whose indexes have become unexpectedly bloated due to this optimization. Similarly, we have the DISABLE_PAGE_SKIPPING option but don’t have a way to disable lazy vacuum’s page skipping behavior at the system level.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote: > I have read through the patch, and I am surprised to see that this > only makes possible to control the optimization at relation level. > The origin of the complaints is that this index cleanup optimization > has been introduced as a new rule that gets enforced at *system* > level, so I think that we should have an equivalent with a GUC to > control the behavior for the whole system. *Why* does it have to work at the system level? I don't understand what you mean about the system level. As Masahiko pointed out, adding a GUC isn't what we've done in other similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a defensive option that seems similar enough to what we want to add now. To give another example, the TRUNCATE VACUUM option (or the related reloption) can be used to disable relation truncation, a behavior that sometimes causes *big* issues in production. The truncate behavior is determined dynamically in most situations -- which is another similarity to the optimization we've added here. Why is this fundamentally different to those two things? > With what you are > presenting here, one could only disable the optimization for each > relation, one-by-one. If this optimization proves to be a problem, > it's just going to be harder to users to go through all the relations > and re-tune autovacuum. Am I missing something? Why would you expect autovacuum to run even when the optimization is unavailable (e.g. with Postgres 13)? After all, the specifics of when the bypass optimization kicks in make it very unlikely that ANALYZE will ever be able to notice enough dead tuples to trigger an autovacuum (barring antiwraparound and insert-driven autovacuums). There will probably be very few LP_DEAD items remaining. 
Occasionally there will be somewhat more LP_DEAD items that happen to be concentrated in less than 2% of the table's blocks -- but block-based sampling by ANALYZE is likely to miss most of them and underestimate the total number. The sampling approach taken by acquire_sample_rows() ensures this with larger tables. With small tables the chances of the optimization kicking in are very low, unless perhaps fillfactor has been tuned very aggressively.

There has never been a guarantee that autovacuum will be triggered (and do index vacuuming) in cases that have very few LP_DEAD items, no matter how the system has been tuned. The main reason for making the optimization behavior controllable is the VACUUM command -- principally for hackers. I can imagine myself using the VACUUM option to disable the optimization when I'm interested in testing VACUUM or space utilization in some specific, narrow way.

Of course it's true that there is still some uncertainty about the optimization harming production workloads -- that is to be expected with an enhancement like this one. But there is still no actual example or test case that shows the optimization doing the wrong thing, or anything like it. Anything is possible, but I am not expecting there to be even one user complaint about the feature. Naturally I don't want to add something as heavyweight as a GUC, given all that.

--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Another concern with this approach is what it > > means for the VACUUM command itself. I haven't added an 'auto' > > spelling that is accepted by the VACUUM command in this POC version. > > But do I need to at all? Can that just be implied by not having any > > INDEX_CLEANUP option? > > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM > command support AUTO for consistency. Do you have any concerns about > supporting it? I suppose we should have it. But now we have to teach vacuumdb about this new boolean-like enum too. It's a lot more new code than I would have preferred, but I suppose that it makes sense. -- Peter Geoghegan
On Fri, Jun 11, 2021 at 02:46:20PM -0700, Peter Geoghegan wrote:
> On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>> I have read through the patch, and I am surprised to see that this
>> only makes possible to control the optimization at relation level.
>> The origin of the complaints is that this index cleanup optimization
>> has been introduced as a new rule that gets enforced at *system*
>> level, so I think that we should have an equivalent with a GUC to
>> control the behavior for the whole system.
>
> *Why* does it have to work at the system level? I don't understand
> what you mean about the system level.

I mean that you lack a GUC that allows one to enforce *not* using this optimization for all relations, for all processes.

> As Masahiko pointed out, adding a GUC isn't what we've done in other
> similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a
> defensive option that seems similar enough to what we want to add now.
> To give another example, the TRUNCATE VACUUM option (or the related
> reloption) can be used to disable relation truncation, a behavior that
> sometimes causes *big* issues in production. The truncate behavior is
> determined dynamically in most situations -- which is another
> similarity to the optimization we've added here.
> Why is this fundamentally different to those two things?

Because the situation looks completely different to me here. TRUNCATE was conceived as an option to avoid an exclusive lock when truncating the relation file size at the end of VACUUM. More importantly, the default of TRUNCATE is *false*, meaning that we are never going to skip the truncation unless one specifies it at the relation level.

Here, what we have is a decision that is enforced by default, all the time, with the user not knowing about it. If there is a bug or an issue with it, users, based on your proposal, would be forced to change it for each *relation*.
If they miss some of those relations, they may still run into problems without knowing about it. Changing the default behavior while providing no simple way to control it seems like a bad combination to me.

>> With what you are
>> presenting here, one could only disable the optimization for each
>> relation, one-by-one. If this optimization proves to be a problem,
>> it's just going to be harder to users to go through all the relations
>> and re-tune autovacuum. Am I missing something?
>
> Why would you expect autovacuum to run even when the optimization is
> unavailable (e.g. with Postgres 13)? After all, the specifics of when
> the bypass optimization kicks in make it very unlikely that ANALYZE
> will ever be able to notice enough dead tuples to trigger an
> autovacuum (barring antiwraparound and insert-driven autovacuums).
>
> There will probably be very few LP_DEAD items remaining. Occasionally
> there will be somewhat more LP_DEAD items, that happen to be
> concentrated in less than 2% of the table's blocks -- but block-based
> sampling by ANALYZE is likely to miss most of them and underestimate
> the total number. The sampling approach taken by acquire_sample_rows()
> ensures this with larger tables. With small tables the chances of the
> optimization kicking in are very low, unless perhaps fillfactor has
> been tuned very aggressively.
>
> There has never been a guarantee that autovacuum will be triggered
> (and do index vacuuming) in cases that have very few LP_DEAD items, no
> matter how the system has been tuned. The main reason why making the
> optimization behavior controllable is for the VACUUM command.
> Principally for hackers. I can imagine myself using the VACUUM option
> to disable the optimization when I was interested in testing VACUUM or
> space utilization in some specific, narrow way.
>
> Of course it's true that there is still some uncertainty about the
> optimization harming production workloads -- that is to be expected
> with an enhancement like this one. But there is still no actual
> example or test case that shows the optimization doing the wrong
> thing, or anything like it. Anything is possible, but I am not
> expecting there to be even one user complaint about the feature.
> Naturally I don't want to add something as heavyweight as a GUC, given
> all that.

Perhaps. What I am really scared about is that you are assuming that enforcing this decision will be *always* fine. What I am trying to say here is that it *may not* be fine for everybody, and that there should be an easy way to turn it off if that proves to be a problem. I don't quite see how that's an implementation problem; we already have many reloptions that are controlled by GUCs when the reloption has no default. I think that a more careful implementation choice would have been to turn this optimization off by default, while having an option to allow one to turn it on at will.
--
Michael
Attachment
On Mon, Jun 14, 2021 at 5:23 PM Michael Paquier <michael@paquier.xyz> wrote: > > *Why* does it have to work at the system level? I don't understand > > what you mean about the system level. > > I mean that you lack a GUC that allows to enforce to *not* use this > optimization for all relations, for all processes. You've just explained what a GUC is. This is not helpful. > > Why is this fundamentally different to those two things? > > Because the situation looks completely different to me here. TRUNCATE > is thought as a option to be able to avoid an exclusive lock when > truncating the relation file size at the end of VACUUM. More > importantly the default of TRUNCATE is *false*, meaning that we are > never going to skip the truncation unless one specifies it at the > relation level. Maybe it looks different to you because that's not actually true; VACUUM *does* skip the truncation when it feels like it, regardless of the value of the reloption. In general there is no strict guarantee of truncation ever happening -- see lazy_truncate_heap(). Again: Why is this fundamentally different? > Here, what we have is a decision that is enforced to happen by > default, all the time, with the user not knowing about it. If there > is a bug of an issue with it, users, based on your proposal, would be > forced to change it for each *relation*. If they miss some of those > relations, they may still run into problems without knowing about it. > The change of default behavior and having no way to control it in > a simple way look incompatible to me. You've just explained what a reloption is. Again, this is not helping. > Perhaps. What I am really scared about is that you are assuming that > enforcing this decision will be *always* fine. I very clearly and repeatedly said that there was uncertainty about causing issues in rare real world cases. Are you always 100% certain that your code has no bugs before you commit it? Should I now add a GUC for every single feature that I commit? 
You are just asserting that we need to add a GUC, without giving any real reasons -- you're just giving generic reasons that work just as well in most situations. I'm baffled by this.

> What I am trying to
> say here is that it *may not* be fine for everybody, and that there
> should be an easy way to turn it off if that proves to be a problem.

As I said, I think that the reloption is both necessary and sufficient. A GUC is a heavyweight solution that seems quite unnecessary.

> I don't quite see how that's an implementation problem, we have
> already many reloptions that are controlled with GUCs if the
> reloptions have no default.

I never said that there was an implementation problem with a GUC. Just that it was unnecessary, and not consistent with existing practice.

Does anyone else have an opinion on this? Of course I can easily add a GUC. But I won't do so in the absence of any real argument in favor of it.

> I think that a more careful choice implementation would have been to
> turn this optimization off by default, while having an option to allow
> one to turn it on at will.

You have yet to say anything about the implementation.

--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> We need to accept "yes" and "no" too? Currently, the parsing of a
> boolean type reloption accepts those words.

Added those in the attached revision, version 2. This is much closer to being committable than v1 was. I plan on committing this in the next several days. I probably need to polish the documentation some more, though.

> It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> command support AUTO for consistency. Do you have any concerns about
> supporting it?

v2 sorts out the mess with VacOptTernaryValue by just adding a new enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a new reloption-specific enum, StdRdOptIndexCleanup, which is the datatype that we actually use inside the StdRdOptions struct. So we are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the patch.

v2 also adds a new option to vacuumdb, --force-index-cleanup. This seemed to make sense because we already have a --no-index-cleanup option.

> > And does StdRdOptions.vacuum_truncate now need
> > to become a VacOptTernaryValue field too, for consistency with the new
> > definition of StdRdOptions.vacuum_index_cleanup?
>
> We don't have the bypass optimization for heap truncation, unlike
> index vacuuming. So I think we can leave both vacuum_truncate
> reloption and TRUNCATE option as boolean parameters.

Actually, FWIW, we do have a bypass optimization for TRUNCATE -- it too has an always-on dynamic behavior -- so it really is like the index vacuuming thing. In theory it might make sense to have the same "auto, on, off" thing, just like with index vacuuming in the patch. However, I haven't done that in the patch because in practice it's a bad idea. If we offered users the option of truly forcing truncation, then lazy_truncate_heap() could just insist on truncating.
It would have to just wait for an AEL, no matter how long it took. That would probably be dangerous because waiting for an AEL without backing out in VACUUM just isn't a great idea. Thanks -- Peter Geoghegan
Attachment
On Thu, Jun 17, 2021 at 10:54 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > We need to accept "yes" and "no" too? Currently, the parsing of a > > boolean type reloption accepts those words. > > Added those in the attached revision, version 2. This is much closer > to being commitable than v1 was. I plan on committing this in the next > several days. > > I probably need to polish the documentation some more, though. > > > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM > > command support AUTO for consistency. Do you have any concerns about > > supporting it? > > v2 sorts out the mess with VacOptTernaryValue by just adding a new > enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the > enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a > new reloption-specific enum, StdRdOptIndexCleanup, which is the > datatype that we actually use inside the StdRdOptions struct. So we > are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the > patch. > > v2 also adds a new option to vacuumdb, --force-index-cleanup. This > seemed to make sense because we already have a --no-index-cleanup > option. > > > > And does StdRdOptions.vacuum_truncate now need > > > to become a VacOptTernaryValue field too, for consistency with the new > > > definition of StdRdOptions.vacuum_index_cleanup? > > > > We don't have the bypass optimization for heap truncation, unlike > > index vacuuming. So I think we can leave both vacuum_truncate > > reloption and TRUNCATE option as boolean parameters. > > Actually FWIW we do have a bypass optimization for TRUNCATE -- it too > has an always-on dynamic behavior -- so it really is like the index > vacuuming thing. In theory it might make sense to have the same "auto, > on, off" thing, just like with index vacuuming in the patch. However, > I haven't done that in the patch because in practice it's a bad idea. 
> If we offered users the option of truly forcing truncation, then
> lazy_truncate_heap() could just insist on truncating. It would have to
> just wait for an AEL, no matter how long it took. That would probably
> be dangerous because waiting for an AEL without backing out in VACUUM
> just isn't a great idea.

I agree that it doesn't make sense to force heap truncation.

Thank you for updating the patch! Here are comments on the v2 patch:

 typedef enum VacOptTernaryValue
 {
 	VACOPT_TERNARY_DEFAULT = 0,
+	VACOPT_TERNARY_AUTO,
 	VACOPT_TERNARY_DISABLED,
 	VACOPT_TERNARY_ENABLED,
 } VacOptTernaryValue;

VacOptTernaryValue is no longer a ternary value. Can we rename it to something like VacOptValue?

---

+	if (vacopts->force_index_cleanup)
 	{
-		/* INDEX_CLEANUP is supported since v12 */
+		/*
+		 * "INDEX_CLEANUP TRUE" has been supported since v12. Though
+		 * the --force-index-cleanup vacuumdb option was only added in
+		 * v14, it still works in the same way on v12+.
+		 */
 		Assert(serverVersion >= 120000);
+		Assert(!vacopts->no_index_cleanup);
 		appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
 		sep = comma;
 	}

We should specify TRUE instead.

---

The --force-index-cleanup option isn't shown in the help message.

---

I think we should also improve the tab completion for the INDEX_CLEANUP option.
---

 @@ -32,7 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
-    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">enum</replaceable> ]
     PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL <replaceable class="parameter">integer</replaceable>

How about listing the available values of INDEX_CLEANUP here, instead of "enum"? For example, we do a similar thing in the description of the FORMAT option of the EXPLAIN command. It would be easier to perceive all the available values.

---

+  <varlistentry>
+   <term><option>--no-index-cleanup</option></term>
+   <listitem>
+    <para>

It should be --force-index-cleanup.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 17, 2021 at 2:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for updating the patch! Here are comments on v2 patch:

Thanks for the review! Attached is v3, which has all the changes that you suggested (plus the doc stuff from Justin).

I also renamed the "default" VacOptTernaryValue (actually now called VacOptValue) value -- it seems clearer to call this "unspecified", because it represents a state that has nothing to do with the default of the reloption or GUC. Really, it means "the VACUUM command didn't have this specified explicitly" (note that this means that it always starts out "unspecified" in an autovacuum worker). "Unspecified" seems much clearer because it directly expresses "fall back on the reloption, and then fall back on the reloption's default". It is unspecified, but will have to *become* specified later, so that vacuumlazy.c has a truly usable value ("unspecified" is never usable in vacuumlazy.c).

> VacOptTernaryValue is no longer a ternary value. Can we rename it
> something like VacOptValue?

As I said, done that way.

> We should specify TRUE instead.

Ooops. Fixed.

> --force-index-cleanup option isn't shown in the help message.

Fixed.

> I think we also improve the tab completion for INDEX_CLEANUP option.

Fixed.

> How about listing the available values of INDEX_CLEANUP here instead
> of enum? For example, we do a similar thing in the description of
> FORMAT option of EXPLAIN command. It would be easier to perceive all
> available values.

That looks much better. Fixed.

> It should be --force-index-cleanup.

Fixed.

--
Peter Geoghegan
Attachment
On Thu, Jun 17, 2021 at 5:55 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > (Various sgml typos) Fixed in the v3 I just posted. > + removed until index cleanup is completed. This option has no > + effect for tables that do not have an index and is ignored if > + the <literal>FULL</literal> option is used. > > I'd say "tables that have no indexes," That wording wasn't mine (it just happened to be moved around by reformatting), but I think you're right. I went with your suggestion. Thanks for taking a look -- Peter Geoghegan
On Thu, Jun 17, 2021 at 7:26 PM Peter Geoghegan <pg@bowt.ie> wrote: > Thanks for the review! > > Attached is v3, which has all the changes that you suggested (plus the > doc stuff from Justin). Just pushed a version of that with much improved documentation. Thanks again -- Peter Geoghegan
> On Jun 14, 2021, at 7:46 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> Does anyone else have an opinion on this? Of course I can easily add a
> GUC. But I won't do so in the absence of any real argument in favor of
> it.

I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Jun 20, 2021 at 9:22 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.

One clear argument in favor of the VACUUM option (not so much the reloption) is that it enables certain testing scenarios. For example, I was recently using pg_visibility to do a low-level analysis of how visibility map bits were getting set, with a test case that built on the BenchmarkSQL fair-use TPC-C implementation. The optimization was something that I noticed in certain scenarios -- I could have used the option of disabling it at the VACUUM command level, just to get a perfectly clean slate. It caused a small fraction of the pages in the table to not be set all-visible, which would be inconsequential to users but was annoying in the context of this particular test scenario.

The way the optimization works will only ever leave an affected table in a state where the LP_DEAD items left behind would be highly unlikely to be counted by ANALYZE. They would not be counted accurately anyway, either because they're extremely few in number or because there are relatively many that are concentrated in just a few heap blocks -- that's how block-based sampling by ANALYZE works.

In short, even if there really was a performance problem implied by the bypass-indexes optimization, it seems unlikely that autovacuum would run in the first place to take care of it, with or without the optimization, even if autovacuum_vacuum_scale_factor were set very aggressively. VACUUM (really autovacuum) just doesn't tend to work at that level of precision.

--
Peter Geoghegan