Thread: Teaching users how they can get the most out of HOT in Postgres 14
Recent work from commit 5100010e taught VACUUM that it doesn't have to do index vacuuming in cases where there are practically zero (not necessarily exactly zero) tuples to delete from indexes. It also surfaces the information used to decide whether or not index vacuuming is skipped in the logs, via the log_autovacuum_min_duration mechanism. This log output can be used to get a sense of how effective HOT is over time.

There is one number of particular interest: the proportion of heap pages that have one or more LP_DEAD items across successive VACUUMs (expressed as a percentage of the table). The immediate reason to expose this is that it is crucial to the skipping behavior from commit 5100010e -- the threshold for skipping is 2% of all heap pages. But that's certainly not the only reason to pay attention to the percentage. It can also be used to understand HOT in general. It can be correlated with workload spikes and stressors that tend to make HOT less effective. A number of interesting workload-specific patterns seem to emerge by focusing on how this number changes/grows over time.

I think that this should be pointed out directly in the docs. What's more, it seems like a good vehicle for discussing how HOT works in general. Why did we never really get around to documenting HOT? There should at least be some coverage of how DBAs can get the most out of HOT through monitoring and through tuning -- especially by lowering heap fillfactor. It's very hard to get all UPDATEs to use HOT. It's much easier to get UPDATEs to mostly use HOT most of the time. How things change over time seems crucially important.

I'll show one realistic example, just to give some idea of what it might look like.
This is output for 3 successive autovacuums against the largest TPC-C table:

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 4668405 remain, 0 skipped due to pins, 696997 skipped frozen
tuples: 324571 removed, 186702373 remain, 333888 are dead but not yet removable, oldest xmin: 7624965
buffer usage: 3969937 hits, 3931997 misses, 1883255 dirtied
index scan bypassed: 42634 pages from table (0.91% of total) have 324364 dead item identifiers
avg read rate: 62.469 MB/s, avg write rate: 29.920 MB/s
I/O Timings: read=42359.501 write=11867.903
system usage: CPU: user: 43.62 s, system: 38.17 s, elapsed: 491.74 s
WAL usage: 4586766 records, 1850599 full page images, 8499388881 bytes

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 0
pages: 0 removed, 5976391 remain, 0 skipped due to pins, 2516643 skipped frozen
tuples: 759956 removed, 239787517 remain, 1848431 are dead but not yet removable, oldest xmin: 18489326
buffer usage: 3432019 hits, 3385757 misses, 2426571 dirtied
index scan bypassed: 103941 pages from table (1.74% of total) have 790233 dead item identifiers
avg read rate: 50.338 MB/s, avg write rate: 36.077 MB/s
I/O Timings: read=49252.721 write=17003.987
system usage: CPU: user: 45.86 s, system: 34.47 s, elapsed: 525.47 s
WAL usage: 5598040 records, 2274556 full page images, 10510281959 bytes

automatic vacuum of table "postgres.public.bmsql_order_line": index scans: 1
pages: 0 removed, 7483804 remain, 1 skipped due to pins, 4208295 skipped frozen
tuples: 972778 removed, 299295048 remain, 1970910 are dead but not yet removable, oldest xmin: 30987445
buffer usage: 3384994 hits, 4593727 misses, 2891003 dirtied
index scan needed: 174243 pages from table (2.33% of total) had 1325752 dead item identifiers removed
index "bmsql_order_line_pkey": pages: 1250660 in total, 0 newly deleted, 0 currently deleted, 0 reusable
avg read rate: 60.505 MB/s, avg write rate: 38.078 MB/s
I/O Timings: read=72881.986 write=21872.615
system usage: CPU: user: 65.24 s, system: 42.24 s, elapsed: 593.14 s
WAL usage: 6668353 records, 2684040 full page images, 12374536939 bytes

These autovacuums occur every 60-90 minutes with the workload in question (with pretty aggressive autovacuum settings). We see that HOT works rather well here -- but not so well that index vacuuming can be avoided consistently: index vacuuming finally takes place in the third autovacuum (it has "index scans: 1"). There was slow but steady growth in the percentage of LP_DEAD-containing heap pages over time here, which is common enough.

The point of HOT is not to avoid having to do index vacuuming, of course -- that has it backwards. But framing HOT as doing work in backends so autovacuum doesn't have to do similar work later on is a good mental model to encourage users to adopt. There are also significant advantages to reducing the effectiveness of HOT to this one number -- HOT must be working well if it's close to 0%, almost always below 2%, with the occasional aberration that sees it go up to maybe 5%. But if it ever goes too high (in the absence of DELETEs), you might have trouble on your hands. It might not go down again.

There are other interesting patterns from other tables within the same database -- including tables with no UPDATEs, and tables that generally cannot use HOT due to a need to modify indexed columns. The particulars with these other tables hint at problems with heap fragmentation, which users can think of as a degenerative process -- something that gets progressively worse in extreme cases (i.e. cases where it matters).

This new percentage metric isn't about HOT per se. It's actually about the broader question of how effective the system is at keeping the physical location of each logical row stable over time, for a given workload. So maybe that's what any new documentation should address. The documentation would still have plenty to say about HOT, though.
It would also have something to say about bottom-up index deletion, which can be thought of as avoiding problems when HOT doesn't or can't be applied very often. -- Peter Geoghegan
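As a rough sketch of the kind of monitoring such documentation might recommend, the share of UPDATEs that used HOT is already visible in the cumulative statistics system (the columns below are standard pg_stat_user_tables columns; what counts as an "interesting" ratio is workload-dependent, so treat the query as a starting point rather than a recommendation from this thread):

```sql
-- Per-table HOT update ratio. A persistently low hot_update_pct on an
-- UPDATE-heavy table is a hint that lowering heap fillfactor (or avoiding
-- updates of indexed columns) may make HOT more effective.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 2) AS hot_update_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;
```

Combined with the new "dead item identifiers" percentage in the autovacuum log output, this gives a before-and-after view of how well HOT keeps rows in place.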
Hi,

On 2021-04-12 16:11:59 -0700, Peter Geoghegan wrote:
> Recent work from commit 5100010e taught VACUUM that it doesn't have to
> do index vacuuming in cases where there are practically zero (not
> necessarily exactly zero) tuples to delete from indexes.

FWIW, I'd not at all be surprised if this causes some issues. Consider cases where all index lookups are via bitmap scans (which do not support killtuples) -- if value ranges are looked up often, the repeated heap fetches can absolutely kill query performance. I've definitely had to make autovacuum more aggressive for cases like this or schedule manual vacuums, and now that's silently not good enough anymore. Yes, 2% of the table isn't all that much, but often enough all the updates and lookups concentrate in one value range.

As far as I can see there's no reasonable way to disable this "optimization", which scares me.

Greetings,

Andres Freund
On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
> As far as I can see there's no reasonable way to disable this
> "optimization", which scares me.

I'm fine with adding a simple 'off' switch. What I'd like to avoid doing is making the behavior tunable, since it's likely to change in Postgres 15 and Postgres 16 anyway.

--
Peter Geoghegan
On Mon, Apr 12, 2021 at 04:35:13PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 12, 2021 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
> > As far as I can see there's no reasonable way to disable this
> > "optimization", which scares me.
>
> I'm fine with adding a simple 'off' switch. What I'd like to avoid
> doing is making the behavior tunable, since it's likely to change in
> Postgres 15 and Postgres 16 anyway.

While going through this commit a couple of days ago, I really got to wonder why you are controlling this stuff with a hardcoded value and I found that scary, while what you should be using are two GUCs with the reloptions that come with the feature (?):
- A threshold, as an integer, to define a number of pages.
- A scale factor to define a percentage of pages.

Also, I am a bit confused with the choice of BYPASS_THRESHOLD_PAGES as parameter name. For all the other parameters of autovacuum, we use "threshold" for a fixed number of items, not a percentage of a given item.

--
Michael
On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
> While going through this commit a couple of days ago, I really got to
> wonder why you are controlling this stuff with a hardcoded value and I
> found that scary, while what you should be using are two GUCs with the
> reloptions that come with the feature (?):
> - A threshold, as an integer, to define a number of pages.
> - A scale factor to define a percentage of pages.

Why?

--
Peter Geoghegan
Hi,

On 2021-04-12 16:53:47 -0700, Peter Geoghegan wrote:
> On Mon, Apr 12, 2021 at 4:52 PM Michael Paquier <michael@paquier.xyz> wrote:
> > While going through this commit a couple of days ago, I really got to
> > wonder why you are controlling this stuff with a hardcoded value and I
> > found that scary, while what you should be using are two GUCs with the
> > reloptions that come with the feature (?):
> > - A threshold, as an integer, to define a number of pages.
> > - A scale factor to define a percentage of pages.
>
> Why?

Well, one argument is that you made a fairly significant behavioural change, with hard-coded logic for when the optimization kicks in. It's not at all clear that your constants are the right ones for every workload. We'll likely only get to know whether they're right in > 1 year -- not having a real out at that point imo is somewhat scary.

That said, adding more and more reloptions has a significant cost, so I don't think it's clear cut that it's the right decision to add one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for BYPASS_THRESHOLD_PAGES?

Greetings,

Andres Freund
On Mon, Apr 12, 2021 at 5:37 PM Andres Freund <andres@anarazel.de> wrote:
> Well, one argument is that you made a fairly significant behavioural
> change, with hard-coded logic for when the optimization kicks in. It's
> not at all clear that your constants are the right ones for every
> workload.

(Apparently nobody wants to talk about HOT and the documentation.)

The BYPASS_THRESHOLD_PAGES cutoff was chosen conservatively, so that it would avoid index vacuuming in truly marginal cases -- and it tends to only delay it there. A table-level threshold is not the best way of constraining the problem. In the future, the table threshold should be treated as only one factor among several. Plus there will be more than a simple yes/no question to consider. We should eventually be able to do index vacuuming for some indexes but not others.

Bottom-up index deletion has totally changed things here, because roughly speaking it makes index bloat proportionate to the number of logical changes to indexed columns -- you could have one super-bloated index on the table, but several others that perfectly retain their original size. You still need to do heap vacuuming eventually, which necessitates vacuuming indexes too, but the right strategy is probably to vacuum much more frequently, vacuuming the bloated index each time. You only do a full round of index vacuuming when the table starts to accumulate way too many LP_DEAD items. You need a much more sophisticated model for this. It might also need to hook into autovacuum scheduling.

One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that it'll work well for some indexes but not others. To a dramatic degree, even. That said, nbtree isn't the only index AM, and it is hard to be completely sure that you've caught everything. So an off switch seems like a good idea now.

> We'll likely only get to know whether they're right in > 1 year
> - not having a real out at that point imo is somewhat scary.
>
> That said, adding more and more reloptions has a significant cost, so I
> don't think it's clear cut that it's the right decision to add
> one. Perhaps vacuum_cleanup_index_scale_factor should just be reused for
> BYPASS_THRESHOLD_PAGES?

I think that the right way to do this is to generalize INDEX_CLEANUP to support a mode of operation that disallows vacuumlazy.c from applying this optimization, as well as any similar optimizations which will be added in the future. Even if you don't buy my argument about directly parameterizing BYPASS_THRESHOLD_PAGES undermining future work, allowing it to be set much higher than 5% - 10% would be a pretty big footgun. It might appear to help at first, but risks destabilizing things much later on.

--
Peter Geoghegan
On Mon, Apr 12, 2021 at 06:12:18PM -0700, Peter Geoghegan wrote:
> One of the dangers of high BYPASS_THRESHOLD_PAGES settings is that
> it'll work well for some indexes but not others. To a dramatic degree,
> even.
>
> That said, nbtree isn't the only index AM, and it is hard to be
> completely sure that you've caught everything. So an off switch seems
> like a good idea now.

Whatever the solution chosen, the thing I can see we agree on here is that we need to do something, at least in the shape of an on/off switch to have an escape path in case of problems. Peter, could we get something by beta1 for that? FWIW, I would use a float GUC to control that, and not a boolean switch, but I am just one voice here, and that's not a feature I worked on.

--
Michael
On Tue, May 11, 2021 at 04:42:27PM +0900, Michael Paquier wrote:
> Whatever the solution chosen, the thing I can see we agree on here is
> that we need to do something, at least in the shape of an on/off
> switch to have an escape path in case of problems. Peter, could we
> get something by beta1 for that? FWIW, I would use a float GUC to
> control that, and not a boolean switch, but I am just one voice here,
> and that's not a feature I worked on.

So, I have been thinking more about this item, and a boolean switch still sounded weird to me, so attached is a patch to add two GUCs -- one for manual VACUUM and one for autovacuum, like any other such parameters -- to control this behavior, with a default set at 2% of the number of relation pages with dead items needed to do the index cleanup work.

Even if we switch the parameter type used here, the easiest and most consistent way to pass down the parameter is just to use VacuumParams set within ExecVacuum() and the autovacuum code path. The docs need more work, I guess.

Thoughts?

--
Michael
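For readers following along, the proposal appears to amount to something like the following postgresql.conf fragment (the GUC names are taken from the review downthread; the -1 fallback semantics are my reading of the quoted code, and this patch was not committed in this form):

```
# Sketch of the proposed settings -- names and defaults are inferred,
# not authoritative:
vacuum_cleanup_index_scale_factor = 0.02   # manual VACUUM
autovacuum_cleanup_index_scale = -1        # -1: fall back to the VACUUM setting
```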
On Thu, May 13, 2021 at 04:27:47PM +0900, Michael Paquier wrote:
> So, I have been thinking more about this item, and a boolean switch
> still sounded weird to me, so attached is a patch to have two GUCs,
> one for manual VACUUM and autovacuum like any other parameters, to
> control this behavior, with a default set at 2% of the number of
> relation pages with dead items needed to do the index cleanup work.
>
> Even if we switch the parameter type used here, the easiest and most
> consistent way to pass down the parameter is just to use VacuumParams
> set within ExecVacuum() and the autovacuum code path. The docs need
> more work, I guess.
>
> Thoughts?

> + cleanup_index_scale_factor = autovacuum_cleanup_index_scale >= 0 ?
> +     autovacuum_cleanup_index_scale : VacuumCostDelay;

CostDelay is surely not what you meant.

> + <title>Vacuum parameters for Indexes</title>
> + <para>
> +  During the execution of <xref linkend="sql-vacuum"/>
> +  and <xref linkend="sql-analyze"/>

"and analyze" is wrong?

> + This parameter can only be set in the <filename>postgresql.conf</filename>
> + file or on the server command line.

It's SIGHUP

> + This parameter can only be set in the <filename>postgresql.conf</filename>
> + file or on the server command line.

Same

+ {
+     {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, VACUUM_INDEX,
+         gettext_noop("Fraction of relation pages, with at least one dead item, required to clean up indexes."),
+         NULL
+     },
+     &VacuumCleanupIndexScale,
+     0.02, 0.0, 0.05,
+     NULL, NULL, NULL
+ },

Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0? The old GUC of the same name had max 1e10. I think a reduced range and a redefinition of the GUC would need to be called out as an incompatibility.

Also, the old GUC (removed at 9f3665fbf) had:

- {"vacuum_cleanup_index_scale_factor", PGC_USERSET, CLIENT_CONN_STATEMENT,

I think USERSET and STATEMENT were right? Alternately, what if this were in the DEVELOPER category, which makes it easier to remove in v15.

--
Justin
On Thu, May 13, 2021 at 5:06 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Why is the allowed range from 0 to 0.05? Why not 0.10 or 1.0?
> The old GUC of the same name had max 1e10.

It also had a completely different purpose and default.

> I think a reduced range and a redefinition of the GUC would need to be called
> out as an incompatibility.

The justification from Michael for this approach is that not having this level of control would be weird, at least to him. But that justification itself seems weird to me; why start from the premise that you need a knob (as opposed to an off switch) at all? Why not start with the way the mechanism works (or is intended to work) in practice?

Most individual tables will *never* have VACUUM apply the optimization with *any* reasonable threshold value, so we only need to consider the subset of tables/workloads where it *might* make sense to skip index vacuuming. This is more qualitative than quantitative. It makes zero sense to treat the threshold as a universal scale -- this is one reason why I don't want to expose a true tunable knob to users. Though the threshold-driven/BYPASS_THRESHOLD_PAGES design is not exactly something with stable behavior for a given table, it almost works like that in practice: tables tend to usually skip index vacuuming, or never skip it even once. There is a clear bifurcation along this line when you view how VACUUM behaves with a variety of different tables using the new autovacuum logging stuff.

Almost all of the benefit of the optimization is available with the current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have LP_DEAD items), which has less risk than a higher threshold. I don't think it matters much if we have the occasional unnecessary round of index vacuuming on account of not applying the optimization. The truly important benefit of the optimization is to not do unnecessary index vacuuming all the time in the extreme cases where it's really hard to justify. There is currently zero evidence that anything higher than 2% will ever help anybody to an appreciable degree. (There is also no evidence that the optimization will ever need to be disabled, but I accept the need to be conservative and offer an off switch -- the precautionary principle applies when talking about new harms.)

Not having to scan every index on every VACUUM, but only every 5th or so VACUUM, is a huge improvement. But going from every 5th VACUUM to every 10th VACUUM? That's at best a tiny additional improvement in exchange for what I'd guess is a roughly linear increase in risk (maybe a greater-than-linear increase, even). That's an awful deal.

--
Peter Geoghegan
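To make the scale of the threshold concrete, here is a small sketch (in Python, not the actual C code in vacuumlazy.c) of the headline 2% test as described in this thread, fed with the page counts from the three autovacuum log excerpts upthread. The real code applies further conditions before bypassing, so treat this purely as illustration:

```python
# Skip index vacuuming when fewer than 2% of heap pages contain one or
# more LP_DEAD items. Constant name mirrors the thread's discussion.
BYPASS_THRESHOLD_PAGES = 0.02  # fraction of rel_pages

def bypass_index_vacuum(rel_pages: int, lpdead_item_pages: int) -> bool:
    """True when VACUUM may skip index vacuuming entirely."""
    return lpdead_item_pages < rel_pages * BYPASS_THRESHOLD_PAGES

# Page counts from the three autovacuum log excerpts upthread:
print(bypass_index_vacuum(4668405, 42634))    # 0.91% of pages -> True
print(bypass_index_vacuum(5976391, 103941))   # 1.74% of pages -> True
print(bypass_index_vacuum(7483804, 174243))   # 2.33% of pages -> False
```

This matches the logs: the first two autovacuums report "index scan bypassed", the third reports "index scan needed".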
On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
> Almost all of the benefit of the optimization is available with the
> current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
> LP_DEAD items), which has less risk than a higher threshold. I don't
> think it matters much if we have the occasional unnecessary round of
> index vacuuming on account of not applying the optimization. The truly
> important benefit of the optimization is to not do unnecessary index
> vacuuming all the time in the extreme cases where it's really hard to
> justify. There is currently zero evidence that anything higher than 2%
> will ever help anybody to an appreciable degree. (There is also no
> evidence that the optimization will ever need to be disabled, but I
> accept the need to be conservative and offer an off switch -- the
> precautionary principle applies when talking about new harms.)
>
> Not having to scan every index on every VACUUM, but only every 5th or
> so VACUUM, is a huge improvement. But going from every 5th VACUUM to
> every 10th VACUUM? That's at best a tiny additional improvement in
> exchange for what I'd guess is a roughly linear increase in risk
> (maybe a greater-than-linear increase, even). That's an awful deal.

Perhaps that's an awful deal, but based on which facts can you really say that this new behavior of needing at least 2% of relation pages with some dead items to clean up indexes is not a worse deal in some cases? This may cause more problems for the in-core index AMs, as much as it could impact any out-of-core index AM, no? What about other values, like 1%, or even 5%? My guess is that there would be an ask to have more control on that, though that stands as my opinion.

Saying that, as long as there is a way to disable that for the users with autovacuum and manual vacuums, I'd be fine. It is worth noting that adding a GUC to control this optimization would make the code more confusing, as there is already do_index_cleanup, a vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to TRUE may cause the index cleanup to not actually kick in if the 2% bar is not reached.

--
Michael
On Thu, May 13, 2021 at 7:14 PM Michael Paquier <michael@paquier.xyz> wrote:
> Perhaps that's an awful deal, but based on which facts can you really
> say that this new behavior of needing at least 2% of relation pages
> with some dead items to clean up indexes is not a worse deal in some
> cases?

If I thought that it simply wasn't possible then I wouldn't have accepted the need to make it possible to disable. This is a cost/benefit decision problem, which must be made based on imperfect information -- there are no absolute certainties. But I'm certain about one thing: there is a large practical difference between the optimization causing terrible performance in certain scenarios and the optimization causing slightly suboptimal performance in certain scenarios. A tiny risk of the former scenario is *much* worse than a relatively large risk of the latter scenario. There needs to be a sense of proportion about risk.

> This may cause more problems for the in-core index AMs, as
> much as it could impact any out-of-core index AM, no?

I don't understand what you mean here.

> What about
> other values like 1%, or even 5%? My guess is that there would be an
> ask to have more control on that, though that stands as my opinion.

How did you arrive at that guess? Why do you believe that? This is the second time I've asked.

> Saying that, as long as there is a way to disable that for the users
> with autovacuum and manual vacuums, I'd be fine. It is worth noting
> that adding a GUC to control this optimization would make the code
> more confusing, as there is already do_index_cleanup, a
> vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to
> TRUE may cause the index cleanup to not actually kick in if the 2% bar
> is not reached.

I don't intend to add a GUC. A reloption should suffice.

Your interpretation of what specifying vacuum_index_cleanup (the VACUUM command option) represents doesn't seem particularly justified to me. To me it just means "index cleanup and vacuuming are not explicitly disabled, the default behavior". It's an option largely intended for emergencies, and largely superseded by the failsafe mechanism. This interpretation is justified by well established precedent: it has long been possible for VACUUM to skip heap page pruning and even heap page vacuuming just because a super-exclusive lock could not be acquired (though the latter case no longer happens due to the same work inside vacuumlazy.c) -- which also implies skipping some index vacuuming, without it ever being apparent to the user.

--
Peter Geoghegan
(I had missed this discussion due to the mismatched thread subject..)

On Fri, May 14, 2021 at 11:14 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, May 13, 2021 at 01:27:44PM -0700, Peter Geoghegan wrote:
> > Almost all of the benefit of the optimization is available with the
> > current BYPASS_THRESHOLD_PAGES threshold (2% of heap pages have
> > LP_DEAD items), which has less risk than a higher threshold. I don't
> > think it matters much if we have the occasional unnecessary round of
> > index vacuuming on account of not applying the optimization. The truly
> > important benefit of the optimization is to not do unnecessary index
> > vacuuming all the time in the extreme cases where it's really hard to
> > justify. There is currently zero evidence that anything higher than 2%
> > will ever help anybody to an appreciable degree. (There is also no
> > evidence that the optimization will ever need to be disabled, but I
> > accept the need to be conservative and offer an off switch -- the
> > precautionary principle applies when talking about new harms.)
> >
> > Not having to scan every index on every VACUUM, but only every 5th or
> > so VACUUM, is a huge improvement. But going from every 5th VACUUM to
> > every 10th VACUUM? That's at best a tiny additional improvement in
> > exchange for what I'd guess is a roughly linear increase in risk
> > (maybe a greater-than-linear increase, even). That's an awful deal.
>
> Perhaps that's an awful deal, but based on which facts can you really
> say that this new behavior of needing at least 2% of relation pages
> with some dead items to clean up indexes is not a worse deal in some
> cases? This may cause more problems for the in-core index AMs, as
> much as it could impact any out-of-core index AM, no? What about
> other values like 1%, or even 5%? My guess is that there would be an
> ask to have more control on that, though that stands as my opinion.

I'm concerned about how users can tune a scale-type parameter that is configurable between 0.0 and 0.05. I think that users basically don't pay attention to how many blocks are updated by UPDATE/DELETE. Unlike the old vacuum_cleanup_index_scale_factor, increasing this parameter would directly affect index bloat. If the user can accept more index bloat to speed up (auto)vacuum, they can use vacuum_index_cleanup instead.

I prefer to have an on/off switch just in case. I remember I also commented the same thing before. We’ve discussed a way to control whether or not to enable the skipping optimization by adding a new mode to the INDEX_CLEANUP option, as Peter mentioned. For example, we can make the new “auto” (or “smart”) mode the default, enabling all skipping optimizations, with “on” disabling them. Or we can add a “force” mode to disable all skipping optimizations while leaving the existing modes as they are. Anyway, I think it’s not a good idea to add a new GUC parameter that we’re not sure how to tune.

IIUC skipping index vacuuming when less than 2% of relation pages have at least one LP_DEAD item is a table optimization rather than a btree index optimization. Since we’re not likely to set many pages all-visible or collect many dead tuples in that case, we can skip index vacuuming and table vacuuming. Fortunately, this case goes well together with btree indexes’ bottom-up deletion. If this skipping behavior badly affects other index AMs, this optimization should be considered within btree indexes, although we will need a way for index AMs to weigh in on the vacuum strategy. But I guess this works well in most cases, so having an on/off switch suffices.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I prefer to have an on/off switch just in case. I remember I also
> commented the same thing before. We’ve discussed a way to control
> whether or not to enable the skipping optimization by adding a new
> mode to the INDEX_CLEANUP option, as Peter mentioned. For example, we
> can make the new “auto” (or “smart”) mode the default, enabling all
> skipping optimizations, with “on” disabling them. Or we can add a
> “force” mode to disable all skipping optimizations while leaving
> the existing modes as they are. Anyway, I think it’s not a good idea
> to add a new GUC parameter that we’re not sure how to tune.
>
> IIUC skipping index vacuuming when less than 2% of relation pages have
> at least one LP_DEAD item is a table optimization rather than a btree
> index optimization.

Right. There *is* an excellent way to tune this behavior: by adjusting heap fillfactor to make HOT more effective. That was why I started this thread!

If you leave heap fillfactor at the default of 100, and have lots of updates (that don't modify indexed columns) and no deletes, then you're almost certainly not going to have VACUUM skip indexes anyway -- in practice you're bound to exceed having 2% of pages with an LP_DEAD item before very long. Tuning heap fillfactor is practically essential to see a real benefit, regardless of the exact BYPASS_THRESHOLD_PAGES. (There may be some rare exceptions, but for the most part this mechanism helps with tables that get many updates that are expected to use HOT, and will use HOT barring a tiny number of cases where the new tuple won't quite fit, etc.)

The idea of tuning the behavior directly (e.g. with a reloption that lets the user specify a BYPASS_THRESHOLD_PAGES style threshold) is exactly backwards. The point for the user should not be to skip indexes during VACUUM. The point for the user is to get rid of lots of non-HOT updates, in order to *avoid heap fragmentation*, guided by the new autovacuum instrumentation. That also means that there will be much less index vacuuming. But that's a pretty minor side-benefit. Why should the user *expect* largely unnecessary index vacuuming to take place?

To put it another way, the index bypass mechanism added to vacuumlazy.c was not intended to add a new good behavior. It was actually intended to subtract an old bad behavior. The patch is mostly useful because it allows the user to make VACUUM *more* aggressive with freezing and VM bit setting (not less aggressive with indexes). The BYPASS_THRESHOLD_PAGES threshold of 0.02 is a little arbitrary -- but only a little.

> Since we’re not likely to set many pages
> all-visible or collect many dead tuples in that case, we can skip
> index vacuuming and table vacuuming. Fortunately, this case goes
> well together with btree indexes’ bottom-up deletion.

It's true that bottom-up index deletion provides additional insurance against problems, but I don't think that that insurance is strictly necessary. It's nice to have insurance, though.

> If this skipping
> behavior badly affects other index AMs, this optimization should be
> considered within btree indexes, although we will need a way for index
> AMs to weigh in on the vacuum strategy. But I guess this works
> well in most cases, so having an on/off switch suffices.

Right. I doubt that it will actually turn out to be necessary to have such a switch. But I try to be modest when it comes to predicting what will be important to some user workload -- it's way too complicated to have total confidence about something like that. It is a risk to be managed.

--
Peter Geoghegan
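For completeness, the fillfactor tuning discussed above looks something like this in practice (the table name is reused from the TPC-C example upthread, and 90 is only an illustrative value -- the right setting depends on row width and update rate, and this thread deliberately avoids prescribing one):

```sql
-- Leave free space on each heap page so that updated row versions can
-- stay on the same page, keeping UPDATEs eligible for HOT.
ALTER TABLE bmsql_order_line SET (fillfactor = 90);

-- Note: this only affects newly written pages; a table rewrite (e.g.
-- VACUUM FULL, or an online tool such as pg_repack) is needed for the
-- setting to apply to existing pages.
```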
On Wed, May 19, 2021 at 6:09 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 18, 2021 at 7:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > If this skipping
> > behavior badly affects other index AMs, this optimization should be
> > considered within btree indexes, although we will need a way for index
> > AMs to weigh in on the vacuum strategy. But I guess this works
> > well in most cases, so having an on/off switch suffices.
>
> Right. I doubt that it will actually turn out to be necessary to have
> such a switch. But I try to be modest when it comes to predicting what
> will be important to some user workload -- it's way too complicated to
> have total confidence about something like that. It is a risk to be
> managed.

I think the possible side effect of this hard-coded BYPASS_THRESHOLD_PAGES is that, by default, bulkdelete is not called for a long time and the index becomes bloated. IOW, we will force users to carry index bloat corresponding to 2% of table pages. The bloat could be serious depending on the index tuple size (e.g., an index including many columns). The user may have been running autovacuum aggressively on that table to prevent index bloat, but that is no longer possible and there is no choice. So I think that for those (relatively) rare use cases, it's good to provide a way to somehow control it. Fortunately, an on/off switch is likely to be useful for controlling other optimizations that could be added in the future.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I think the possible side effect of this hard-coded
> BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not
> called for a long term and the index becomes bloat.

What do you think of the approach taken in the attached POC patch?

The patch makes it possible to disable the optimization by generalizing the INDEX_CLEANUP reloption to be an enum that looks like a ternary boolean (not just a plain boolean). INDEX_CLEANUP now accepts the values 'auto', 'on', and 'off' (plus a variety of alternative spellings, the usual ones for booleans in Postgres). Now 'auto' is the default, and 'on' forces the previous behavior inside vacuumlazy.c. It does not disable the failsafe, though -- INDEX_CLEANUP remains a fairly mechanical thing.

This approach seems good to me because it remains consistent with the original purpose and design of INDEX_CLEANUP -- that was always an option that forced VACUUM to do something special with indexes. I don't see much downside to this approach, either. As things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so we don't really need to talk about wraparound emergencies in the docs for INDEX_CLEANUP anymore. This seems much more elegant than either repurposing/reviving cleanup_index_scale_factor (which makes no sense to me at all) or inventing a new reloption (which would itself be in tension with INDEX_CLEANUP).

There are some practical issues that make this patch surprisingly complicated for such a simple problem. For example, I hope that I haven't missed any subtlety in generalizing a boolean reloption like this. We've done similar things with GUCs in the past, but this may be a little different.

Another concern with this approach is what it means for the VACUUM command itself. I haven't added an 'auto' spelling that is accepted by the VACUUM command in this POC version. But do I need to at all?
Can that just be implied by not having any INDEX_CLEANUP option? And does StdRdOptions.vacuum_truncate now need to become a VacOptTernaryValue field too, for consistency with the new definition of StdRdOptions.vacuum_index_cleanup? -- Peter Geoghegan
Attachment
On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Sun, May 23, 2021 at 11:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I think the possible side effect of this hard-coded > > BYPASS_THRESHOLD_PAGES would be that by default, bulkdelete is not > > called for a long term and the index becomes bloat. > > What do you think of the approach taken in the attached POC patch? > > The patch makes it possible to disable the optimization by > generalizing the INDEX_CLEANUP reloption to be an enum that looks like > a trinay boolean (not just a plain boolean). INDEX_CLEANUP now accepts > the values 'auto', 'on', and 'off' (plus a variety of alternative > spellings, the usual ones for booleans in Postgres). Now 'auto' is the > default, and 'on' forces the previous behavior inside vacuumlazy.c. It > does not disable the failsafe, though -- INDEX_CLEANUP remains a > fairly mechanical thing. +1 > > This approach seems good to me because INDEX_CLEANUP remains > consistent with the original purpose and design of INDEX_CLEANUP -- > that was always an option that forced VACUUM to do something special > with indexes. I don't see much downside to this approach, either. As > things stand, INDEX_CLEANUP is mostly superseded by the failsafe, so > we don't really need to talk about wraparound emergencies in the docs > for INDEX_CLEANUP anymore. This seems much more elegant than either > repurposing/reviving cleanup_index_scale_factor (which makes no sense > to me at all) or inventing a new reloption (which would itself be in > tension with INDEX_CLEANUP). +1 > > There are some practical issues that make this patch surprisingly > complicated for such a simple problem. For example, I hope that I > haven't missed any subtlety in generalizing a boolean reloption like > this. We've done similar things with GUCs in the past, but this may be > a little different. 
+/* values from HeapOptIndexCleanupMode */
+relopt_enum_elt_def HeapOptIndexCleanupOptValues[] =
+{
+	{"auto", VACOPT_TERNARY_DEFAULT},
+	{"on", VACOPT_TERNARY_ENABLED},
+	{"off", VACOPT_TERNARY_DISABLED},
+	{"true", VACOPT_TERNARY_ENABLED},
+	{"false", VACOPT_TERNARY_DISABLED},
+	{"1", VACOPT_TERNARY_ENABLED},
+	{"0", VACOPT_TERNARY_DISABLED},
+	{(const char *) NULL}		/* list terminator */
+};

We need to accept "yes" and "no" too? Currently, the parsing of a boolean type reloption accepts those words.

> Another concern with this approach is what it
> means for the VACUUM command itself. I haven't added an 'auto'
> spelling that is accepted by the VACUUM command in this POC version.
> But do I need to at all? Can that just be implied by not having any
> INDEX_CLEANUP option?

It seems to me that it's better to have INDEX_CLEANUP option of VACUUM command support AUTO for consistency. Do you have any concerns about supporting it?

> And does StdRdOptions.vacuum_truncate now need
> to become a VacOptTernaryValue field too, for consistency with the new
> definition of StdRdOptions.vacuum_index_cleanup?

We don't have the bypass optimization for heap truncation, unlike index vacuuming. So I think we can leave both vacuum_truncate reloption and TRUNCATE option as boolean parameters.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
> On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
>> Another concern with this approach is what it
>> means for the VACUUM command itself. I haven't added an 'auto'
>> spelling that is accepted by the VACUUM command in this POC version.
>> But do I need to at all? Can that just be implied by not having any
>> INDEX_CLEANUP option?
>
> It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> command support AUTO for consistency. Do you have any concerns about
> supporting it?

I have read through the patch, and I am surprised to see that this only makes it possible to control the optimization at the relation level. The origin of the complaints is that this index cleanup optimization has been introduced as a new rule that gets enforced at the *system* level, so I think that we should have an equivalent GUC to control the behavior for the whole system. With what you are presenting here, one could only disable the optimization for each relation, one-by-one. If this optimization proves to be a problem, it's just going to be harder for users to go through all the relations and re-tune autovacuum. Am I missing something?
--
Michael
Attachment
On Fri, Jun 4, 2021 at 3:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, May 31, 2021 at 10:30:08AM +0900, Masahiko Sawada wrote:
> > On Fri, May 28, 2021 at 9:53 AM Peter Geoghegan <pg@bowt.ie> wrote:
> >> Another concern with this approach is what it
> >> means for the VACUUM command itself. I haven't added an 'auto'
> >> spelling that is accepted by the VACUUM command in this POC version.
> >> But do I need to at all? Can that just be implied by not having any
> >> INDEX_CLEANUP option?
> >
> > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> > command support AUTO for consistency. Do you have any concerns about
> > supporting it?
>
> I have read through the patch, and I am surprised to see that this
> only makes possible to control the optimization at relation level.
> The origin of the complaints is that this index cleanup optimization
> has been introduced as a new rule that gets enforced at *system*
> level, so I think that we should have an equivalent with a GUC to
> control the behavior for the whole system. With what you are
> presenting here, one could only disable the optimization for each
> relation, one-by-one. If this optimization proves to be a problem,
> it's just going to be harder to users to go through all the relations
> and re-tune autovacuum. Am I missing something?

I had not thought about disabling that optimization at the system level. I think we can use this option for particular tables whose indexes have become unexpectedly bloated due to this optimization. Similarly, we have the DISABLE_PAGE_SKIPPING option but don’t have a way to disable lazy vacuum’s page skipping behavior at the system level.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote: > I have read through the patch, and I am surprised to see that this > only makes possible to control the optimization at relation level. > The origin of the complaints is that this index cleanup optimization > has been introduced as a new rule that gets enforced at *system* > level, so I think that we should have an equivalent with a GUC to > control the behavior for the whole system. *Why* does it have to work at the system level? I don't understand what you mean about the system level. As Masahiko pointed out, adding a GUC isn't what we've done in other similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a defensive option that seems similar enough to what we want to add now. To give another example, the TRUNCATE VACUUM option (or the related reloption) can be used to disable relation truncation, a behavior that sometimes causes *big* issues in production. The truncate behavior is determined dynamically in most situations -- which is another similarity to the optimization we've added here. Why is this fundamentally different to those two things? > With what you are > presenting here, one could only disable the optimization for each > relation, one-by-one. If this optimization proves to be a problem, > it's just going to be harder to users to go through all the relations > and re-tune autovacuum. Am I missing something? Why would you expect autovacuum to run even when the optimization is unavailable (e.g. with Postgres 13)? After all, the specifics of when the bypass optimization kicks in make it very unlikely that ANALYZE will ever be able to notice enough dead tuples to trigger an autovacuum (barring antiwraparound and insert-driven autovacuums). There will probably be very few LP_DEAD items remaining. 
Occasionally there will be somewhat more LP_DEAD items that happen to be concentrated in less than 2% of the table's blocks -- but block-based sampling by ANALYZE is likely to miss most of them and underestimate the total number. The sampling approach taken by acquire_sample_rows() ensures this with larger tables. With small tables the chances of the optimization kicking in are very low, unless perhaps fillfactor has been tuned very aggressively.

There has never been a guarantee that autovacuum will be triggered (and do index vacuuming) in cases that have very few LP_DEAD items, no matter how the system has been tuned. The main reason for making the optimization behavior controllable is the VACUUM command -- principally for hackers. I can imagine myself using the VACUUM option to disable the optimization when I'm interested in testing VACUUM or space utilization in some specific, narrow way.

Of course it's true that there is still some uncertainty about the optimization harming production workloads -- that is to be expected with an enhancement like this one. But there is still no actual example or test case that shows the optimization doing the wrong thing, or anything like it. Anything is possible, but I am not expecting there to be even one user complaint about the feature. Naturally I don't want to add something as heavyweight as a GUC, given all that.

--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Another concern with this approach is what it > > means for the VACUUM command itself. I haven't added an 'auto' > > spelling that is accepted by the VACUUM command in this POC version. > > But do I need to at all? Can that just be implied by not having any > > INDEX_CLEANUP option? > > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM > command support AUTO for consistency. Do you have any concerns about > supporting it? I suppose we should have it. But now we have to teach vacuumdb about this new boolean-like enum too. It's a lot more new code than I would have preferred, but I suppose that it makes sense. -- Peter Geoghegan
On Fri, Jun 11, 2021 at 02:46:20PM -0700, Peter Geoghegan wrote:
> On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>> I have read through the patch, and I am surprised to see that this
>> only makes possible to control the optimization at relation level.
>> The origin of the complaints is that this index cleanup optimization
>> has been introduced as a new rule that gets enforced at *system*
>> level, so I think that we should have an equivalent with a GUC to
>> control the behavior for the whole system.
>
> *Why* does it have to work at the system level? I don't understand
> what you mean about the system level.

I mean that you lack a GUC that allows one to enforce *not* using this optimization for all relations, for all processes.

> As Masahiko pointed out, adding a GUC isn't what we've done in other
> similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a
> defensive option that seems similar enough to what we want to add now.
> To give another example, the TRUNCATE VACUUM option (or the related
> reloption) can be used to disable relation truncation, a behavior that
> sometimes causes *big* issues in production. The truncate behavior is
> determined dynamically in most situations -- which is another
> similarity to the optimization we've added here.
> Why is this fundamentally different to those two things?

Because the situation looks completely different to me here. TRUNCATE was conceived as an option to avoid an exclusive lock when truncating the relation file size at the end of VACUUM. More importantly, the default of TRUNCATE is *false*, meaning that we are never going to skip the truncation unless one specifies it at the relation level.

Here, what we have is a decision that is enforced by default, all the time, with the user not knowing about it. If there is a bug or an issue with it, users, based on your proposal, would be forced to change it for each *relation*.
If they miss some of those relations, they may still run into problems without knowing about it. Changing the default behavior while providing no simple way to control it seems like a bad combination to me.

>> With what you are
>> presenting here, one could only disable the optimization for each
>> relation, one-by-one. If this optimization proves to be a problem,
>> it's just going to be harder to users to go through all the relations
>> and re-tune autovacuum. Am I missing something?
>
> Why would you expect autovacuum to run even when the optimization is
> unavailable (e.g. with Postgres 13)? After all, the specifics of when
> the bypass optimization kicks in make it very unlikely that ANALYZE
> will ever be able to notice enough dead tuples to trigger an
> autovacuum (barring antiwraparound and insert-driven autovacuums).
>
> There will probably be very few LP_DEAD items remaining. Occasionally
> there will be somewhat more LP_DEAD items, that happen to be
> concentrated in less than 2% of the table's blocks -- but block-based
> sampling by ANALYZE is likely to miss most of them and underestimate
> the total number. The sampling approach taken by acquire_sample_rows()
> ensures this with larger tables. With small tables the chances of the
> optimization kicking in are very low, unless perhaps fillfactor has
> been tuned very aggressively.
>
> There has never been a guarantee that autovacuum will be triggered
> (and do index vacuuming) in cases that have very few LP_DEAD items, no
> matter how the system has been tuned. The main reason why making the
> optimization behavior controllable is for the VACUUM command.
> Principally for hackers. I can imagine myself using the VACUUM option
> to disable the optimization when I was interested in testing VACUUM or
> space utilization in some specific, narrow way.
>
> Of course it's true that there is still some uncertainty about the
> optimization harming production workloads -- that is to be expected
> with an enhancement like this one. But there is still no actual
> example or test case that shows the optimization doing the wrong
> thing, or anything like it. Anything is possible, but I am not
> expecting there to be even one user complaint about the feature.
> Naturally I don't want to add something as heavyweight as a GUC, given
> all that.

Perhaps. What I am really scared about is that you are assuming that enforcing this decision will be *always* fine. What I am trying to say here is that it *may not* be fine for everybody, and that there should be an easy way to turn it off if that proves to be a problem. I don't quite see how that's an implementation problem; we already have many reloptions that are controlled by GUCs when the reloption has no default. I think that a more careful implementation choice would have been to turn this optimization off by default, while having an option to allow one to turn it on at will.
--
Michael
Attachment
On Mon, Jun 14, 2021 at 5:23 PM Michael Paquier <michael@paquier.xyz> wrote: > > *Why* does it have to work at the system level? I don't understand > > what you mean about the system level. > > I mean that you lack a GUC that allows to enforce to *not* use this > optimization for all relations, for all processes. You've just explained what a GUC is. This is not helpful. > > Why is this fundamentally different to those two things? > > Because the situation looks completely different to me here. TRUNCATE > is thought as a option to be able to avoid an exclusive lock when > truncating the relation file size at the end of VACUUM. More > importantly the default of TRUNCATE is *false*, meaning that we are > never going to skip the truncation unless one specifies it at the > relation level. Maybe it looks different to you because that's not actually true; VACUUM *does* skip the truncation when it feels like it, regardless of the value of the reloption. In general there is no strict guarantee of truncation ever happening -- see lazy_truncate_heap(). Again: Why is this fundamentally different? > Here, what we have is a decision that is enforced to happen by > default, all the time, with the user not knowing about it. If there > is a bug of an issue with it, users, based on your proposal, would be > forced to change it for each *relation*. If they miss some of those > relations, they may still run into problems without knowing about it. > The change of default behavior and having no way to control it in > a simple way look incompatible to me. You've just explained what a reloption is. Again, this is not helping. > Perhaps. What I am really scared about is that you are assuming that > enforcing this decision will be *always* fine. I very clearly and repeatedly said that there was uncertainty about causing issues in rare real world cases. Are you always 100% certain that your code has no bugs before you commit it? Should I now add a GUC for every single feature that I commit? 
You are just asserting that we need to add a GUC, without giving any real reasons -- you're just giving generic reasons that work just as well in most situations. I'm baffled by this.

> What I am trying to
> say here is that it *may not* be fine for everybody, and that there
> should be an easy way to turn it off if that proves to be a problem.

As I said, I think that the reloption is both necessary and sufficient. A GUC is a heavyweight solution that seems quite unnecessary.

> I don't quite see how that's an implementation problem, we have
> already many reloptions that are controlled with GUCs if the
> reloptions have no default.

I never said that there was an implementation problem with a GUC. Just that it was unnecessary, and not consistent with existing practice.

Does anyone else have an opinion on this? Of course I can easily add a GUC. But I won't do so in the absence of any real argument in favor of it.

> I think that a more careful choice implementation would have been to
> turn this optimization off by default, while having an option to allow
> one to turn it on at will.

You have yet to say anything about the implementation.

--
Peter Geoghegan
On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> We need to accept "yes" and "no" too? Currently, the parsing of a
> boolean type reloption accepts those words.

Added those in the attached revision, version 2. This is much closer to being committable than v1 was. I plan on committing this in the next several days. I probably need to polish the documentation some more, though.

> It seems to me that it's better to have INDEX_CLEANUP option of VACUUM
> command support AUTO for consistency. Do you have any concerns about
> supporting it?

v2 sorts out the mess with VacOptTernaryValue by just adding a new enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a new reloption-specific enum, StdRdOptIndexCleanup, which is the datatype that we actually use inside the StdRdOptions struct. So we are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the patch.

v2 also adds a new option to vacuumdb, --force-index-cleanup. This seemed to make sense because we already have a --no-index-cleanup option.

> > And does StdRdOptions.vacuum_truncate now need
> > to become a VacOptTernaryValue field too, for consistency with the new
> > definition of StdRdOptions.vacuum_index_cleanup?
>
> We don't have the bypass optimization for heap truncation, unlike
> index vacuuming. So I think we can leave both vacuum_truncate
> reloption and TRUNCATE option as boolean parameters.

Actually, FWIW, we do have a bypass optimization for TRUNCATE -- it too has an always-on dynamic behavior -- so it really is like the index vacuuming thing. In theory it might make sense to have the same "auto, on, off" thing, just like with index vacuuming in the patch. However, I haven't done that in the patch because in practice it's a bad idea. If we offered users the option of truly forcing truncation, then lazy_truncate_heap() could just insist on truncating.
It would have to just wait for an AEL, no matter how long it took. That would probably be dangerous because waiting for an AEL without backing out in VACUUM just isn't a great idea. Thanks -- Peter Geoghegan
Attachment
On Thu, Jun 17, 2021 at 10:54 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Sun, May 30, 2021 at 6:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > We need to accept "yes" and "no" too? Currently, the parsing of a > > boolean type reloption accepts those words. > > Added those in the attached revision, version 2. This is much closer > to being commitable than v1 was. I plan on committing this in the next > several days. > > I probably need to polish the documentation some more, though. > > > It seems to me that it's better to have INDEX_CLEANUP option of VACUUM > > command support AUTO for consistency. Do you have any concerns about > > supporting it? > > v2 sorts out the mess with VacOptTernaryValue by just adding a new > enum constant to VacOptTernaryValue, called VACOPT_TERNARY_AUTO -- the > enum still has a distinct VACOPT_TERNARY_DEFAULT value. v2 also adds a > new reloption-specific enum, StdRdOptIndexCleanup, which is the > datatype that we actually use inside the StdRdOptions struct. So we > are now able to specify "VACUUM (INDEX_CLEANUP AUTO)" in v2 of the > patch. > > v2 also adds a new option to vacuumdb, --force-index-cleanup. This > seemed to make sense because we already have a --no-index-cleanup > option. > > > > And does StdRdOptions.vacuum_truncate now need > > > to become a VacOptTernaryValue field too, for consistency with the new > > > definition of StdRdOptions.vacuum_index_cleanup? > > > > We don't have the bypass optimization for heap truncation, unlike > > index vacuuming. So I think we can leave both vacuum_truncate > > reloption and TRUNCATE option as boolean parameters. > > Actually FWIW we do have a bypass optimization for TRUNCATE -- it too > has an always-on dynamic behavior -- so it really is like the index > vacuuming thing. In theory it might make sense to have the same "auto, > on, off" thing, just like with index vacuuming in the patch. However, > I haven't done that in the patch because in practice it's a bad idea. 
> If we offered users the option of truly forcing truncation, then
> lazy_truncate_heap() could just insist on truncating. It would have to
> just wait for an AEL, no matter how long it took. That would probably
> be dangerous because waiting for an AEL without backing out in VACUUM
> just isn't a great idea.

I agree that it doesn't make sense to force heap truncation.

Thank you for updating the patch! Here are comments on the v2 patch:

 typedef enum VacOptTernaryValue
 {
 	VACOPT_TERNARY_DEFAULT = 0,
+	VACOPT_TERNARY_AUTO,
 	VACOPT_TERNARY_DISABLED,
 	VACOPT_TERNARY_ENABLED,
 } VacOptTernaryValue;

VacOptTernaryValue is no longer a ternary value. Can we rename it to something like VacOptValue?

---

+	if (vacopts->force_index_cleanup)
 	{
-		/* INDEX_CLEANUP is supported since v12 */
+		/*
+		 * "INDEX_CLEANUP TRUE" has been supported since v12. Though
+		 * the --force-index-cleanup vacuumdb option was only added in
+		 * v14, it still works in the same way on v12+.
+		 */
 		Assert(serverVersion >= 120000);
+		Assert(!vacopts->no_index_cleanup);
 		appendPQExpBuffer(sql, "%sINDEX_CLEANUP FALSE", sep);
 		sep = comma;
 	}

We should specify TRUE instead.

---

The --force-index-cleanup option isn't shown in the help message.

---

I think we should also improve the tab completion for the INDEX_CLEANUP option.
---

 @@ -32,7 +32,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
     DISABLE_PAGE_SKIPPING [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
-    INDEX_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+    INDEX_CLEANUP [ <replaceable class="parameter">enum</replaceable> ]
     PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
     TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
     PARALLEL <replaceable class="parameter">integer</replaceable>

How about listing the available values of INDEX_CLEANUP here, instead of "enum"? For example, we do a similar thing in the description of the FORMAT option of the EXPLAIN command. It would be easier to perceive all the available values.

---

+  <varlistentry>
+   <term><option>--no-index-cleanup</option></term>
+   <listitem>
+    <para>

It should be --force-index-cleanup.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 17, 2021 at 2:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for updating the patch! Here are comments on v2 patch:

Thanks for the review! Attached is v3, which has all the changes that you suggested (plus the doc stuff from Justin).

I also renamed the "default" VacOptTernaryValue (actually now called VacOptValue) value -- it seems clearer to call this "unspecified", because it represents a state that has nothing to do with the default of the reloption or GUC. Really, it means "the VACUUM command didn't have this specified explicitly" (note that this means that it always starts out "unspecified" in an autovacuum worker). "Unspecified" seems much clearer because it directly expresses "fall back on the reloption, and then fall back on the reloption's default". It is unspecified, but will have to *become* specified later, so that vacuumlazy.c has a truly usable value ("unspecified" is never usable in vacuumlazy.c).

> VacOptTernaryValue is no longer a ternary value. Can we rename it
> something like VacOptValue?

As I said, done that way.

> We should specify TRUE instead.

Ooops. Fixed.

> --force-index-cleanup option isn't shown in the help message.

Fixed.

> I think we also improve the tab completion for INDEX_CLEANUP option.

Fixed.

> How about listing the available values of INDEX_CLEANUP here instead
> of enum? For example, we do a similar thing in the description of
> FORMAT option of EXPLAIN command. It would be easier to perceive all
> available values.

That looks much better. Fixed.

> It should be --force-index-cleanup.

Fixed.

--
Peter Geoghegan
Attachment
On Thu, Jun 17, 2021 at 5:55 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > (Various sgml typos) Fixed in the v3 I just posted. > + removed until index cleanup is completed. This option has no > + effect for tables that do not have an index and is ignored if > + the <literal>FULL</literal> option is used. > > I'd say "tables that have no indexes," That wording wasn't mine (it just happened to be moved around by reformatting), but I think you're right. I went with your suggestion. Thanks for taking a look -- Peter Geoghegan
On Thu, Jun 17, 2021 at 7:26 PM Peter Geoghegan <pg@bowt.ie> wrote: > Thanks for the review! > > Attached is v3, which has all the changes that you suggested (plus the > doc stuff from Justin). Just pushed a version of that with much improved documentation. Thanks again -- Peter Geoghegan
> On Jun 14, 2021, at 7:46 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> Does anyone else have an opinion on this? Of course I can easily add a
> GUC. But I won't do so in the absence of any real argument in favor of
> it.

I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Jun 20, 2021 at 9:22 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> I'd want to see some evidence that the GUC is necessary. (For that matter, why is a per relation setting necessary?) Is there a reproducible pathological case, perhaps with a pgbench script, to demonstrate the need? I'm not asking whether there might be some regression, but rather whether somebody wants to construct a worst-case pathological case and publish quantitative results about how bad it is.

One clear argument in favor of the VACUUM option (not so much the reloption) is that it enables certain testing scenarios. For example, I was recently using pg_visibility to do a low-level analysis of how visibility map bits were getting set, with a test case that built on the BenchmarkSQL fair-use TPC-C implementation. The optimization was something that I noticed in certain scenarios -- I could have used the option of disabling it at the VACUUM command level, just to get a perfectly clean slate. It caused a small fraction of the pages in the table to not be set all-visible, which would be inconsequential to users but was annoying in the context of this particular test scenario.

The way the optimization works will only ever leave an affected table in a state where the LP_DEAD items left behind would be highly unlikely to be counted by ANALYZE. They would not be counted accurately anyway, either because they're extremely few in number or because there are relatively many that are concentrated in just a few heap blocks -- that's how block-based sampling by ANALYZE works.

In short, even if there really was a performance problem implied by the bypass-indexes optimization, it seems unlikely that autovacuum would run in the first place to take care of it, with or without the optimization, even if autovacuum_vacuum_scale_factor were set very aggressively. VACUUM (really autovacuum) just doesn't tend to work at that level of precision.

--
Peter Geoghegan