Re: Teaching users how they can get the most out of HOT in Postgres 14 - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Teaching users how they can get the most out of HOT in Postgres 14 |
Date | |
Msg-id | CAH2-Wzn4b=osqDi4wa2ajuePzZLTWFL_kz2rz1--dNZe9+SdeQ@mail.gmail.com Whole thread Raw |
In response to | Re: Teaching users how they can get the most out of HOT in Postgres 14 (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Teaching users how they can get the most out of HOT in Postgres 14
|
List | pgsql-hackers |
On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <michael@paquier.xyz> wrote: > I have read through the patch, and I am surprised to see that this > only makes possible to control the optimization at relation level. > The origin of the complaints is that this index cleanup optimization > has been introduced as a new rule that gets enforced at *system* > level, so I think that we should have an equivalent with a GUC to > control the behavior for the whole system. *Why* does it have to work at the system level? I don't understand what you mean about the system level. As Masahiko pointed out, adding a GUC isn't what we've done in other similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a defensive option that seems similar enough to what we want to add now. To give another example, the TRUNCATE VACUUM option (or the related reloption) can be used to disable relation truncation, a behavior that sometimes causes *big* issues in production. The truncate behavior is determined dynamically in most situations -- which is another similarity to the optimization we've added here. Why is this fundamentally different to those two things? > With what you are > presenting here, one could only disable the optimization for each > relation, one-by-one. If this optimization proves to be a problem, > it's just going to be harder to users to go through all the relations > and re-tune autovacuum. Am I missing something? Why would you expect autovacuum to run even when the optimization is unavailable (e.g. with Postgres 13)? After all, the specifics of when the bypass optimization kicks in make it very unlikely that ANALYZE will ever be able to notice enough dead tuples to trigger an autovacuum (barring antiwraparound and insert-driven autovacuums). There will probably be very few LP_DEAD items remaining. Occasionally there will be somewhat more LP_DEAD items, that happen to be concentrated in less than 2% of the table's blocks -- but block-based sampling by ANALYZE is likely to miss most of them and underestimate the total number. The sampling approach taken by acquire_sample_rows() ensures this with larger tables. With small tables the chances of the optimization kicking in are very low, unless perhaps fillfactor has been tuned very aggressively. There has never been a guarantee that autovacuum will be triggered (and do index vacuuming) in cases that have very few LP_DEAD items, no matter how the system has been tuned. The main reason why making the optimization behavior controllable is for the VACUUM command. Principally for hackers. I can imagine myself using the VACUUM option to disable the optimization when I was interested in testing VACUUM or space utilization in some specific, narrow way. Of course it's true that there is still some uncertainty about the optimization harming production workloads -- that is to be expected with an enhancement like this one. But there is still no actual example or test case that shows the optimization doing the wrong thing, or anything like it. Anything is possible, but I am not expecting there to be even one user complaint about the feature. Naturally I don't want to add something as heavyweight as a GUC, given all that. -- Peter Geoghegan
pgsql-hackers by date: