Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: New IndexAM API controlling index vacuum strategies
Date:
Msg-id: CAH2-WznCxtWL4B995y2KJWj-+jrjahH4n6gD2R74SyQJo6Y63w@mail.gmail.com
In response to: Re: New IndexAM API controlling index vacuum strategies (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses: Re: New IndexAM API controlling index vacuum strategies
List: pgsql-hackers
On Mon, Apr 5, 2021 at 2:44 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> This assertion is false, and should be a guarding if-statement. HOT
> redirect pointers are not updated if the tuple they're pointing to is
> vacuumed (i.e. when it was never committed) so this nextoffnum might
> in a correctly working system point past maxoff.

I will need to go through this in detail soon.

> > Line pointer truncation doesn't happen during pruning, as it did in
> > Matthias' original patch. In this revised version, line pointer
> > truncation occurs during the second phase of VACUUM. There are several
> > reasons to prefer this approach. It seems both safer and more useful
> > that way (compared to the approach of doing line pointer truncation
> > during pruning). It also makes intuitive sense to do it this way, at
> > least to me -- the second pass over the heap is supposed to be for
> > "freeing" LP_DEAD line pointers.
>
> Good catch for running a line pointer truncating pass at the second
> pass over the heap in VACUUM, but I believe that it is also very
> useful for pruning. Line pointer bloat due to excessive HOT chains
> cannot be undone until the 2nd run of VACUUM happens with this patch,
> which is a missed chance for all non-vacuum pruning.

Maybe - I have my doubts about it having much value outside of the
more extreme cases. But let's assume that I'm wrong about that, for
the sake of argument.

The current plan is to no longer require a super-exclusive lock inside
lazy_vacuum_heap_page(), which means that we can no longer safely call
PageRepairFragmentation() at that point. This will mean that
PageRepairFragmentation() is 100% owned by pruning. And so the
question of whether or not line pointer truncation should also happen
in PageRepairFragmentation() to cover pruning is (or will be) a
totally separate question to the question of how
lazy_vacuum_heap_page() does it. Nothing stops you from independently
pursuing that as a project for Postgres 15.
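To make that concrete, here is a minimal sketch of the kind of
trailing line pointer truncation I have in mind for the second heap
pass. This is just an illustration, not the actual patch; it assumes
the caller holds an ordinary exclusive lock on the buffer and has
already marked the page's LP_DEAD items as LP_UNUSED:

/*
 * Sketch only: shrink the line pointer array by discarding trailing
 * LP_UNUSED entries.  No tuple data is moved (unlike with
 * PageRepairFragmentation()), which is why an ordinary exclusive
 * buffer lock is assumed to be enough here.
 */
#include "postgres.h"

#include "storage/bufpage.h"
#include "storage/itemid.h"
#include "storage/off.h"

static void
truncate_trailing_unused_sketch(Page page)
{
    PageHeader   phdr = (PageHeader) page;
    OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
    OffsetNumber lastused = FirstOffsetNumber;  /* never truncate to 0 items */
    OffsetNumber offnum;

    /* Find the last line pointer that is still in use */
    for (offnum = maxoff; offnum >= FirstOffsetNumber; offnum--)
    {
        ItemId      itemid = PageGetItemId(page, offnum);

        if (ItemIdIsUsed(itemid))
        {
            lastused = offnum;
            break;
        }
    }

    /* Shrink pd_lower so the trailing LP_UNUSED entries cease to exist */
    if (lastused < maxoff)
        phdr->pd_lower = SizeOfPageHeaderData +
            lastused * sizeof(ItemIdData);
}

Whether something along these lines should also run as part of
pruning (i.e. inside PageRepairFragmentation()) is the separate
question I just described.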
> What difference is there between opportunistically pruned HOT line
> pointers, and VACUUMed line pointers?

The fact that they are physically identical to each other isn't
everything. The "life cycle" of an affected page is crucially
important.

I find that there is a lot of value in thinking about how things look
at the page level moment to moment, and even over hours and days.
Usually with a sample workload and table in mind. I already mentioned
the new_order table from TPC-C, which is characterized by continual
churn from more-or-less even amounts of range deletes and bulk inserts
over time. That seems to be the kind of workload where you see big
problems with line pointer bloat. Because there is constant churn of
unrelated logical rows (it's not a bunch of UPDATEs).

It's possible for very small effects to aggregate into large and
significant effects -- I know this from my experience with indexing.
Plus the FSM is probably not very smart about fragmentation, which
makes it even more complicated. And so it's easy to be wrong if you
predict that some seemingly insignificant extra intervention couldn't
possibly help. For that reason, I don't want to predict that you're
wrong now. It's just a question of time, and of priorities.

> Truncating during pruning has
> the benefit of keeping the LP array short where possible, and seeing
> that truncating the LP array allows for more applied
> PD_HAS_FREE_LINES-optimization, I fail to see why you wouldn't want to
> truncate the LP array whenever clearing up space.

Truncating the line pointer array is not an intrinsic good. I hesitate
to do it during pruning in the absence of clear evidence that it's
independently useful. Pruning is a very performance sensitive
operation. Much more so than VACUUM's second heap pass.

> Other than those questions, some comments on the other patches:
>
> 0002:
> > + appendStringInfo(&buf, _("There were %lld dead item identifiers.\n"),
> > + (long long) vacrel->lpdead_item_pages);
>
> I presume this should use vacrel->lpdead_items?

It should have been, but as it happens I have decided to not do this
at all in 0002-*. Better to not report on LP_UNUSED *or* LP_DEAD items
at this point of VACUUM VERBOSE output.

> 0003:
> > + * ... Aborted transactions
> > + * have tuples that we can treat as DEAD without caring about where there
> > + * tuple header XIDs ...
>
> This should be '... where their tuple header XIDs...'

Will fix.

> > +retry:
> > +
> ...
> > + res = HeapTupleSatisfiesVacuum(&tuple, vacrel->OldestXmin, buf);
> > +
> > + if (unlikely(res == HEAPTUPLE_DEAD))
> > + goto retry;
>
> In this unlikely case, you reset the tuples_deleted value that was
> received earlier from heap_page_prune. This results in inaccurate
> statistics, as repeated calls to heap_page_prune on the same page will
> not count tuples that were deleted in a previous call.

I don't think that it matters. The "tupgone=true" case has no test
coverage (see coverage.postgresql.org), and it would be hard to ensure
that the "res == HEAPTUPLE_DEAD" case that replaces it gets coverage,
for the same reasons. Keeping the rules as simple as possible seems
like a good goal. What's more, it's absurdly unlikely that this will
happen even once. The race is very tight. Postgres will do
opportunistic pruning at almost any point, often from a SELECT, so the
chances of anybody noticing an inaccuracy from this issue in
particular are remote in the extreme.

Actually, a big problem with the tuples_deleted value surfaced by both
log_autovacuum and by VACUUM VERBOSE is that it can be wildly
different to the number of LP_DEAD items. This is commonly the case
with tables that get lots of non-HOT updates, with opportunistic
pruning kicking in a lot, with LP_DEAD items constantly accumulating.
By the time VACUUM comes around, it reports an absurdly low
tuples_deleted because it's using this what-I-pruned-just-now
definition. The opposite extreme is also possible, since there might
be far fewer LP_DEAD items when VACUUM does a lot of pruning of HOT
chains specifically.
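To make the race easier to see, here is a rough sketch of the retry
pattern (prune_one_page() is a hypothetical stand-in for the pruning
call, not code from the patch). Hitting HEAPTUPLE_DEAD after pruning
restarts the whole page, which is what throws away the earlier
per-page tuples_deleted count:

/*
 * Sketch of the retry pattern.  prune_one_page() is a hypothetical
 * placeholder for the pruning step; everything else uses standard
 * server APIs.
 */
#include "postgres.h"

#include "access/heapam.h"
#include "access/htup_details.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "utils/rel.h"

static int prune_one_page(Relation rel, Buffer buf);   /* hypothetical */

static int
scan_page_sketch(Relation rel, Buffer buf, TransactionId OldestXmin)
{
    Page        page = BufferGetPage(buf);
    OffsetNumber offnum,
                maxoff;
    int         tuples_deleted;

retry:
    /* Prune first; the return value is this page's tuples_deleted */
    tuples_deleted = prune_one_page(rel, buf);

    maxoff = PageGetMaxOffsetNumber(page);
    for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum++)
    {
        ItemId          itemid = PageGetItemId(page, offnum);
        HeapTupleData   tuple;

        if (!ItemIdIsNormal(itemid))
            continue;           /* LP_UNUSED, LP_DEAD, LP_REDIRECT */

        tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
        tuple.t_len = ItemIdGetLength(itemid);
        tuple.t_tableOid = RelationGetRelid(rel);
        ItemPointerSet(&tuple.t_self, BufferGetBlockNumber(buf), offnum);

        /*
         * A concurrently aborted transaction can leave a tuple that is
         * only now seen as DEAD, even though pruning just ran.  Rather
         * than handling that here, start the page over -- which also
         * resets tuples_deleted, the (harmless) inaccuracy at issue.
         */
        if (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf) == HEAPTUPLE_DEAD)
            goto retry;

        /* ... normal per-tuple processing would go here ... */
    }

    return tuples_deleted;
}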
> 0004:
> > + * truncate to. Note that we avoid truncating the line pointer to 0 items
> > + * in all cases.
> > + */
>
> Is there a specific reason that I'm not getting as to why this is necessary?

I didn't say it was strictly necessary. There is special-case handling
of PageIsEmpty() at various points, though, including within VACUUM.
It seemed worth avoiding hitting that. Perhaps I should change it to
not work that way.

> 0005:
> > + The default is 1.8 billion transactions. Although users can set this value
> > + anywhere from zero to 2.1 billion, <command>VACUUM</command> will silently
> > + adjust the effective value more than 105% of
> > + <xref linkend="guc-autovacuum-freeze-max-age"/>, so that only
> > + anti-wraparound autovacuums and aggressive scans have a chance to skip
> > + index cleanup.
>
> This documentation doesn't quite make it clear what its relationship
> is with autovacuum_freeze_max_age. How about the following: "...
> <command>VACUUM</command> will use the higher of this value and 105% of
> <xref linkend="guc-autovacuum-freeze-max-age"/>, so that only ...". It's only slightly
> easier to read, but at least it conveys that values lower than 105% of
> autovacuum_freeze_max_age are not considered. The same can be said for
> the multixact guc documentation.

This does need work too.

I'm going to push 0002- and 0003- tomorrow morning pacific time. I'll
publish a new set of patches tomorrow, once I've finished that up. The
last 2 patches will require a lot of focus to get over the line for
Postgres 14.

--
Peter Geoghegan