Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: New IndexAM API controlling index vacuum strategies |
Date | |
Msg-id | CAD21AoAkBG3vJ2-Mqe-feq4CkHyg5wTGFjpAs4yMPq9HxuAmAA@mail.gmail.com Whole thread Raw |
In response to | Re: New IndexAM API controlling index vacuum strategies (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: New IndexAM API controlling index vacuum strategies
|
List | pgsql-hackers |
On Sun, Apr 4, 2021 at 11:00 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Thu, Apr 1, 2021 at 8:25 PM Peter Geoghegan <pg@bowt.ie> wrote: > > I've also found a way to further simplify the table-without-indexes > > case: make it behave like a regular two-pass/has-indexes VACUUM with > > regard to visibility map stuff when the page doesn't need a call to > > lazy_vacuum_heap() (because there are no LP_DEAD items to set > > LP_UNUSED on the page following pruning). But when it does call > > lazy_vacuum_heap(), the call takes care of everything for > > lazy_scan_heap(), which just continues to the next page due to > > considering prunestate to have been "invalidated" by the call to > > lazy_vacuum_heap(). So there is absolutely minimal special case code > > for the table-without-indexes case now. > > Attached is v10, which simplifies the one-pass/table-without-indexes > VACUUM as described. > Thank you for updating the patch. > > * I now include a modified version of Matthias van de Meent's line > pointer truncation patch [1]. > > Matthias' patch seems very much in scope here. The broader patch > series establishes the principle that we can leave LP_DEAD line > pointers in an unreclaimed state indefinitely, without consequence > (beyond the obvious). We had better avoid line pointer bloat that > cannot be reversed when VACUUM does eventually get around to doing a > second pass over the heap. This is another case where it seems prudent > to keep the costs understandable/linear -- page-level line pointer > bloat seems like a cost that increases in a non-linear fashion, which > undermines the whole idea of modelling when it's okay to skip > index/heap vacuuming. (Also, line pointer bloat sucks.) > > Line pointer truncation doesn't happen during pruning, as it did in > Matthias' original patch. In this revised version, line pointer > truncation occurs during the second phase of VACUUM. There are several > reasons to prefer this approach. It seems both safer and more useful > that way (compared to the approach of doing line pointer truncation > during pruning). It also makes intuitive sense to do it this way, at > least to me -- the second pass over the heap is supposed to be for > "freeing" LP_DEAD line pointers. +1 0002, 0003, and 0004 patches look good to me. 0001 and 0005 also look good to me but I have some trivial review comments on them. 0001 patch: /* - * Now that stats[idx] points to the DSM segment, we don't need the - * locally allocated results. + * Now that top-level indstats[idx] points to the DSM segment, we + * don't need the locally allocated results. */ - pfree(*stats); - *stats = bulkdelete_res; + pfree(istat); + istat = bulkdelete_res; Did you try the change around parallel_process_one_index() that I suggested in the previous reply[1]? If we don't change the logic, we need to update the above comment. Previously, we update stats[idx] in vacuum_one_index() (renamed to parallel_process_one_index()) but with your patch, where we update it is its caller. --- +lazy_vacuum_all_indexes(LVRelState *vacrel) { - Assert(!IsParallelWorker()); - Assert(nindexes > 0); + Assert(vacrel->nindexes > 0); + Assert(TransactionIdIsNormal(vacrel->relfrozenxid)); + Assert(MultiXactIdIsValid(vacrel->relminmxid)); and - Assert(!IsParallelWorker()); - Assert(nindexes > 0); + Assert(vacrel->nindexes > 0); We removed two Assert(!IsParallelWorker()) at two places. It seems to me that those assertions are still valid. Do we really need to remove them? --- 0004 patch: src/backend/access/heap/heapam.c:638: trailing whitespace. + /* I found a whitespace issue. --- 0005 patch: + * Caller is expected to call here before and after vacuuming each index in + * the case of two-pass VACUUM, or every BYPASS_EMERGENCY_MIN_PAGES blocks in + * the case of no-indexes/one-pass VACUUM. I think it should be "every VACUUM_FSM_EVERY_PAGES blocks" instead of "every BYPASS_EMERGENCY_MIN_PAGES blocks". --- +/* + * Threshold that controls whether we bypass index vacuuming and heap + * vacuuming. When we're under the threshold they're deemed unnecessary. + * BYPASS_THRESHOLD_PAGES is applied as a multiplier on the table's rel_pages + * for those pages known to contain one or more LP_DEAD items. + */ +#define BYPASS_THRESHOLD_PAGES 0.02 /* i.e. 2% of rel_pages */ + +#define BYPASS_EMERGENCY_MIN_PAGES \ + ((BlockNumber) (((uint64) 4 * 1024 * 1024 * 1024) / BLCKSZ)) + I think we need a description for BYPASS_EMERGENCY_MIN_PAGES. --- for (int idx = 0; idx < vacrel->nindexes; idx++) { Relation indrel = vacrel->indrels[idx]; IndexBulkDeleteResult *istat = vacrel->indstats[idx]; vacrel->indstats[idx] = lazy_vacuum_one_index(indrel, istat, vacrel->old_live_tuples, vacrel); + + if (should_speedup_failsafe(vacrel)) + { + /* Wraparound emergency -- end current index scan */ + allindexes = false; + break; + } allindexes can be false even if we process all indexes, which is fine with me because setting allindexes = false disables the subsequent heap vacuuming. I think it's appropriate behavior in emergency cases. In that sense, can we do should_speedup_failsafe() check also after parallel index vacuuming? And we can also check it at the beginning of lazy vacuum. Regards, [1] https://www.postgresql.org/message-id/CAD21AoDOWo4H6vmtLZoJ2SznMp_zOej2Kww%2BJBkVRPXs%2Bj48uw%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: