Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Victor Yegorov
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAGnEboiYish96utN02u06NYr16AEEOzRBKfSeAkqsvS3EkAEqA@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New IndexAM API controlling index vacuum strategies
List pgsql-hackers
On Tue, 2 Feb 2021 at 05:27, Peter Geoghegan <pg@bowt.ie> wrote:
> And now here is the second thing I thought of, which is much better:
>
> Sometimes 1% of the dead tuples in a heap relation will be spread
> across 90%+ of the pages. With other workloads 1% of dead tuples might
> be highly concentrated, and appear in no more than 1% of all heap
> pages. Obviously the distinction between these two cases/workloads
> matters a lot. And so the triggering criteria must be quantitative
> *and* qualitative. It should not be based on counting dead tuples,
> since that alone won't differentiate these two extreme cases - both of
> which are probably quite common (in the real world extremes are
> actually the normal and common case IME).
>
> I like the idea of basing it on counting *heap blocks*, not dead
> tuples. We can count heap blocks that have *at least* one dead tuple
> (of course it doesn't matter how they're dead, whether it was this
> VACUUM operation or some earlier opportunistic pruning). Note in
> particular that it should not matter if it's a heap block that has
> only one LP_DEAD line pointer or a heap page that is near the
> MaxHeapTuplesPerPage limit for the page -- we count either type of
> page towards the heap-page based limit used to decide if index
> vacuuming goes ahead for all indexes during VACUUM.
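
To make the contrast between the two workloads concrete, here is a minimal sketch of block-based accounting. It is not code from any patch: `HeapSnapshot`, `index_vacuum_goes_ahead`, and the 10,000-page table are hypothetical, and it assumes index vacuuming goes ahead only when the share of heap pages with at least one dead item exceeds the limit.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HeapSnapshot:
    """Hypothetical summary of one heap relation as seen by VACUUM."""
    total_pages: int
    dead_items_by_page: Dict[int, int]  # block number -> dead item count


def total_dead_items(heap: HeapSnapshot) -> int:
    """Tuple-based accounting: every dead item counts."""
    return sum(heap.dead_items_by_page.values())


def pages_with_dead_items(heap: HeapSnapshot) -> int:
    """Block-based accounting: a page counts once, whether it holds one
    dead item or is close to full of them."""
    return sum(1 for n in heap.dead_items_by_page.values() if n >= 1)


def index_vacuum_goes_ahead(heap: HeapSnapshot,
                            page_fraction_limit: float = 0.01) -> bool:
    """Assumed rule: vacuum indexes only if more than the given fraction
    of heap pages contain at least one dead item."""
    if heap.total_pages == 0:
        return False
    return pages_with_dead_items(heap) > page_fraction_limit * heap.total_pages


# Two made-up workloads with the same number of dead items (9,000):
# one dead item on each of 9,000 pages (90% of the table) ...
spread = HeapSnapshot(10_000, {blk: 1 for blk in range(9_000)})
# ... versus 180 dead items on each of 50 pages (0.5% of the table).
concentrated = HeapSnapshot(10_000, {blk: 180 for blk in range(50)})

for name, heap in (("spread", spread), ("concentrated", concentrated)):
    print(name, total_dead_items(heap), pages_with_dead_items(heap),
          index_vacuum_goes_ahead(heap))
```

A dead-tuple counter cannot tell these two tables apart, while the page counter separates them cleanly under the assumed 1% limit.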

I really like this idea!

It resembles the approach used in bottom-up index deletion: block-based
accounting provides a better estimate of the usefulness of the operation.

I suppose the 1% threshold should be configurable, both as a cluster-wide
GUC and as a table storage parameter?
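
Roughly what I have in mind, as a sketch only: a per-table value, when set, overrides the cluster-wide one, the same way autovacuum_vacuum_scale_factor can be overridden per table today. The option name `index_vacuum_page_fraction_limit` and the helper `effective_limit` are hypothetical; no such setting exists.

```python
from typing import Mapping

# Hypothetical cluster-wide default, matching the 1% discussed above.
CLUSTER_WIDE_LIMIT = 0.01

# Hypothetical name for the per-table storage parameter.
OPTION_NAME = "index_vacuum_page_fraction_limit"


def effective_limit(table_options: Mapping[str, float],
                    cluster_wide: float = CLUSTER_WIDE_LIMIT) -> float:
    """Return the per-table limit if one is set, else the cluster-wide one."""
    limit = table_options.get(OPTION_NAME, cluster_wide)
    if not 0.0 <= limit <= 1.0:
        raise ValueError(f"{OPTION_NAME} must be between 0 and 1, got {limit}")
    return limit


# A table with no override uses the cluster-wide value; a table that sets
# the option uses its own value.
print(effective_limit({}))
print(effective_limit({OPTION_NAME: 0.05}))
```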


--
Victor Yegorov
