Thread: Minimum tuple threshold to decide last pass of VACUUM
Hi all, Commit 4046e58c (dated of 2001) has introduced the following comment in vacuumlazy.c: + /* If any tuples need to be deleted, perform final vacuum cycle */ + /* XXX put a threshold on min nuber of tuples here? */ + if (vacrelstats->num_dead_tuples > 0) In short, we may want to have a reloption to decide if we do or not the last pass of VACUUM or not depending on a given number of remaining tuples. Is this still something we would like to have? Regards, -- Michael
On 2 August 2015 at 13:13, Michael Paquier <michael.paquier@gmail.com> wrote:
--
Hi all,
Commit 4046e58c (dated of 2001) has introduced the following comment
in vacuumlazy.c:
+ /* If any tuples need to be deleted, perform final vacuum cycle */
+ /* XXX put a threshold on min nuber of tuples here? */
+ if (vacrelstats->num_dead_tuples > 0)
In short, we may want to have a reloption to decide if we do or not
the last pass of VACUUM or not depending on a given number of
remaining tuples. Is this still something we would like to have?
I don't think we want a new user parameter, but we should have an internal limit with a heuristic, similar to how we decide whether to truncate.
I would suggest this internal logic...
* If its a VACUUM FREEZE then index_scan_threshold = 0, i.e. always scan if needed, since the user is requesting maximum vacuum
* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all, since they aren't critical path activities at that point
* For normal VACUUMs we should scan indexes only if (num_dead_tuples * 20) > (blocks to be scanned in any one index), which allows some index bloat but not much
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs wrote: > * For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all, > since they aren't critical path activities at that point It is not possible to skip scanning indexes completely, unless no tuples are to be removed from the heap. Otherwise, index tuples become lingering pointers (and when such heap address are later refilled, they become corrupted indexscans). But actually this is an interesting point and I don't think we do this: if in emergency mode, maybe we shouldn't try to remove any dead tuples at all, and instead only freeze very old tuples. That would make such vacuums go much quicker. (More accurately, if the updating xid is older than the freeze point, then remove the tuple, but otherwise keep it.) My point is that emergency vacuums are troublesome for various reasons and it would be better if they did only the minimum possible. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Simon Riggs wrote: >> * For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all, >> since they aren't critical path activities at that point > It is not possible to skip scanning indexes completely, unless no tuples > are to be removed from the heap. Right. > But actually this is an interesting point and I don't think we do this: > if in emergency mode, maybe we shouldn't try to remove any dead tuples > at all, and instead only freeze very old tuples. +1 ... not sure if that's what Simon had in mind exactly, but it seems like a correct statement of what he was getting at. regards, tom lane
On 3 August 2015 at 17:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Simon Riggs wrote:
>> * For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all,
>> since they aren't critical path activities at that point
> It is not possible to skip scanning indexes completely, unless no tuples
> are to be removed from the heap.
Right.
> But actually this is an interesting point and I don't think we do this:
> if in emergency mode, maybe we shouldn't try to remove any dead tuples
> at all, and instead only freeze very old tuples.
+1 ... not sure if that's what Simon had in mind exactly, but it seems
like a correct statement of what he was getting at.
Yes, that's what I was thinking, I just didn't say actually it. I'd been thinking about having VACUUM do just Phase 1 for some time, since its so much faster to do that. Will code.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Aug 3, 2015 at 11:52 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Simon Riggs wrote: >> * For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all, >> since they aren't critical path activities at that point > > It is not possible to skip scanning indexes completely, unless no tuples > are to be removed from the heap. Otherwise, index tuples become > lingering pointers (and when such heap address are later refilled, they > become corrupted indexscans). Well, if we skip the index scans, we can't do the second heap pass either, but that's OK. I think we're all talking about the same thing here, which is to do only the first heap pass in some cases. That will prune dead tuples to line pointers, freeze old XIDs, and mark pages all-visible where appropriate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Aug 3, 2015 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > * For normal VACUUMs we should scan indexes only if (num_dead_tuples * 20) > > (blocks to be scanned in any one index), which allows some index bloat but > not much I think this kind of heuristic is good, but I think we should expose a setting for it. There's no way for us to know without testing whether the right value for that multiplier is 2 or 20 or 200 or 2000, and if we don't make it easy to tweak, we'll never find out. It may even be workload-dependent. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 8/3/15 12:04 PM, Simon Riggs wrote: > Yes, that's what I was thinking, I just didn't say actually it. I'd been > thinking about having VACUUM do just Phase 1 for some time, since its so > much faster to do that. Will code. I'd like to see that exposed as an option as well. There are certain situations where you'd really like to just freeze things as fast as possible, without waiting for a full vacuum. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
On Tue, Aug 4, 2015 at 2:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 3 August 2015 at 17:36, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> > Simon Riggs wrote: >> >> * For emergency anti-wraparound VACUUMs we shouldn't scan indexes at >> >> all, >> >> since they aren't critical path activities at that point >> >> > It is not possible to skip scanning indexes completely, unless no tuples >> > are to be removed from the heap. >> >> Right. >> >> > But actually this is an interesting point and I don't think we do this: >> > if in emergency mode, maybe we shouldn't try to remove any dead tuples >> > at all, and instead only freeze very old tuples. >> >> +1 ... not sure if that's what Simon had in mind exactly, but it seems >> like a correct statement of what he was getting at. > > > Yes, that's what I was thinking, I just didn't say actually it. I'd been > thinking about having VACUUM do just Phase 1 for some time, since its so > much faster to do that. Will code. Interesting. I'll be happy to have a look at any patch produced, that's surely something we want to improve in emergency mode. -- Michael