Re: Visibility map, partial vacuums - Mailing list pgsql-hackers
From | Gregory Stark |
---|---|
Subject | Re: Visibility map, partial vacuums |
Date | |
Msg-id | 87bpvtqh2b.fsf@oxford.xeocode.com Whole thread Raw |
In response to | Re: Visibility map, partial vacuums (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Visibility map, partial vacuums
Re: Visibility map, partial vacuums |
List | pgsql-hackers |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Gregory Stark wrote: >> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> >>> Hmm. It just occurred to me that I think this circumvented the anti-wraparound >>> vacuuming: a normal vacuum doesn't advance relfrozenxid anymore. We'll need to >>> disable the skipping when autovacuum is triggered to prevent wraparound. VACUUM >>> FREEZE does that already, but it's unnecessarily aggressive in freezing. > > FWIW, it seems the omission is actually the other way 'round. Autovacuum always > forces a full-scanning vacuum, making the visibility map useless for > autovacuum. This obviously needs to be fixed. How does it do that? Is there some option in the VacStmt to control this? Do we just need a syntax to set that option? How easy is it to tell what percentage of the table needs to be vacuumed? If it's > 50% perhaps it would make sense to scan the whole table? (Hm. Not really if it's a contiguous 50% though...) Another idea: Perhaps each page of the visibility map should have a frozenxid (or multiple frozenxids?). Then if an individual page of the visibility map is old we could force scanning all the heap pages covered by that map page and update it. I'm not sure we can do that safely though without locking issues -- or is it ok because it's vacuum doing the updating? >> Worse, vacuum will set the freeze_xid to nearly the same value for all of the >> tables. So it will all happen again in another 100M transactions. And again in >> another 100M transactions, and again... > > But we already have that problem, don't we? When you initially load your > database, all tuples will have the same xmin, and all tables will have more or > less the same relfrozenxid. I guess you can argue that it becomes more obvious > if vacuums are otherwise cheaper, but I don't think the visibility map makes > that much difference to suddenly make this issue urgent. We already have that problem but it only bites in a specific case: if you have no other vacuums being triggered by the regular dead tuple scale factor. The normal case is intended to be that autovacuum triggers much more frequently than every 100M transactions to reduce bloat. However in practice this specific case does seem to arise rather alarmingly easy. Most databases do have some large tables which are never deleted from or updated. Also, the default scale factor of 20% is actually quite easy to never reach if your tables are also growing quickly -- effectively moving the goalposts further out as fast as the updates and deletes bloat the table. The visibility map essentially widens this specific use case to cover *all* tables. Since the relfrozenxid would never get advanced by regular vacuums the only time it would get advanced is when they all hit the 200M wall simultaneously. > Agreed that it would be nice to do something about it, though. > >> I think there are several things which need to happen here. >> >> 1) Raise autovacuum_max_freeze_age to 400M or 800M. Having it at 200M just >> means unnecessary full table vacuums long before they accomplish anything. > > It allows you to truncate clog. If I did my math right, 200M transactions > amounts to ~50MB of clog. Perhaps we should still raise it, disk space is cheap > after all. Ah. Hm. Then perhaps this belongs in the realm of the config generator people are working on. They'll need a dial to say how much disk space you expect your database to take in addition to how much memory your machine has available. 50M is nothing for a 1TB database but it's kind of silly to have to keep hundreds of megs of clogs on a 1MB database. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
pgsql-hackers by date: