Re: Berserk Autovacuum (let's save next Mandrill) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Berserk Autovacuum (let's save next Mandrill) |
Date | |
Msg-id | CA+fd4k4Xf3bf9Yj4wDOti7XwG91n5zL6YD9wSmkb=cjHeETvEQ@mail.gmail.com Whole thread Raw |
In response to | Re: Berserk Autovacuum (let's save next Mandrill) (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Fri, 20 Mar 2020 at 15:20, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2020-03-19 06:45:48 +0100, Laurenz Albe wrote: > > On Tue, 2020-03-17 at 18:02 -0700, Andres Freund wrote: > > > I don't think a default scale factor of 0 is going to be ok. For > > > large-ish tables this will basically cause permanent vacuums. And it'll > > > sometimes trigger for tables that actually coped well so far. 10 million > > > rows could be a few seconds, not more. > > > > > > I don't think that the argument that otherwise a table might not get > > > vacuumed before autovacuum_freeze_max_age is convincing enough. > > > > > > a) if that's indeed the argument, we should increase the default > > > autovacuum_freeze_max_age - now that there's insert triggered vacuums, > > > the main argument against that from before isn't valid anymore. > > > > > > b) there's not really a good arguments for vacuuming more often than > > > autovacuum_freeze_max_age for such tables. It'll not be not frequent > > > enough to allow IOS for new data, and you're not preventing > > > anti-wraparound vacuums from happening. > > > > According to my reckoning, that is the remaining objection to the patch > > as it is (with ordinary freezing behavior). > > > > How about a scale_factor od 0.005? That will be high enough for large > > tables, which seem to be the main concern here. > > > > I fully agree with your point a) - should that be part of the patch? > > > > I am not sure about b). In my mind, the objective is not to prevent > > anti-wraparound vacuums, but to see that they have less work to do, > > because previous autovacuum runs already have frozen anything older than > > vacuum_freeze_min_age. So, assuming linear growth, the number of tuples > > to freeze during any run would be at most one fourth of today's number > > when we hit autovacuum_freeze_max_age. > > Based on two IM conversations I think it might be worth emphasizing how > vacuum_cleanup_index_scale_factor works: > > For btree, even if there is not a single deleted tuple, we can *still* > end up doing a full index scans at the end of vacuum. As the docs describe > vacuum_cleanup_index_scale_factor: > > <para> > Specifies the fraction of the total number of heap tuples counted in > the previous statistics collection that can be inserted without > incurring an index scan at the <command>VACUUM</command> cleanup stage. > This setting currently applies to B-tree indexes only. > </para> > > I.e. with the default settings we will perform a whole-index scan > (without visibility map or such) after every 10% growth of the > table. Which means that, even if the visibility map prevents repeated > tables accesses, increasing the rate of vacuuming for insert-only tables > can cause a lot more whole index scans. Which means that vacuuming an > insert-only workload frequently *will* increase the total amount of IO, > even if there is not a single dead tuple. Rather than just spreading the > same amount of IO over more vacuums. Right. > > And both gin and gist just always do a full index scan, regardless of > vacuum_cleanup_index_scale_factor (either during a bulk delete, or > during the cleanup). Thus more frequent vacuuming for insert-only > tables can cause a *lot* of pain (even an approx quadratic increase of > IO? O(increased_frequency * peak_index_size)?) if you have large > indexes - which is very common for gin/gist. That's right but for gin, more frequent vacuuming for insert-only tables can help to clean up the pending list, which increases search speed and better than doing it by a backend process. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: