BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower. - Mailing list pgsql-bugs

From daveg@sonic.net
Subject BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date
Msg-id 20151030133252.3033.4249@wrigleys.postgresql.org
Responses Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13750
Logged by:          David Gould
Email address:      daveg@sonic.net
PostgreSQL version: 9.4.5
Operating system:   Linux
Description:

With more than a few tens of thousands of tables in one database,
autovacuum slows down radically and becomes ineffective. Increasing the
number of autovacuum workers makes the slowdown worse.

A client has an application that loads data from thousands of external feeds
many times a day. They create a new table for each batch of new data. After
some months, old tables are dropped. Typically the database has about 200,000
fairly small tables, each of which has a few indexes and a TOAST table. There
are also a lot of temp tables that come and go. pg_class has over half a million
rows. The hosts have 80 hardware threads, 1TB of memory and fusionIO
storage.

They started seeing long-running autovacuum workers doing anti-wraparound
vacuums. pg_stat_activity showed that workers had been vacuuming a single small
table (e.g., 10k rows) for several hours, yet according to the query log the
actual vacuum took less than a second.

When they updated to 9.4.4 they also increased the number of autovacuum
workers in anticipation of the multixact fix causing extra vacuuming.
Several days later they observed massive catalog bloat, e.g., pg_attribute was
over 200GB of mostly empty pages. This caused new connections to get stuck
in startup as the catalogs no longer fit in the buffer cache.

I then experimented with different numbers of autovacuum workers and
found:

Autovacuum actions per hour, in total and per worker:
Workers   Total   per Worker
   1     2110.1     2110.1
   2     1760.8      880.4
   4      647.3      161.8
   8      386.2       48.3
  72       62.0        0.9
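
The per-worker column and the overall slowdown follow directly from the
totals in the table. As a rough sketch (the variable and function names
here are illustrative only, not part of any reproduction script), the
arithmetic can be checked like this:

```python
# Figures copied from the table above: number of workers -> total
# autovacuum actions per hour.
measurements = {1: 2110.1, 2: 1760.8, 4: 647.3, 8: 386.2, 72: 62.0}


def summarize(data):
    """Return (workers, total, per_worker, fraction_of_baseline) rows.

    The baseline is the single-worker total, so a fraction below 1.0
    means that adding workers reduced overall throughput.
    """
    baseline = data[1]
    rows = []
    for workers, total in sorted(data.items()):
        per_worker = round(total / workers, 1)
        fraction = total / baseline
        rows.append((workers, total, per_worker, fraction))
    return rows


rows = summarize(measurements)
for workers, total, per_worker, fraction in rows:
    print(f"{workers:>3} workers: {total:>7.1f} total, "
          f"{per_worker:>7.1f} per worker, "
          f"{fraction:6.1%} of single-worker throughput")

# Ratio of the single-worker total to the 72-worker total.
slowdown_at_72 = measurements[1] / measurements[72]
```

With these numbers, total throughput at 72 workers is roughly 1/34 of
what a single worker achieved, so the extra workers are not merely idle
but are actively slowing each other down.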

I have analyzed this and created reproduction scripts. I'll send them later
today.
