Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower. - Mailing list pgsql-bugs

From David Gould
Subject Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Msg-id 20151030234140.282542c3@engels
In response to Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Fri, 30 Oct 2015 12:51:43 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Good point ... shouldn't we have already checked the stats before ever
> >> deciding to try to claim the table?
>
> > The second check is there to allow for some other worker (or manual
> > vacuum) having vacuumed it after we first checked, but which had
> > finished before we check the array of current jobs.
>
> I wonder whether that check costs more than it saves.

It does indeed. It drives the stats collector wild. And of course if there
are lots of tables and indexes, the stats temp file gets very large, so
it can take a long time (seconds) to rewrite it. This happens for each
worker for each table that is a candidate for vacuuming.
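
As a back-of-envelope illustration of why adding workers makes things
worse, here is a minimal sketch. It assumes the worst case described above:
every worker triggers one stats file rewrite for every candidate table. The
function name and all the numbers are hypothetical, chosen only for
illustration; they are not measurements from the client system.

```python
def stats_rewrite_seconds(candidate_tables, workers, rewrite_seconds):
    """Total time spent rewriting the stats temp file, assuming one
    rewrite per worker per candidate table (worst case)."""
    return candidate_tables * workers * rewrite_seconds


# Hypothetical inputs: 10,000 candidate tables, 1 second per rewrite.
for workers in (1, 3, 10):
    total = stats_rewrite_seconds(10_000, workers, 1.0)
    print(f"{workers:2d} workers: {total / 3600:.1f} hours of stats file rewrites")
```

Under this model the rewrite cost grows linearly with the worker count
while the amount of useful vacuum work stays the same.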

Since it would not be convenient to provide a copy of the client's 8TB
database, I have made a standalone reproduction. The attached files:

 build_test_instance.sh - create a test instance
 datagen.py             - used by above to populate it with lots of tables
 logbyv.awk             - count auto analyze actions in postgres log
 trace.sh               - strace the stats collector and autovacuum workers
 tracereport.sh         - list top 50 calls in strace output
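
For readers without the attachments, the log counting step can be
approximated as follows. This is not the attached logbyv.awk, only a rough
Python sketch of the same idea. It assumes autovacuum logging is enabled
(log_autovacuum_min_duration) so that "automatic analyze of table" messages
appear, and that log_line_prefix begins with a timestamp of the form
"YYYY-MM-DD HH:MM:SS". The function name count_auto_analyze is hypothetical.

```python
import re
from collections import Counter

# Message text autovacuum writes to the log after an automatic analyze.
ANALYZE_RE = re.compile(r'automatic analyze of table "([^"]+)"')
# Assumption: each log line starts with a timestamp; keep it to the minute.
MINUTE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})")


def count_auto_analyze(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH:MM' to the number of
    automatic analyze messages logged in that minute."""
    per_minute = Counter()
    for line in lines:
        if ANALYZE_RE.search(line):
            stamp = MINUTE_RE.match(line)
            per_minute[stamp.group(1) if stamp else "unknown"] += 1
    return per_minute
```

Pass it any iterable of log lines, for example an open log file, and print
the sorted items to compare analyze throughput between worker settings.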

The test process is to run the build_test_instance script to create an
instance with one database with a large number of tiny tables. During the
setup autovacuuming is off. Then make a tarball of the instance for reuse.
For each test case, untar the instance, set the number of workers and start
it. After a short time autovacuum will start workers to analyze the new
tables. Expect to see the stats collector doing lots of writing.
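
The "large number of tiny tables" part of the setup can be sketched like
this. It is not the attached datagen.py, just a minimal stand-in that emits
SQL to pipe into psql. The table naming scheme, the column layout and the
default of 10 rows per table are all assumptions made for illustration.

```python
def tiny_table_sql(n_tables, rows_per_table=10):
    """Yield SQL statements that create n_tables small tables and put a
    few rows in each, so that autovacuum has something to analyze."""
    for i in range(n_tables):
        name = f"t_{i:06d}"  # hypothetical naming scheme
        yield f"CREATE TABLE {name} (id integer PRIMARY KEY, payload text);"
        yield (
            f"INSERT INTO {name} SELECT g, 'row ' || g "
            f"FROM generate_series(1, {rows_per_table}) AS g;"
        )
```

Writing the statements to a file, one per line, and feeding that file to
psql while autovacuum is off gives an instance in the same spirit as the
one described above.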

You may want to use tmpfs or a ramdisk for the data dir for building the
test instance. The configuration is sized for a reasonable desktop: 8 to
16GB of memory and an SSD.

-dg

--
David Gould              510 282 0869         daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
