Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower. - Mailing list pgsql-bugs

From David Gould
Subject Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date
Msg-id 20160302034537.0b1c2da7@engels
In response to Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
List pgsql-bugs
On Mon, 29 Feb 2016 18:33:50 -0300
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Hi David, did you ever post an updated version of this patch?

No. Let me fix that now. I've attached my current revision of the patch
based on master. This version is significantly better than the original
version and resolves multiple issues:

 - autovacuum workers no longer race each other
 - autovacuum workers do not revacuum each other's tables
 - autovacuum workers no longer thrash the stats collector, which saves a
   lot of IO when the stats file is large.

Hopefully the earlier discussion and the comments in the patch are
sufficient explanation, but please ask if anything is unclear.

The result is that on a freshly created database of 40,000 tiny tables
that all need an initial analyze, unpatched postgres saturates an SSD
updating the stats and manages to process only about 85 tables per
minute in the run summarized below. With the attached patch it processes
several thousand tables per minute.
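
A test database of this shape could be generated along the following
lines. This is only a sketch: the table names (t_00000 and so on), the
single-column layout, and the 100-row insert are assumptions for
illustration, not the script used for the measurements in this mail. The
row count is chosen only so that each table exceeds the default
autovacuum_analyze_threshold of 50 changed rows.

```python
def make_ddl(n_tables=40000, rows=100, prefix="t_"):
    """Yield SQL statements that create n_tables tiny tables.

    Table names and layout are hypothetical. Each table gets `rows`
    rows so that autovacuum considers it in need of an analyze.
    """
    for i in range(n_tables):
        name = f"{prefix}{i:05d}"
        yield f"CREATE TABLE {name} (id integer);"
        yield f"INSERT INTO {name} SELECT generate_series(1, {rows});"


# Example use: write the statements to a file and feed it to psql.
# Statements are deliberately not wrapped in one transaction, since
# creating this many tables in a single transaction can exhaust the
# shared lock table.
#
#   with open("make_tables.sql", "w") as f:
#       for stmt in make_ddl():
#           f.write(stmt + "\n")
```

Loading the resulting file with psql in autocommit mode leaves every
table with no statistics, so all of them become analyze candidates at
once.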

The following is a summary of strace output for the autovacuum workers
and the stats collector while the 40k table test is running. The counts and
times are the cost per table.

postgresql 9.5:   85 tables per minute.

     Operations per Table
 calls    msec    system call        [ 4 autovacuum workers ]
------  ------    -------------------
 19.46  196.09    select(0,          <<< Waiting for stats snapshot
  3.26 1040.46    semop(43188238,    <<< Waiting for AutovacuumScheduleLock
  2.05    0.83    sendto(8,          <<< Asking for stats snapshot

 calls    msec    system call        [ stats collector ]
------  ------    -------------------
  3.02    0.05    recvfrom(8,        <<< Request for snapshot refresh
  1.55  248.64    rename("pg_stat_tmp/db_16385.tmp",  <<< Snapshot refresh


+ autovacuum contention patch: 5225 tables per minute

     Operations per Table
 calls    msec    system call        [ 4 autovacuum workers ]
------  ------    -------------------
  0.63    6.34    select(0,          <<< Waiting for stats snapshot
  0.21    0.01    sendto(8,          <<< Asking for stats snapshot
  0.07    0.00    semop(43712518,    <<< Waiting for AutovacuumLock

 calls    msec    system call        [ stats collector ]
------  ------    -------------------
  0.40    0.01    recvfrom(8,        <<< Request for snapshot refresh
  0.04    6.75    rename("pg_stat_tmp/db_16385.tmp",  <<< Snapshot refresh
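
To put the two runs side by side, the figures above can be reduced to a
few ratios. The sketch below only re-uses numbers from the two summaries
in this mail; the variable names and the decision to sum the listed
worker system calls per table are illustrative assumptions, and the sums
cover only the calls shown, not all of the time a worker spends.

```python
# Per-table figures copied from the strace summaries (4 autovacuum workers).
before = {
    "tables_per_min": 85,
    "worker_msec": {"select": 196.09, "semop": 1040.46, "sendto": 0.83},
    "rename_msec": 248.64,
}
after = {
    "tables_per_min": 5225,
    "worker_msec": {"select": 6.34, "sendto": 0.01, "semop": 0.00},
    "rename_msec": 6.75,
}


def worker_wait(run):
    """Sum the per-table msec of the worker system calls listed for a run."""
    return sum(run["worker_msec"].values())


throughput_gain = after["tables_per_min"] / before["tables_per_min"]
wait_before = worker_wait(before)
wait_after = worker_wait(after)
rename_gain = before["rename_msec"] / after["rename_msec"]

print(f"throughput gain:        {throughput_gain:.1f}x")
print(f"worker wait per table:  {wait_before:.2f} -> {wait_after:.2f} msec")
print(f"snapshot rename cost:   {rename_gain:.1f}x lower per table")
```

Going from 85 to 5225 tables per minute is roughly a 61x increase in
throughput, and most of the per-table time removed is the semop wait on
the schedule lock and the select wait for a stats snapshot.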


Regards,

-dg


--
David Gould                                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
