Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate. - Mailing list pgsql-hackers

From David Gould
Subject Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date
Msg-id 20180304151857.11921586@engels
In response to Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
List pgsql-hackers
On Sun, 4 Mar 2018 07:49:46 -0800
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Wed, Jan 17, 2018 at 4:49 PM, David Gould <daveg@sonic.net> wrote:

> > # analyze verbose pg_attribute;
> > INFO:  "pg_attribute": scanned 30000 of 24519424 pages, containing 6475
> > live rows and 83 dead rows; 6475 rows in sample, 800983035 estimated total
> > rows.
> 
> I can see how this issue would prevent ANALYZE from fixing the problem, but
> I don't see how it could have caused the problem in the first place.  In
> your demonstration case, you had to turn off autovac in order to get it to
> happen, and then when autovac is turned back on, it is all primed for an
> autovac to launch, go through, touch almost all of the pages, and fix it
> for you.  How did your original table get into a state where this wouldn't
> happen?
> 
> Maybe a well-timed crash caused n_dead_tup to get reset to zero and that is
> why autovac is not kicking in?  What are the pg_stat_user_table number and
> the state of the visibility map for your massively bloated table, if you
> still have them?

We see this sort of thing pretty routinely on more than just catalogs, but
catalogs are where it really hurts. These systems are 40 cores/80 threads, 1
TB memory, Fusion IO. Databases are 5 to 10 TB with 100,000 to 200,000 tables.
Tables are updated in batches every few minutes, 100 threads at a time. There
are also some long running queries that don't help. Due to the large number of
tables and high rate of mutation it can take a long time between visits from
autovacuum, especially since autovacuum builds a list of pending work and
then processes it to completion so new tables in need of vacuum can't even be
seen until all the old work is done. For what it is worth, streaming
replication doesn't work either, as the single-threaded recovery can't keep up
with the 80-thread mutator.

We tried relying on external scripts to address the most bloated tables, but
those also depended on reltuples to detect bloat, so they missed a lot. For a
long time we simply had recurring crises. Once I figured out that ANALYZE
could not set reltuples effectively, we worked around it by running ANALYZE
VERBOSE on all the large tables and parsing the notices to calculate the
row count the same way as in the patch. This works, but it is a nuisance.
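
Roughly, the parsing and extrapolation look like this (a simplified Python
sketch rather than our actual script; the regex and the names are just my
illustration, and the arithmetic is only my reading of what the patch does,
trusting the sample density alone):

    import re

    # Matches the INFO line from ANALYZE VERBOSE (assumed unwrapped), e.g.
    # INFO:  "pg_attribute": scanned 30000 of 24519424 pages, containing
    #        6475 live rows and 83 dead rows; ...
    INFO_RE = re.compile(
        r'"(?P<rel>[^"]+)": scanned (?P<scanned>\d+) of (?P<total>\d+) pages,'
        r'\s+containing (?P<live>\d+) live rows')

    def rows_from_analyze_verbose(info_line):
        """Extrapolate total rows from the sampled pages alone."""
        m = INFO_RE.search(info_line)
        if m is None:
            return None
        scanned = int(m.group('scanned'))
        total = int(m.group('total'))
        live = int(m.group('live'))
        if scanned == 0:
            return 0
        # Trust only the observed sample density, scaled to the whole table.
        return int(round(live / scanned * total))

For the pg_attribute example above this gives 6475 / 30000 * 24519424, about
5.3 million rows, which is what the sample actually implies rather than the
800 million reported as the estimated total.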

The main pain point is that once reltuples gets inflated there is no way to
fix it: autovacuum stops looking at the table, and a hand-run ANALYZE can't
reset reltuples. The only cure is VACUUM FULL, but that is not practical
without unacceptable amounts of downtime.
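
To spell out why a hand-run ANALYZE can't pull the number back down: as I read
the current code, the new reltuples is a moving average that weights the old
density by the fraction of pages not scanned, so a 30000-page sample out of
24.5 million pages barely moves an inflated value. A simplified sketch in
Python (not the server code verbatim, and the 800 million starting value is
just an assumption matching the example above):

    def estimate_reltuples(old_reltuples, old_pages,
                           scanned_pages, scanned_tuples, total_pages):
        # The old density only gets nudged toward the sampled density in
        # proportion to the fraction of the table that was scanned.
        old_density = old_reltuples / old_pages
        new_density = scanned_tuples / scanned_pages
        multiplier = scanned_pages / total_pages
        updated_density = old_density + (new_density - old_density) * multiplier
        return int(updated_density * total_pages)

    # Assumed starting point: reltuples already inflated to ~800 million.
    print(estimate_reltuples(801_000_000, 24_519_424,
                             30_000, 6_475, 24_519_424))
    # Still ~800 million, even though the sample implies ~5.3 million rows.

With only 30000 of 24.5 million pages in the multiplier, each ANALYZE can move
the estimate by a fraction of a percent, so the inflated value effectively
never converges back down.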

-dg

-- 
David Gould                                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

