Re: Heavily modified big table bloat even in auto vacuum is running - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Heavily modified big table bloat even in auto vacuum is running
Date
Msg-id CAA4eK1+4_J2BUs4AAhD+Q98ZNRkpF2FiCNuoHq8WWU_mfiqbpw@mail.gmail.com
Whole thread Raw
In response to Re: Heavily modified big table bloat even in auto vacuum is running  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Heavily modified big table bloat even in auto vacuum is running  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Jan 19, 2014 at 5:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> I am marking this (based on patch vacuum_fix_v7_nkeep.patch) as Ready
>> For Committer.
>
> I've reviewed and committed this patch, with one significant change.
> If you look at the way that vacuumlazy.c computes new_rel_tuples, it's
> based on scanned_tuples (lazy_scan_heap's num_tuples), which is the total
> number of surviving tuples *including* the recently-dead ones counted in
> nkeep.  This is the number that we want to put into pg_class.reltuples,
> I think, but it's wrong for the pgstats stuff to use it as n_live_tuples
> if we're going to count the recently-dead ones as dead.  That is, if we're
> improving the approximation that n_dead_tuples is zero after a vacuum,
> the fix should involve reducing the n_live_tuples estimate as well as
> increasing the n_dead_tuples estimate.
>
> Using your test script against the unpatched code, it's easy to see that
> there's a large (and wrong) value of n_live_tup reported by an autovacuum,
> which gets corrected by the next autoanalyze.  For instance note these
> successive readouts from the pg_stat_all_tables query:
>
>  n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
> ------------+------------------+-------------------+------------
>      497365 |                9 |                 8 |    4958346
>      497365 |                9 |                 8 |    5458346
>     1186555 |               10 |                 8 |          0
>     1186555 |               10 |                 8 |     500000
>      499975 |               10 |                 9 |    2491877
>
> Since we know the true number of live tuples is always exactly 500000
> in this test, that jump is certainly wrong.  With the committed patch,
> the behavior is significantly saner:
>
>  n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
> ------------+------------------+-------------------+------------
>      483416 |                2 |                 2 |    5759861
>      483416 |                2 |                 2 |    6259861
>      655171 |                3 |                 2 |     382306
>      655171 |                3 |                 2 |     882306
>      553942 |                3 |                 3 |    3523744
>
> Still some room for improvement, but it's not so silly anymore.
>
> It strikes me that there may be an obvious way to improve the number
> further, based on the observation in this thread that nkeep doesn't need
> to be scaled up because VACUUM should have scanned every page that could
> contain dead tuples.  Namely, that we're arriving at new_rel_tuples by
> scaling up num_tuples linearly --- but perhaps we should only scale up
> the live-tuples fraction of that count, not the dead-tuples fraction.
> By scaling up dead tuples too, we are presumably still overestimating
> new_rel_tuples somewhat, and the behavior that I'm seeing with this test
> script seems to confirm that.

After reading your analysis, first thought occurred to me is that we can
directly subtract nkeep from num_tuples to account for better scaling
of live tuples, but I think the scaling routine vac_estimate_reltuples()
is expecting scanned_tuples and this routine is shared by both
Analyze and Vacuum where the mechanism to calculate the live
and dead tuples seems to be bit different, so may be directly passing
a subtract of num_tuples and nkeep to this routine might create some
problem. However I think this idea is definitely worth pursuing to
improve the estimates of live tuples in Vacuum.

> I haven't time to pursue this idea at the moment, but perhaps someone else would like to.

I think this idea is worth to be added in Todo list.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Next
From: Jov
Date:
Subject: Re: improve the help message about psql -F