Thread: Autovacuum is cleaning very less dead tuples

Autovacuum is cleaning very less dead tuples

From

Amarendra Konda

Date:

27 September 2019, 08:40:50

Hi,

As part of vacuum tuning, We have set the below set of parameters.

> select relname,reloptions, pg_namespace.nspname from pg_class join pg_namespace on pg_namespace.oid=pg_class.relnamespace where relname IN('process_instance') and pg_namespace.nspname='public';
relname | reloptions | nspname
--------------+------------------------------------------------------------------------------------------------------------------------------------------------------+---------
process_instance | {autovacuum_vacuum_scale_factor=0,autovacuum_vacuum_threshold=20000,autovacuum_vacuum_cost_limit=1000,autovacuum_vacuum_cost_delay=10} | public

autovaccumm threshold was set for 20,000. However after the vacuuming, it is cleaning up less than 2,000 tuples only. And also vacuuming frequncy was increased as it is becoming eligible for the autovacuuming.
However n_dead_tup value from pg_stat_user_tables was always showing very high value. Most of the time, it is greater than 100K dead tuples.

Overall, we couldn't able to correlate on why autovacuum was able to cleanup only < 2K tuples, even though there are mode dead tuples based on the statistics ? Can you please explain on why we are notcing huge difference and what steps needs to taken to minimize the gap ?

Log message

2019-09-25 00:06:31 UTC::@:[80487]:LOG: automatic vacuum of table "fc_db_web_2.public.process_instance": index scans: 1
pages: 0 removed, 854445 remain, 0 skipped due to pins, 774350 skipped frozen
tuples: 1376 removed, 16819201 remain, 21 are dead but not yet removable
buffer usage: 553118 hits, 9070720 misses, 14175 dirtied
avg read rate: 13.926 MB/s, avg write rate: 0.022 MB/s
system usage: CPU 44.57s/33.04u sec elapsed 5088.65 sec

Table Information

SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
AND relname='process_instance';

relation | total_size
---------------------+------------
public.process_instance | 77 GB

Live and Dead tuples

select relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname='process_instance';
relname | n_live_tup | n_dead_tup
--------------+------------+------------
conversation | 16841596 | 144202

show track_counts;
track_counts
--------------
on

show default_statistics_target;
default_statistics_target
---------------------------
100

Version

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

Thanks in advance.

Regards, Amarendra

Re: Autovacuum is cleaning very less dead tuples

From

Laurenz Albe

Date:

27 September 2019, 08:55:06

On Fri, 2019-09-27 at 11:10 +0530, Amarendra Konda wrote:
> As part of vacuum tuning, We have set the below set of parameters. 
> 
> > select relname,reloptions, pg_namespace.nspname from pg_class join
> pg_namespace on pg_namespace.oid=pg_class.relnamespace where relname
> IN('process_instance') and pg_namespace.nspname='public';
>    relname    |                                                      
>                reloptions                                            
>                          | nspname
> --------------+----------------------------------------------------
> -------------------------------------------------------------------
> -------------------------------+---------
>  process_instance |
> {autovacuum_vacuum_scale_factor=0,autovacuum_vacuum_threshold=20000,a
> utovacuum_vacuum_cost_limit=1000,autovacuum_vacuum_cost_delay=10}    
>           | public

That's not so much tuning as breaking.

You have set autovacuum to run all the time at a snail's pace.
That way, it will have trouble getting any work done.

Don't touch autovacuum_vacuum_scale_factor and
autovacuum_vacuum_threshold. Don't raise autovacuum_vacuum_cost_limit.
If anything, lower autovacuum_vacuum_cost_delay.
 
> However n_dead_tup value from pg_stat_user_tables was always showing
> very high value. Most of the time, it is greater than 100K dead
> tuples.  

That is only a problem if the number of live tuples is less than
500000.

> Overall, we couldn't able to correlate on why autovacuum was able to
> cleanup only < 2K tuples, even though there are mode dead tuples
> based on the statistics ?  Can you please explain on why we are
> notcing huge difference and what steps needs to taken to minimize the
> gap ?

It is questionable if there is a problem at all.

> Log message
> 
>  2019-09-25 00:06:31 UTC::@:[80487]:LOG:  automatic vacuum of table
> "fc_db_web_2.public.process_instance": index scans: 1
> pages: 0 removed, 854445 remain, 0 skipped due to pins, 774350
> skipped frozen
> tuples: 1376 removed, 16819201 remain, 21 are dead but not yet
> removable
> system usage: CPU 44.57s/33.04u sec elapsed 5088.65 sec

This shows that at least this table has no problem.

Even with your settings, autovacuum finished in 5 seconds and
could clean up almost everything.

> Live and Dead tuples
> 
> select relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE
> relname='process_instance';
>    relname    | n_live_tup | n_dead_tup 
> --------------+------------+------------
>  conversation |   16841596 |     144202

Perfect.  There is no problem at all.

The table has less than 20% dead tuples, so everything is in perfect
order.

Just stop fighting windmills.

Give up your tuning attempts and reset all parameters back to the
default.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com