monitoring tuple_count vs dead_tuple_count - Mailing list pgsql-performance

From Mariel Cherkassky
Subject monitoring tuple_count vs dead_tuple_count
Date
Msg-id CA+t6e1nkgonSYNCjWucP1yz1NxeeyTgL0cnAACL6KgyJVcgWiA@mail.gmail.com
Whole thread Raw
List pgsql-performance
Hi,
I wrote a script that monitored the size of a specific table of mine(dead tuples_mb vs live tuples_mb). The script run a query on pg_stattuple every 15 minutes : select * from pg_stattuple('table_name'). I know that every night there is a huge delete query that deletes most of the table`s content. In addition, I set the following parameters for the table : 
toast.autovacuum_vacuum_scale_factor=0,
 toast.autovacuum_vacuum_threshold=10000, toast.autovacuum_vacuum_cost_limit=10000
toast.autovacuum_vacuum_cost_delay=5    

After a week of monitoring I generates a csv of the results and I created a graph from that data.  However, the graph that I created confused me very much.
A small sample of all the data that I gathered : 
datetoasted_live_tup_size_MBtoasted_dead_tup_size_mb
6/16/19 0:0058.853794125.68760395
6/16/19 0:158.72510242525.02167416
6/16/19 0:308.66871643125.08410168
6/16/19 0:458.81006622324.94327927
6/16/19 1:008.73218345625.02435684
6/16/19 1:158.6765651720.01097107
6/16/19 1:309.57383251220.76298809
6/16/19 1:459.56231975620.7739706
6/16/19 2:009.56703090721.01560402
6/16/19 2:159.57625389170.62042999
6/16/19 2:309.715950966492.2445602
6/16/19 2:459.59837532801.455843
6/16/19 3:009.5997743611110.201434
6/16/19 3:159.6066713331402.255548
6/16/19 3:309.6016988751698.487226
6/16/19 3:459.6069345472003.051514
6/16/19 4:009.6006412512307.625901
6/16/19 4:159.613205912612.196963
6/16/19 4:309.6066465382916.773588
6/16/19 4:459.612946513221.337314
6/16/19 5:009.6076364523525.914713
6/16/19 5:155.4472188953826.313025
6/16/19 5:309.6210546494130.883012
6/16/19 5:4511.487306594433.29188
6/16/19 6:007.3117456444742.039024
6/16/19 6:1512.313211445135.994677
6/16/19 6:3012.123825075671.512811
6/16/19 6:458.0294485096171.677253
6/16/19 7:007.9556779866666.846472
6/16/19 7:1512.211739547161.934807
6/16/19 7:307.963257797661.273341
6/16/19 7:4512.206234938156.362462
6/16/19 8:007.9602050788655.704986
6/16/19 8:1512.1381969533.60424519
6/16/19 8:3012.2174663557.87192154
6/16/19 8:4512.217996633.52415848
6/16/19 9:0012.1441774433.60204792
6/16/19 9:1512.2195444126.85134888


As you can see in this example, The size of the dead rows from 2am until 8am increased while there isnt any change in the size of the live rows. During that time I know that there were a delete query that run and deleted a lot of rows. That is why I'm confused here, if more dead rows are generated because of a delete, it means that number of live_tuples should be decreased but it doesnt happen here. Any idea why ?

pgsql-performance by date:

Previous
From: Rick Otten
Date:
Subject: materialized view refresh of a foreign table
Next
From: AminPG Jaffer
Date:
Subject: Re: Incorrect index used in few cases..