autovacuum just stop vacuuming specific table for 7 hours - Mailing list pgsql-performance

From Mariel Cherkassky
Subject autovacuum just stop vacuuming specific table for 7 hours
Date
Msg-id CA+t6e1k2wvczw2w3dG748uTi-jwXKbh1TY_1otjW0_wxJsTUAQ@mail.gmail.com
Whole thread Raw
Responses Re: autovacuum just stop vacuuming specific table for 7 hours
List pgsql-performance
Hi,
I have the next relation in my db : A(id int, info bytea,date timestamp). Every cell in the info column is very big and because of that there is a toasted table with the data of the info column (pg_toast.pg_toast_123456).

The relation contains the login info for every user that logs into the session and cleaned whenever the user disconnected. In case the use doesnt logoff, during the night we clean login info that is older then 3 days.

The toasted table grew very fast (more then 50M+ rows per week) and thats why I set the next autovacuum settings for the table :
Options: 
toast.autovacuum_vacuum_scale_factor=0
toast.autovacuum_vacuum_threshold=10000
toast.autovacuum_vacuum_cost_limit=10000
toast.autovacuum_vacuum_cost_delay=5

Those settings helped but the table still grey very much. I wrote a script that monitored some metadata about the table (pg_stat_all_tables,count(*) from orig and toasted table). I let the system monitor the table for a week and I found out the next info : 

Autovacuum was running great during the whole week and whenever it reached 10k records in the toasted table it started vacuuming the table. However, The db grew dramatically during a period of 7 hours in a specific day. In those 7 hours the table contained more then 10k (and kept increasing) but the autovacuum didnt vacuum the table. I saw that during those 7 hours autovacuum didnt run and as a result of that the table grew to its max size(the current size).

an example of an autovacuum run on the toasted table : 
automatic vacuum of table "db.pg_toast.pg_toast_123456": index scans: 1
        pages: 0 removed, 1607656 remain
        tuples: 6396770 removed, 33778 remain
        buffer usage: 1743021 hits, 3281298 misses, 3217528 dirtied
        avg read rate: 2.555 MiB/s, avg write rate: 2.505 MiB/s
        system usage: CPU 98.44s/54.02u sec elapsed 10034.34 sec

the vacuum hits/misses/dirtied are set to default (1,10,20) 

autovacuum workers - 16
maintenance_work_mem - 200MB
130GB RAM , 23 CPU
Can anyone explain why suddenly the autovacuum should stop working for that long period ?

pgsql-performance by date:

Previous
From: Ian Harding
Date:
Subject: Re: Server upgrade advice [xpost]
Next
From: Justin Pryzby
Date:
Subject: Re: autovacuum just stop vacuuming specific table for 7 hours