Thread: Autovacuum/Analyze Doesn't seem to be running properly
Hello,
I have a cluster with these parameters (no changed settings for any of the database/tables):
autovacuum | on
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold | 25
autovacuum_freeze_max_age | 200000000
autovacuum_max_workers | 6
autovacuum_multixact_freeze_max_age | 400000000
autovacuum_naptime | 15s
autovacuum_vacuum_cost_delay | 20ms
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.1
autovacuum_vacuum_threshold | 25
autovacuum_work_mem | -1
autovacuum | on
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold | 25
autovacuum_freeze_max_age | 200000000
autovacuum_max_workers | 6
autovacuum_multixact_freeze_max_age | 400000000
autovacuum_naptime | 15s
autovacuum_vacuum_cost_delay | 20ms
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.1
autovacuum_vacuum_threshold | 25
autovacuum_work_mem | -1
I have about 20 database with thousands of tuples (live and dead) but when I look at the pg_stat_user_tables within each database:
SELECT
schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum, last_analyze
FROM
pg_stat_user_tables;
all n_live_tups are 0 and all dead tups are 0, with null data for last_vacuum, and last_analyze which seems very odd to me.
I can see there's live rows with select count and then I analyze a database, then I see live and dead tuples, and last_analyze shows the timestamp of when I just ran ANALYZE;
Am I missing something here or is this abnormal behaviour?
I'm also seeing my databases are above 90% towards hitting my autovacuum_max_freeze-age using this command.
SELECT freez, txns, ROUND(100*(txns/freez::float)) AS perc, datname |
FROM (SELECT foo.freez::int, age(datfrozenxid) AS txns, datname. |
FROM pg_database d JOIN (SELECT setting AS freez FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') AS foo |
I also have looked for vacuum via ps -ef | grep vacuum but didn't get any return.
And select * from pg_stat_progress_vacuum shows nothing as well.
I shouldn't have to create cron jobs to run analyze and vacuum manually for each database, but I don't know how to further investigate this issue.
Has anyone seen this before? Any advise would be greatly appreciated!
Thanks in advance for the help,
Chris
On Mon, 2020-03-02 at 19:57 -0700, Chris Kim wrote: > I have a cluster with these parameters (no changed settings for any of the database/tables): > > autovacuum | on > autovacuum_analyze_scale_factor | 0.05 > autovacuum_analyze_threshold | 25 > autovacuum_freeze_max_age | 200000000 > autovacuum_max_workers | 6 > autovacuum_multixact_freeze_max_age | 400000000 > autovacuum_naptime | 15s > autovacuum_vacuum_cost_delay | 20ms > autovacuum_vacuum_cost_limit | -1 > autovacuum_vacuum_scale_factor | 0.1 > autovacuum_vacuum_threshold | 25 > autovacuum_work_mem | -1 > > I have about 20 database with thousands of tuples (live and dead) but when I look at the pg_stat_user_tables within eachdatabase: > SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum, last_analyze > FROM pg_stat_user_tables; > > all n_live_tups are 0 and all dead tups are 0, with null data for last_vacuum, and last_analyze which seems very odd tome. > > I can see there's live rows with select count and then I analyze a database, then I see live and dead tuples, and last_analyzeshows the timestamp of when I just ran ANALYZE; > > Am I missing something here or is this abnormal behaviour? > > I'm also seeing my databases are above 90% towards hitting my autovacuum_max_freeze-age using this command. > SELECT freez, txns, ROUND(100*(txns/freez::float)) AS perc, datname > FROM (SELECT foo.freez::int, age(datfrozenxid) AS txns, datname. > FROM pg_database d JOIN (SELECT setting AS freez FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') AS foo > ON (true) WHERE d.datallowconn) AS foo2 ORDER BY 3 DESC, 4 ASC; > > I also have looked for vacuum via ps -ef | grep vacuum but didn't get any return. > > And select * from pg_stat_progress_vacuum shows nothing as well. > > I shouldn't have to create cron jobs to run analyze and vacuum manually for each database, but I don't know how to furtherinvestigate this issue. > > Has anyone seen this before? Any advise would be greatly appreciated! Some ideas: - Did somebody run SELECT pg_stat_reset(); That would reset the statistics and explain that you see nothing in "pg_stat_user_tables". - After you ANALYZE the database, do you see something in "pg_stat_user_tables"? You should see something then. - Another explaination for the missing statistics could be a problem with the statistics collector process. Do you see and pertinent errors in the database log? If you restart the database, do you get errors from the statistics collector? Does a restart fix the problem? - Are the "track_activities" and "track_counts" parameters set to on? - Hitting 90% of autovacuum_freeze_max_age is no problem. It only means that you'll get some anti-wraparound autovacuums some time soon. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
In my testing it looks like pg_stat_reset was ran, but I haven't been able to verify this from my logs.
Is there another way the stats could get reset?
I also found that I need to look at the column last_autoanalyze, not last_analyze.
Does autovacuum use the stats from pg_stat_user_table meaning if there 50 live tuples and 1000 dead tuples, but then the stats are reset, those dead tuples would never get cleaned up until there was new activity on that table?
Thank you,
Chris
On Tue, Mar 3, 2020, 1:34 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Mon, 2020-03-02 at 19:57 -0700, Chris Kim wrote:
> I have a cluster with these parameters (no changed settings for any of the database/tables):
>
> autovacuum | on
> autovacuum_analyze_scale_factor | 0.05
> autovacuum_analyze_threshold | 25
> autovacuum_freeze_max_age | 200000000
> autovacuum_max_workers | 6
> autovacuum_multixact_freeze_max_age | 400000000
> autovacuum_naptime | 15s
> autovacuum_vacuum_cost_delay | 20ms
> autovacuum_vacuum_cost_limit | -1
> autovacuum_vacuum_scale_factor | 0.1
> autovacuum_vacuum_threshold | 25
> autovacuum_work_mem | -1
>
> I have about 20 database with thousands of tuples (live and dead) but when I look at the pg_stat_user_tables within each database:
> SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum, last_analyze
> FROM pg_stat_user_tables;
>
> all n_live_tups are 0 and all dead tups are 0, with null data for last_vacuum, and last_analyze which seems very odd to me.
>
> I can see there's live rows with select count and then I analyze a database, then I see live and dead tuples, and last_analyze shows the timestamp of when I just ran ANALYZE;
>
> Am I missing something here or is this abnormal behaviour?
>
> I'm also seeing my databases are above 90% towards hitting my autovacuum_max_freeze-age using this command.
> SELECT freez, txns, ROUND(100*(txns/freez::float)) AS perc, datname
> FROM (SELECT foo.freez::int, age(datfrozenxid) AS txns, datname.
> FROM pg_database d JOIN (SELECT setting AS freez FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') AS foo
> ON (true) WHERE d.datallowconn) AS foo2 ORDER BY 3 DESC, 4 ASC;
>
> I also have looked for vacuum via ps -ef | grep vacuum but didn't get any return.
>
> And select * from pg_stat_progress_vacuum shows nothing as well.
>
> I shouldn't have to create cron jobs to run analyze and vacuum manually for each database, but I don't know how to further investigate this issue.
>
> Has anyone seen this before? Any advise would be greatly appreciated!
Some ideas:
- Did somebody run
SELECT pg_stat_reset();
That would reset the statistics and explain that you see nothing in "pg_stat_user_tables".
- After you ANALYZE the database, do you see something in "pg_stat_user_tables"?
You should see something then.
- Another explaination for the missing statistics could be a problem with the
statistics collector process. Do you see and pertinent errors in the database log?
If you restart the database, do you get errors from the statistics collector?
Does a restart fix the problem?
- Are the "track_activities" and "track_counts" parameters set to on?
- Hitting 90% of autovacuum_freeze_max_age is no problem.
It only means that you'll get some anti-wraparound autovacuums some time soon.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On 2020-Mar-04, Chris Kim wrote: > In my testing it looks like pg_stat_reset was ran, but I haven't been able > to verify this from my logs. > > Is there another way the stats could get reset? Yes, a crash/restart does that. > Does autovacuum use the stats from pg_stat_user_table meaning if there 50 > live tuples and 1000 dead tuples, but then the stats are reset, those dead > tuples would never get cleaned up until there was new activity on that > table? That's correct. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thanks anyone for your help. This has helped me better understand autoanalyze.
Thanks,
Chris
On Wed, Mar 4, 2020, 1:53 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2020-Mar-04, Chris Kim wrote:
> In my testing it looks like pg_stat_reset was ran, but I haven't been able
> to verify this from my logs.
>
> Is there another way the stats could get reset?
Yes, a crash/restart does that.
> Does autovacuum use the stats from pg_stat_user_table meaning if there 50
> live tuples and 1000 dead tuples, but then the stats are reset, those dead
> tuples would never get cleaned up until there was new activity on that
> table?
That's correct.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services