autovacuum and immediate shutdown issues - Mailing list pgsql-general

From Brad Nicholson
Subject autovacuum and immediate shutdown issues
Date
Msg-id 1255967639.4316.24.camel@bnicholson-desktop
Whole thread Raw
Responses Re: autovacuum and immediate shutdown issues
List pgsql-general
If you issue an immediate shutdown to the database, autovacumm will not
process tables that should be vacuumed until manually re-analyzed.

PG 8.3.8

Relevant settings:

autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 10
autovacuum_analyze_threshold = 10
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.05


test=# CREATE TABLE foo (id int);
test=# INSERT INTO foo SELECT generate_series(1,1000);

auto-analyze runs as expected

test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables
where relname ='foo';
-[ RECORD 1 ]----+------------------------------
last_autovacuum  |
last_autoanalyze | 2009-10-19 14:14:47.791658+00


test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 100);

test=# SELECT * from pgstattuple('foo');
-[ RECORD 1 ]------+------
table_len          | 32768
tuple_count        | 900
tuple_len          | 25200
tuple_percent      | 76.9
dead_tuple_count   | 100
dead_tuple_len     | 2800
dead_tuple_percent | 8.54
free_space         | 656
free_percent       | 2

dead_tuple_percent > 5% - autovacuum runs as expected on next pass:

test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables
where relname ='foo';
-[ RECORD 1 ]----+------------------------------
last_autovacuum  | 2009-10-19 14:16:47.910177+00
last_autoanalyze | 2009-10-19 14:14:47.791658+00


--repopulate table
test=# TRUNCATE foo;
test=# INSERT INTO foo SELECT generate_series(1,1000);

Wait for autoanalyze, then before next run of autovacuum

test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 100);
pg_ctl -D data -m immediate stop

restart the postmaster
stats are gone due to the immediate shutdown.

test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables
where relname ='foo';
-[ RECORD 1 ]----+-
last_autovacuum  |
last_autoanalyze |


test=# SELECT * from pgstattuple('foo');
-[ RECORD 1 ]------+------
table_len          | 32768
tuple_count        | 900
tuple_len          | 25200
tuple_percent      | 76.9
dead_tuple_count   | 100
dead_tuple_len     | 2800
dead_tuple_percent | 8.54
free_space         | 656
free_percent       | 2

dead_tuple_percent > 5% of table is dead, autovacuum should pick it up,
but it doesn't (yes, I have waited longer enough).
autoanalyze does not process the table.

Autovacuum will not process this table again until one of the following
two conditions are met:

1: Manually analyze the table
2: an additional 5% (autovacuum_vacuum_scale_factor) of the tuples in
the are dirtied - which amounts to 10% (autovacuum_vacuum_scale_factor *
2) of the actual table being dirtied before it gets vacuumed.

In case 2 - reports from pgstattuple are odd.  After deleting one row.

test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 1);
DELETE 1

test=# SELECT * from pgstattuple('foo');
-[ RECORD 1 ]------+------
table_len          | 32768
tuple_count        | 899
tuple_len          | 25172
tuple_percent      | 76.82
dead_tuple_count   | 1
dead_tuple_len     | 28
dead_tuple_percent | 0.09
free_space         | 3456
free_percent       | 10.55

Stats start over.

The DB should be able to recover cleanly from an immediate shutdown
IMHO.  If the stats are no longer there, I would expect autoanalyze to
run and regenerate them.  This is the same behaviour as when a new table
is created and populated.


A few questions

1: Is this expected behaviour, or a bug?
2: If not a bug, why does autoanalyze not process these tables.  It will
process newly loaded tables when they do not have any stats.
3: What is the best work around for this?  When our HA solution triggers
a DB shutdown, we want it to be immediate.  Currently we run a manual
analyze post recovery, this is bad for really for large databases.
4: after restart, why does pgstattuple shoe dead_tuple_percent = 8.54,
but after deleting one row, it shows dead_tuple_percent = 0.09?
5: on the missing stats - does this mean my query plans are potentially
bad until the stats are regenerated?


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



pgsql-general by date:

Previous
From: Mitesh51
Date:
Subject: Un successful Restoration of DATA using WAL files
Next
From: Sim Zacks
Date:
Subject: cast numeric with scale and precision to numeric plain