I'm running PostgreSQL 9.2.4 on Ubuntu 12.04.3 LTS. It's used by an application that has recently gone from demo (very, very low usage) to production (low, but constant, usage). The autovacuum daemon is running, and the zero_damaged_pages configuration parameter is off.
Since Postgres 9.2 doesn't have block checksums, I implemented a poor man's consistency check: on the backup server, every backup is unpacked and pg_filedump <db_file> | grep Error is run on every database file. I had never seen any error reported by it until last Sunday, when the first one appeared.
It turned out that pg_filedump reported errors for 34 database files, but only because each of those files had one or more 8 kB database pages at the end of the file zeroed out, so pg_filedump couldn't find a proper page header and reported an error. The same thing was visible on the main server, so it wasn't an error present only on the backup server. Database files stay in this condition for a while (I'm not sure how long) and then the zeroed-out pages disappear. Postgres doesn't report any kind of error.
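For a single suspect file, a minimal sketch along these lines can confirm that the trailing page really is all zero bytes (the file path below is a made-up example, and the default 8 kB page size is assumed):

# Sketch only: check whether the last 8 kB page of one data file is all zeros.
# The path is a hypothetical example, not one of the 34 affected files.
f=base/16384/16397
tail -c 8192 "$f" | cmp -s - <(head -c 8192 /dev/zero) \
    && echo "$f: last page is zeroed" \
    || echo "$f: last page contains data"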
I want to ask if this behaviour is normal.
I know that autovacuum can truncate a database file when its last page(s) contain no live data, so this could be that, but it seems weird that the pages would first be zeroed out and only some time later the file truncated. Why not truncate it immediately? A small demonstration of the truncation itself is sketched below.
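The truncation behaviour is easy to reproduce by hand; this sketch (assuming a scratch database reachable via plain psql and a throwaway table name t, both just examples) shows VACUUM shrinking a relation's file once its trailing pages hold no live rows:

psql -c "CREATE TABLE t AS SELECT g FROM generate_series(1, 100000) g;"
psql -c "SELECT pg_relation_size('t');"    # size with live data, several pages
psql -c "DELETE FROM t;"
psql -c "VACUUM t;"                        # manual vacuum; autovacuum does the same eventually
psql -c "SELECT pg_relation_size('t');"    # trailing pages with no live tuples are truncated away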
My checking incantation is:
cd "$PGDATA"
for f in $(find base global -type f -name '[0-9]*' \! -empty); do
    pg_filedump "$f" | grep -q Error
    [[ $? -eq 0 ]] && echo "$f"
done
And later, to verify that all of the errors are caused by zeroed-out pages at the end of the file: