The situation appears to be stable now, but upon running REINDEX and VACUUM on one of the databases, I get the following:
WARNING: relation "pg_attrdef" TID 1/1: OID is invalid
WARNING: relation "pg_attrdef" TID 1/2: OID is invalid
WARNING: relation "pg_attrdef" TID 1/3: OID is invalid
...
Should I drop the database and restore it from a backup? My most recent backup is from late September, so I would lose some data. I also backed up what I could as soon as the disks started giving errors, but I don't know if I can trust that.
Should I drop the entire cluster?
Are you receiving any error messages while taking a database dump with the "pg_dump" utility?
I am getting these when running VACUUM on the live database. Earlier I ran pg_dump. It failed on the whole database because of I/O errors, so I did a series of more selective dumps, schema by schema and excluding the offending tables, until I got no errors or warnings. In the live database I dropped the few tables that were responsible for the pg_dump errors.
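For reference, the schema-by-schema dumps can be scripted roughly as follows. This is only a sketch: the database name, schema names, and excluded table are hypothetical placeholders, and the choice of the custom archive format is an assumption, not something stated above. It only builds the pg_dump command lines and does not run them.

```python
import shlex


def selective_dump_commands(dbname, schemas, excluded_tables):
    """Build one pg_dump command per schema, skipping known-bad tables.

    excluded_tables maps a schema name to the tables in it that pg_dump
    could not read (hypothetical names in the example below).
    """
    commands = []
    for schema in schemas:
        cmd = ["pg_dump", "--format=custom", f"--schema={schema}"]
        # Schema-qualify each excluded table so the pattern does not
        # accidentally match a same-named table in another schema.
        for table in excluded_tables.get(schema, []):
            cmd.append(f"--exclude-table={schema}.{table}")
        cmd += [f"--file={schema}.dump", dbname]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical database, schemas, and damaged table.
    for cmd in selective_dump_commands(
        "mydb", ["public", "sales"], {"sales": ["orders_2009"]}
    ):
        print(shlex.join(cmd))
```

Keeping one output file per schema means a dump that hits an I/O error can be retried on its own without redoing the others.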
If you are not receiving any error messages, try taking a database dump, restoring the dump file on another database server, and performing all sanity checks there. If all the sanity checks pass, you can create a new cluster and restore all the database dumps.
Any suggestions on what sanity checks I should run? For now I am doing reindex and vacuum. I think that forces everything to be read. I am assuming that if vacuum completes without error then the table is fine.
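One small aid while going through the VACUUM output is to count the "OID is invalid" warnings per relation, to see which catalogs or tables are affected. The sketch below assumes the warnings look exactly like the three quoted at the top of this thread; any other message format is ignored, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   WARNING: relation "pg_attrdef" TID 1/1: OID is invalid
# The pattern is inferred from the warnings quoted above (an assumption).
WARNING_RE = re.compile(
    r'WARNING:\s+relation "([^"]+)" TID (\d+)/(\d+): OID is invalid'
)


def summarize_invalid_oids(log_text):
    """Return a Counter mapping relation name to number of warnings."""
    return Counter(m.group(1) for m in WARNING_RE.finditer(log_text))


if __name__ == "__main__":
    sample = (
        'WARNING: relation "pg_attrdef" TID 1/1: OID is invalid\n'
        'WARNING: relation "pg_attrdef" TID 1/2: OID is invalid\n'
        'WARNING: relation "pg_attrdef" TID 1/3: OID is invalid\n'
    )
    for relation, count in summarize_invalid_oids(sample).items():
        print(relation, count)
```

This only summarizes what VACUUM already reported; it does not by itself show that relations without warnings are intact.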