Thread: invalid OID warning after disk failure
I'm running PostgreSQL 9.3.5 on Ubuntu 14.04 on x86_64. The database directory is on linux mdadm RAID10, using 4 4TB disks and a far=2 layout. While the RAID tolerates 1 drive failure nicely, I had the misfortune of 2 drives failing consecutively, one of which had many sectors reallocated and began failing SMART criteria. That one is out now. As a result of this some files were corrupted. I was getting the following errors on some tables: ERROR: could not read block 0 in file "base/27810/3995569": Input/output error but after dropping those tables the errors are gone. The situation appears to be stable now, but upon running REINDEX and VACUUM on one of the databases, I get the following: WARNING: relation "pg_attrdef" TID 1/1: OID is invalid WARNING: relation "pg_attrdef" TID 1/2: OID is invalid WARNING: relation "pg_attrdef" TID 1/3: OID is invalid ... Should I drop the database and restore it from a backup? My most recent backup is from late September, so I would lose some data. I also backed up what I could as soon as the disks started giving errors, but I don't know if I can trust that. Should I drop the entire cluster? Regarding hardware, I'm going to add hot standby drives to prevent this from happening in the future. Thanks in advance for your advice. Regards, Gabriel
On Mon, Nov 17, 2014 at 10:20 AM, Gabriel Sánchez Martínez <gabrielesanchez@gmail.com> wrote:
The situation appears to be stable now, but upon running REINDEX and VACUUM on one of the databases, I get the following:
WARNING: relation "pg_attrdef" TID 1/1: OID is invalid
WARNING: relation "pg_attrdef" TID 1/2: OID is invalid
WARNING: relation "pg_attrdef" TID 1/3: OID is invalid
...
Should I drop the database and restore it from a backup? My most recent backup is from late September, so I would lose some data. I also backed up what I could as soon as the disks started giving errors, but I don't know if I can trust that.
Should I drop the entire cluster?
are you receiving any kind of error messages,while taking database dump with "pg_dump" utility.
If you are not receiving any kind of error message,try to take a database dump and restore database dump file in another database server and perform all sanity checks. If all sanity checks are working fine,you can create a new cluster and restore all database dump's.
Thanks & Regards
Raghu Ram
On Mon, Nov 17, 2014 at 10:20 AM, Gabriel Sánchez Martínez <gabrielesanchez@gmail.com> wrote:
The situation appears to be stable now, but upon running REINDEX and VACUUM on one of the databases, I get the following:
WARNING: relation "pg_attrdef" TID 1/1: OID is invalid
WARNING: relation "pg_attrdef" TID 1/2: OID is invalid
WARNING: relation "pg_attrdef" TID 1/3: OID is invalid
...
Should I drop the database and restore it from a backup? My most recent backup is from late September, so I would lose some data. I also backed up what I could as soon as the disks started giving errors, but I don't know if I can trust that.
Should I drop the entire cluster?are you receiving any kind of error messages,while taking database dump with "pg_dump" utility.
I am getting these when running vacuum on the live database. Earlier I ran pg_dump. It was not working for the whole database due to I/O errors, so i did a series of more selective dumps, schema by schema and excluding the offending tables, so that I didn't get any errors or warnings. In the live database I dropped the few tables that were responsible for the pg_dump errors.
If you are not receiving any kind of error message,try to take a database dump and restore database dump file in another database server and perform all sanity checks. If all sanity checks are working fine,you can create a new cluster and restore all database dump's.Thanks & RegardsRaghu Ram
On Mon, Nov 17, 2014 at 10:20 AM, Gabriel Sánchez Martínez <gabrielesanchez@gmail.com> wrote:
The situation appears to be stable now, but upon running REINDEX and VACUUM on one of the databases, I get the following:
WARNING: relation "pg_attrdef" TID 1/1: OID is invalid
WARNING: relation "pg_attrdef" TID 1/2: OID is invalid
WARNING: relation "pg_attrdef" TID 1/3: OID is invalid
...
Should I drop the database and restore it from a backup? My most recent backup is from late September, so I would lose some data. I also backed up what I could as soon as the disks started giving errors, but I don't know if I can trust that.
Should I drop the entire cluster?are you receiving any kind of error messages,while taking database dump with "pg_dump" utility.I am getting these when running vacuum on the live database. Earlier I ran pg_dump. It was not working for the whole database due to I/O errors, so i did a series of more selective dumps, schema by schema and excluding the offending tables, so that I didn't get any errors or warnings. In the live database I dropped the few tables that were responsible for the pg_dump errors.If you are not receiving any kind of error message,try to take a database dump and restore database dump file in another database server and perform all sanity checks. If all sanity checks are working fine,you can create a new cluster and restore all database dump's.
Any suggestions on what sanity checks I should run? For now I am doing reindex and vacuum. I think that forces everything to be read. I am assuming that if vacuum completes without error then the table is fine.
Thanks & RegardsRaghu Ram
Gabriel Sánchez Martínez wrote: > Any suggestions on what sanity checks I should run? For now I am > doing reindex and vacuum. I think that forces everything to be read. I > am assuming that if vacuum completes without error then the table is > fine. The main sanity check is to make sure the disks are now in the recycling center and you have rebuilt the array with new ones. Then restore the whole database in a freshly initdb'd system, and praise yourself that you were able to get them before it all blew to pieces. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services