Thread: Corrupt view in PostgreSQL 9.0.9
Hi, First some info about the machine and the postgres version: os: Debian 6.0.5 (64-bit) postgres version: PostgreSQL 9.0.9 on x86_64-iso-8859-1-linux-gnu, compiled by GCC gcc (Debian 4.4.5-8) 4.4.5, 64-bit The problem: I first noticed pg_dump giving the following error message when trying to dump the database: pg_dump: invalid column numbering in table "communitytaxiexceptions" then when I connect to the database with psql and try to drop the view I get the following error: # drop view communitytaxiexceptions ; ERROR: invalid attribute number 0 for communitytaxiexceptions and a select gives the same error message: # select * from communitytaxiexceptions ; ERROR: invalid attribute number 0 for communitytaxiexceptions LINE 1: select * from communitytaxiexceptions ; ^ Is there anything thing I could do to manually remove the view or fix the problem? Thanks in advance, Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: > # drop view communitytaxiexceptions ; > ERROR: invalid attribute number 0 for communitytaxiexceptions That's pretty odd --- it implies something nasty has happened to the contents of pg_attribute. It would be interesting to have a look atselect * from pg_attribute where attrelid = 'communitytaxiexceptions'::regclass; > Is there anything thing I could do to manually remove the view or fix > the problem? Well, you could manually remove its pg_class and pg_type rows, and that should be close enough to being "gone". But what I'm worried about is what other damage there is. regards, tom lane
On 2012-09-28 16:35, Tom Lane wrote: > Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: >> # drop view communitytaxiexceptions ; >> ERROR: invalid attribute number 0 for communitytaxiexceptions > > That's pretty odd --- it implies something nasty has happened to the > contents of pg_attribute. It would be interesting to have a look at > select * from pg_attribute where attrelid = 'communitytaxiexceptions'::regclass; Here is the result of the above query, I hope it's readable and not mangled: # select * from pg_attribute where attrelid = 'communitytaxiexceptions'::regclass; attrelid | attname | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyva l | attstorage | attalign | attnotnull | atthasdef | attisdropped | attislocal | attinhcount | attacl | attoptions ----------+------------------+----------+---------------+--------+--------+----------+-------------+-----------+-------- --+------------+----------+------------+-----------+--------------+------------+-------------+--------+------------ 16547| commun | 25 | -1 | -1 | 1 | 0 | -1 | -1 | f | x |i | f | f | f | t | 0 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 16547 | unda | 0 | 700 | 0 | 0 | 0 | 2883584 | 1245245 | t | | | t | t | t | f | 0 | | 16547 | undantagid | 25 | -1 | -1 | 6 | 0 | -1 | -1 | f | x | i | f | f | f | t | 0 | | 16547 | numbernamn | 25 | -1 | -1 | 7 | 0 | -1 | -1 | f | x | i | f | f | f | t | 0 | | 16547 | telenummer | 25 | -1 | -1 | 8 | 0 | -1 | -1 | f | x | i | f | f | f | t | 0 | | 16547 |distributionname | 25 | -1 | -1 | 9 | 0 | -1 | -1 | f | x | i | f | f | f | t | 0 | | 16547 | zoneid | 23 | -1 | 4 | 10 | 0 | -1 | -1 | t | p | i | f | f | f | t | 0 | | (10 rows) >> Is there anything thing I could do to manually remove the view or fix >> the problem? > > Well, you could manually remove its pg_class and pg_type rows, and that > should be close enough to being "gone". But what I'm worried about is > what other damage there is. Yes, that is what I am afraid of also. We had a nasty power spike and that caused the machine to reboot. The raid controller is equipped with a BBU though so there shouldn't be any lost disk writes. But you never know. Any point of running a vacuum full on the database or is that a bad idea? /Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: > On 2012-09-28 16:35, Tom Lane wrote: >> That's pretty odd --- it implies something nasty has happened to the >> contents of pg_attribute. It would be interesting to have a look at >> select * from pg_attribute where attrelid = 'communitytaxiexceptions'::regclass; > Here is the result of the above query, I hope it's readable and not mangled: Well, you definitely lost a few rows there, and some of the ones that survived are partially incorrect. But it could be worse --- I was afraid that query wouldn't work at all. >>> Is there anything thing I could do to manually remove the view or fix >>> the problem? >> Well, you could manually remove its pg_class and pg_type rows, and that >> should be close enough to being "gone". But what I'm worried about is >> what other damage there is. > Yes, that is what I am afraid of also. We had a nasty power spike and > that caused the machine to reboot. The raid controller is equipped with > a BBU though so there shouldn't be any lost disk writes. But you never > know. Looks like you had some data corruption from the spike. If you're lucky, it's just this one block of pg_attribute and you can reconstruct things. If not, there may be more problems ... > Any point of running a vacuum full on the database or is that a bad idea? I wouldn't try that. It might be worth trying a REINDEX on pg_attribute. That will not fix the table damage but it will at least make sure the indexes are consistent with what's now in the table. After that, I'd suggest manually removing the broken view's pg_class row and then seeing if you can pg_dump with sane-looking results. If you can, a dump and restore of at least this database would be prudent. regards, tom lane
On 2012-09-28 19:01, Tom Lane wrote: > Looks like you had some data corruption from the spike. If you're > lucky, it's just this one block of pg_attribute and you can reconstruct > things. If not, there may be more problems ... Looks like there are more problems unfortunally. See below. >> Any point of running a vacuum full on the database or is that a bad idea? > > I wouldn't try that. It might be worth trying a REINDEX on > pg_attribute. That will not fix the table damage but it will at least > make sure the indexes are consistent with what's now in the table. > After that, I'd suggest manually removing the broken view's pg_class > row and then seeing if you can pg_dump with sane-looking results. > If you can, a dump and restore of at least this database would be > prudent. I tried the reindex and then the manual delete of the view from pg_class. That took care of the rows in pg_attribute but when I tried to run pg_dump I now get this: pg_dump: failed sanity check, parent table OID 16547 of pg_rewrite entry OID 16550 not found /Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: > I tried the reindex and then the manual delete of the view from > pg_class. That took care of the rows in pg_attribute but when I tried > to run pg_dump I now get this: > pg_dump: failed sanity check, parent table OID 16547 of pg_rewrite entry > OID 16550 not found Oh, okay, pg_dump is doing more cross-checking than I remembered. That's the ON SELECT rule for the broken view. You'll need to delete it manually too, and perhaps also the pg_type row if pg_dump chances to notice that. I doubt you'll need to clean up the pg_attribute rows. regards, tom lane
On 2012-09-28 19:26, Tom Lane wrote: > Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: >> I tried the reindex and then the manual delete of the view from >> pg_class. That took care of the rows in pg_attribute but when I tried >> to run pg_dump I now get this: > >> pg_dump: failed sanity check, parent table OID 16547 of pg_rewrite entry >> OID 16550 not found > > Oh, okay, pg_dump is doing more cross-checking than I remembered. > That's the ON SELECT rule for the broken view. You'll need to delete it > manually too, and perhaps also the pg_type row if pg_dump chances to > notice that. I doubt you'll need to clean up the pg_attribute rows. Ok, that did the trick. Now the backup completed. Your recomendation is to restore the backup to a new database and drop the old one, right? /Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: > Your recomendation is to restore the backup to a new database and drop > the old one, right? Right. If you've got more than one active database in this installation, it might be prudent to dump and restore the whole thing --- but in any case, you don't want to trust this database's pg_attribute anymore. regards, tom lane
On 2012-09-28 19:39, Tom Lane wrote: > Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes: >> Your recomendation is to restore the backup to a new database and drop >> the old one, right? > > Right. If you've got more than one active database in this > installation, it might be prudent to dump and restore the whole thing > --- but in any case, you don't want to trust this database's > pg_attribute anymore. Ok. I've done just that. I dumped the database, dropped it and then created it and restored the backup. All the data seems to be there also at a first glance. I also took the oppertunity to compile and upgrade to 9.0.10 when I had the chance. Thanks for all the help. /Mikael