Thread: Please help - corrupt database

Please help - corrupt database

From
"Ertel, Steve"
Date:

Last night, while running a scheduled backup, our database became corrupt.  When I open the database backup file it includes the definition of all of the tables but only data for the first table. 
 
We are currently running version 8.1.11.
 
When I try to execute any code against the database, I get the following message:
 
db_name=>select count(0) from usage;
WARNING:  could not write block 1 of 1663/28004/49027
DETAIL:  Multiple failures --- write error may be permanent.
ERROR:  could not open relation 1663/28004/49027: No such file or directory
CONTEXT:  writing block 1 of relation 1663/28004/49027
 
From the log files you can see that the backup does a REINDEX DATABASE db_name
While this reindex is running, several other queries are running against the database wrapped in a BEGIN/COMMIT transaction.
After the transaction query was issued, the log file provides the message NOTICE:  table "pg_class" was reindexed
A few more queries run and then the log file provides the error message: ERROR:  could not find pg_class tuple for index 2663
 
Is there anything that we can do to recover from this issue?
 
A co-worker of mine sent in an email just a few minutes ago about this problem.  I am just providing more detail.
 
Thanks,
Steve
 

Re: Please help - corrupt database

From
Tom Lane
Date:
"Ertel, Steve" <Steve.Ertel@infimatic.com> writes:
> From the log files you can see that the backup does a REINDEX DATABASE db_name
> While this reindex is running, several other queries are running against the database wrapped in a BEGIN/COMMIT
transaction.
> After the transaction query was issued, the log file provides the message NOTICE:  table "pg_class" was reindexed
> A few more queries run and then the log file provides the error message: ERROR:  could not find pg_class tuple for
index2663 

> Is there anything that we can do to recover from this issue?

REINDEX DATABASE during active database operations is just asking for
trouble in 8.1 :-(.  There are race conditions that can be triggered
if another backend opens a reindexed index at the wrong time.

My advice is to stop doing the REINDEX DATABASE; I don't think that's
necessary or appropriate as a routine maintenance operation.  If you
insist on doing it, update to a recent 8.2.x or 8.3.x release, where
it should be reasonably safe.

            regards, tom lane

Re: Please help - corrupt database

From
"Ertel, Steve"
Date:
Thanks Tom.  We recently added REINDEX DATABASE to satisfy a customer that swears that databases need to be routinely
reindexed. In the customer's mind, this is especially true because we are not running sql server nor windows.  I'll
takethe command out of our backup script and pass on the info that you provided. 


Best regards,

Steve Ertel
859-534-4568
steve.ertel@infimatic.com


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, December 16, 2008 10:24 AM
To: Ertel, Steve
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Please help - corrupt database

"Ertel, Steve" <Steve.Ertel@infimatic.com> writes:
> From the log files you can see that the backup does a REINDEX DATABASE
> db_name While this reindex is running, several other queries are running against the database wrapped in a
BEGIN/COMMITtransaction. 
> After the transaction query was issued, the log file provides the
> message NOTICE:  table "pg_class" was reindexed A few more queries run
> and then the log file provides the error message: ERROR:  could not
> find pg_class tuple for index 2663

> Is there anything that we can do to recover from this issue?

REINDEX DATABASE during active database operations is just asking for trouble in 8.1 :-(.  There are race conditions
thatcan be triggered if another backend opens a reindexed index at the wrong time. 

My advice is to stop doing the REINDEX DATABASE; I don't think that's necessary or appropriate as a routine maintenance
operation. If you insist on doing it, update to a recent 8.2.x or 8.3.x release, where it should be reasonably safe. 

            regards, tom lane