Thread: Any way to bring up a PG instance with corrupted data in it?

Any way to bring up a PG instance with corrupted data in it?

From: Keaton Adams
Date: 08 June 2009, 15:58:33

This is a QA system and unfortunately there is no recent backup.... So as a last resort I am looking for any way to bring up Postgres when it has corrupt data in it:

FATAL: could not remove old lock file "postmaster.pid": Read-only file system
HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:43:16 mxlqa401 postgres[21401]: [1-1] FATAL: could not remove old lock file "postmaster.pid": Read-only file system
Jun 8 06:43:16 mxlqa401 postgres[21401]: [1-2] HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
FATAL: could not remove old lock file "postmaster.pid": Read-only file system
HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:43:29 mxlqa401 postgres[21476]: [1-1] FATAL: could not remove old lock file "postmaster.pid": Read-only file system
Jun 8 06:43:29 mxlqa401 postgres[21476]: [1-2] HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:44:23 mxlqa401 postgres[21520]: [1-1] LOG: database system was interrupted at 2009-06-05 21:52:54 MDT
Jun 8 06:44:24 mxlqa401 postgres[21520]: [2-1] LOG: checkpoint record is at 134/682530F0
Jun 8 06:44:24 mxlqa401 postgres[21520]: [3-1] LOG: redo record is at 134/682530F0; undo record is at 0/0; shutdown FALSE
Jun 8 06:44:24 mxlqa401 postgres[21520]: [4-1] LOG: next transaction ID: 3005778382; next OID: 103111004
Jun 8 06:44:24 mxlqa401 postgres[21520]: [5-1] LOG: next MultiXactId: 93647; next MultiXactOffset: 190825
Jun 8 06:44:24 mxlqa401 postgres[21520]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
Jun 8 06:44:24 mxlqa401 postgres[21520]: [7-1] LOG: redo starts at 134/68253134
Jun 8 06:44:24 mxlqa401 postgres[21520]: [8-1] PANIC: could not access status of transaction 3005778383
Jun 8 06:44:24 mxlqa401 postgres[21520]: [8-2] DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Jun 8 06:44:29 mxlqa401 postgres[21518]: [1-1] LOG: startup process (PID 21520) was terminated by signal 6
Jun 8 06:44:29 mxlqa401 postgres[21518]: [2-1] LOG: aborting startup due to startup process failure
Jun 8 06:44:36 mxlqa401 postgres[21574]: [1-1] LOG: database system was interrupted while in recovery at 2009-06-08 06:44:24 MDT
Jun 8 06:44:36 mxlqa401 postgres[21574]: [1-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
Jun 8 06:44:36 mxlqa401 postgres[21574]: [2-1] LOG: checkpoint record is at 134/682530F0
Jun 8 06:44:36 mxlqa401 postgres[21574]: [3-1] LOG: redo record is at 134/682530F0; undo record is at 0/0; shutdown FALSE
Jun 8 06:44:36 mxlqa401 postgres[21574]: [4-1] LOG: next transaction ID: 3005778382; next OID: 103111004
Jun 8 06:44:36 mxlqa401 postgres[21574]: [5-1] LOG: next MultiXactId: 93647; next MultiXactOffset: 190825
Jun 8 06:44:36 mxlqa401 postgres[21574]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress

I tried to bring up a postgres backend process to get into the database in single-user mode and that won’t work either:

bash-3.2$ postgres -D /mxl/var/pgsql/data
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted

bash-3.2$ postgres -D /mxl/var/pgsql/data -d 5 postgres
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted

Any suggestions other than the obvious (restore from backup) would be appreciated.

Thanks,

Keaton
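For readers following along, the DETAIL line in the PANIC above can be checked by hand. The sketch below maps a transaction ID to the pg_clog segment file and page offset where its commit status is stored. It assumes the stock layout for this generation of PostgreSQL (8192-byte blocks, 2 status bits per transaction, 32 pages per segment file); none of those constants are stated in the thread, and the function name is made up for illustration. The trailing "Success" in the message usually indicates that the read came back short rather than failing with an errno, which suggests the segment file is shorter than the server expects.

```python
# Assumed defaults, not taken from the thread:
BLOCK_SIZE = 8192        # default BLCKSZ
XACTS_PER_BYTE = 4       # clog keeps 2 status bits per transaction
PAGES_PER_SEGMENT = 32   # pages per pg_clog segment file

XACTS_PER_PAGE = BLOCK_SIZE * XACTS_PER_BYTE             # 32768
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT   # 1048576


def clog_location(xid):
    """Return (segment file name, byte offset of the page that holds xid)."""
    segment, index_in_segment = divmod(xid, XACTS_PER_SEGMENT)
    page_in_segment = index_in_segment // XACTS_PER_PAGE
    # Segment files are named with four uppercase hex digits.
    return "%04X" % segment, page_in_segment * BLOCK_SIZE


if __name__ == "__main__":
    # Transaction ID from the PANIC message in the log.
    print(clog_location(3005778382))  # ('0B32', 139264)
```

Under those assumptions the result matches the log: file "pg_clog/0B32", offset 139264.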

Re: Any way to bring up a PG instance with corrupted data in it?

From: Tom Lane
Date: 08 June 2009, 17:46:19

Keaton Adams <kadams@mxlogic.com> writes:
> This is a QA system and unfortunately there is no recent backup.... So as a last resort I am looking for any way to bring up Postgres when it has corrupt data in it:

pg_resetxlog?

            regards, tom lane

Re: Any way to bring up a PG instance with corrupted data in it?

From: Boszormenyi Zoltan
Date: 08 June 2009, 18:00:51

Hi,

Keaton Adams wrote:
> This is a QA system and unfortunately there is no recent backup.... So
> as a last resort I am looking for any way to bring up Postgres when it
> has corrupt data in it:
>
> FATAL: could not remove old lock file "postmaster.pid": Read-only file
> system
> HINT: The file seems accidentally left over, but it could not be
> removed. Please remove the file by hand and try again.

The message above should give you a clue.
Repair the file system first and remount read-write.
Then try again to bring up the postmaster.
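Before retrying the postmaster, it can help to confirm whether the data directory's file system is still mounted read-only. A minimal sketch, assuming a Unix-like host; the function name is hypothetical, and the check only reports the mount state, it does not repair or remount anything:

```python
import os


def is_read_only(path):
    """Return True if the file system containing `path` is mounted read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)
```

Calling is_read_only("/mxl/var/pgsql/data") on the affected host would be expected to return True until the file system has been repaired and remounted read-write.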



----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

Re: Any way to bring up a PG instance with corrupted data in it?

From: Keaton Adams
Date: 08 June 2009, 19:31:05

I had to calculate the next transaction ID and force the change with -f, but once I did, the DB came back up. Thanks for the info; now I know how this works.
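For anyone landing here later, the calculation can be sketched as follows. It uses the rule given in the pg_resetxlog documentation: take the numerically largest file name in pg_clog (the names are hexadecimal), add one, and multiply by 1048576. The exact command used on this system is not shown in the thread, so the helper names and the assumption that 0B32 is the largest segment are illustrative only.

```python
def safe_next_xid(clog_file_names):
    """Safe value for pg_resetxlog -x, given the file names found in pg_clog."""
    largest = max(int(name, 16) for name in clog_file_names)
    return (largest + 1) * 1048576


def resetxlog_command(data_dir, next_xid):
    """Build (but do not run) a forced pg_resetxlog invocation."""
    return ["pg_resetxlog", "-f", "-x", str(next_xid), data_dir]


if __name__ == "__main__":
    # Assumption: 0B32 (the file named in the PANIC) is the largest segment.
    xid = safe_next_xid(["0B32"])
    print(xid)  # 3006267392
    print(" ".join(resetxlog_command("/mxl/var/pgsql/data", xid)))
```

The computed value is above the "next transaction ID: 3005778382" reported in the log, which is the point of rounding up to the next segment boundary. Running pg_resetxlog with -n first prints the values it would use without changing anything.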

The plan now is to dump the databases and reload them to ensure overall database integrity.

Thanks for the reply,

Keaton

On 6/8/09 11:46 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Keaton Adams <kadams@mxlogic.com> writes:
> This is a QA system and unfortunately there is no recent backup.... So as a last resort I am looking for any way to bring up Postgres when it has corrupt data in it:

pg_resetxlog?

regards, tom lane