Thread: pg_clog corrupt, can't start postgres

pg_clog corrupt, can't start postgres

From

"Anjan Dave"

Date:

21 February 2005, 15:58:22

Hi,
 
I need some help in bringing back this db please.
 
The partition ran out of space from an import process. I cleared up the space and attempted to start the postgres
service again, but it doesn't start and I get the following in the message log.
 
 
HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG:  checkpoint record is at 1B/27F23A6C
LOG:  redo record is at 1B/27751714; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 45279762; next OID: 43062083
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 1B/27751714
PANIC:  could not access status of transaction 45514755
DETAIL:  could not read from file "/var/lib/pgsql/data/pg_clog/002B" at offset 106496: Success
LOG:  startup process (PID 23991) was terminated by signal 6
LOG:  aborting startup due to startup process failure

Postgres is 7.4.1 on this machine.
 
I saw some previous posts on this subject and so far the solution seems to be to re-initialize and restore the databases from the
dumps.
 
I can live with some aborted transactions, if it's possible to recover somehow.
 
$ psql
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
 
# ls -l pg_xlog/
total 131232
-rw-------    1 postgres postgres 16777216 Feb 19 13:30 0000001B00000026
-rw-------    1 postgres postgres 16777216 Feb 19 13:34 0000001B00000027
-rw-------    1 postgres postgres 16777216 Feb 19 13:44 0000001B00000028
-rw-------    1 postgres postgres 16777216 Feb 19 13:15 0000001B00000029
-rw-------    1 postgres postgres 16777216 Feb 19 13:12 0000001B0000002A
-rw-------    1 postgres postgres 16777216 Feb 19 13:18 0000001B0000002B
-rw-------    1 postgres postgres 16777216 Feb 19 13:26 0000001B0000002C
-rw-------    1 postgres postgres 16777216 Feb 19 13:22 0000001B0000002D
# ls -l pg_clog/
total 628
-rw-------    1 postgres postgres   262144 Feb 19 04:31 0029
-rw-------    1 postgres postgres   262144 Feb 19 11:55 002A
-rw-------    1 postgres postgres   106496 Feb 20 22:34 002B

Thanks,
Anjan
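
[Editor's note] The numbers in the PANIC message above can be checked by hand. Assuming the usual clog layout with the default 8 kB block size (2 status bits per transaction, 32 pages per segment file, which matches the 262144-byte files in the listing), the failing transaction ID maps onto exactly the file and offset that were reported. This is only illustrative arithmetic; the constant and function names below are made up for this note and are not PostgreSQL identifiers.

```python
# Assumed layout: 8 kB pages, 2 bits (4 transactions) per byte,
# 32 pages per clog segment file. Names here are illustrative only.
BLOCK_SIZE = 8192
XACTS_PER_BYTE = 4
PAGES_PER_SEGMENT = 32
XACTS_PER_PAGE = BLOCK_SIZE * XACTS_PER_BYTE            # 32768
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT  # 1048576


def locate_xid(xid):
    """Return (segment file name, byte offset of the page) for a transaction ID."""
    segment, xid_in_segment = divmod(xid, XACTS_PER_SEGMENT)
    page_in_segment = xid_in_segment // XACTS_PER_PAGE
    return "%04X" % segment, page_in_segment * BLOCK_SIZE


name, offset = locate_xid(45514755)
print(name, offset)  # prints: 002B 106496
```

In the listing, 002B is itself exactly 106496 bytes long, so the page being requested would start at the end of the file. A read there returns no data rather than an error, which is probably why the message ends in "Success".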

Re: pg_clog corrupt, can't start postgres

From

Tom Lane

Date:

21 February 2005, 16:42:30

"Anjan Dave" <adave@vantage.com> writes:
> The partition ran out of space from an import process. I cleared up the space and attempted to start the postgres
> service again, but it doesn't start and i get following in the message log.
> PANIC:  could not access status of transaction 45514755
> DETAIL:  could not read from file "/var/lib/pgsql/data/pg_clog/002B" at offset 106496: Success
> LOG:  startup process (PID 23991) was terminated by signal 6
> LOG:  aborting startup due to startup process failure

> Postgres is 7.4.1 on this machine.

> I saw some previous posts on this subject and so far the solution seems to be initialize and restore databases from
> the dumps.

Before that, try updating to 7.4.7 (or at least 7.4.2) --- this looks
like the same bug fixed here:

2004-01-26 14:16  tgl

    * src/backend/access/transam/varsup.c (REL7_4_STABLE): Repair
    incorrect order of operations in GetNewTransactionId().  We must
    complete ExtendCLOG() before advancing nextXid, so that if that
    routine fails, the next incoming transaction will try it again.
    Per trouble report from Christopher Kings-Lynne.

You might also go back to the mail list archives from that time and see
what advice was given to Chris about getting out of the problem he found
himself in.  I *think* that something along the lines of forcibly
appending a page of zeroes to that clog file might be the best solution,
but this was more than a year ago and I don't recall for sure.

            regards, tom lane
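
[Editor's note] A minimal sketch of what "appending a page of zeroes" could look like, assuming an 8 kB clog page size and that the server is stopped while the file is touched. The function name and the ".bak" backup suffix are hypothetical choices for this note, not anything prescribed in the thread. It is not a substitute for the advice given in the archives or for upgrading.

```python
import os
import shutil

PAGE_SIZE = 8192  # assumed clog page size (default PostgreSQL block size)


def append_zero_page(path, page_size=PAGE_SIZE):
    """Append one page of zero bytes to `path`, keeping a copy as path + '.bak'.

    Refuses to act if the file is not a whole number of pages long, since
    that would suggest something other than a simple missing page.
    Returns the new file size in bytes.
    """
    size = os.path.getsize(path)
    if size % page_size != 0:
        raise ValueError("%s is %d bytes, not a multiple of %d"
                         % (path, size, page_size))

    # Keep the original around so the change can be undone.
    shutil.copy2(path, path + ".bak")

    with open(path, "ab") as f:
        f.write(b"\0" * page_size)
        f.flush()
        os.fsync(f.fileno())  # make sure the new page reaches disk

    return size + page_size
```

For the file in this thread the call would be something like append_zero_page("/var/lib/pgsql/data/pg_clog/002B"), run as the postgres user with the server shut down. Zeroed status bits do not mark any transaction as committed, so work from transactions covered by the new page should be expected to be lost.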

Re: pg_clog corrupt, can't start postgres

From

"Anjan Dave"

Date:

21 February 2005, 17:44:58

Tom,
 
You're the man! 
 
I zeroed out the troubled pg_clog file and the db started up fine! Here's the link to the discussion, and a detailed
explanation of the issue by Tom:
 

http://groups-beta.google.com/group/comp.databases.postgresql.hackers/browse_thread/thread/c97c853f640b9ac1/d6bc3c75eed6c2a4?q=could+not+access+status+of+transaction#d6bc3c75eed6c2a4
 
Tom, is the issue resolved after 7.4.1?
 
Thanks,
Anjan
 
 

Re: pg_clog corrupt, can't start postgres

From

Tom Lane

Date:

21 February 2005, 20:19:50

"Anjan Dave" <adave@vantage.com> writes:
> Tom, is the issue resolved after 7.4.1?

Yes, that's why I told you to update.

            regards, tom lane