Thread: How do I recover from >> pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
From: peter Willis
Hello,

Is there a way to recover from the following error? I have (had) an existing database and wish not to lose the data tables.

Thanks for any help,
Pete

[postgres@web2 /]$ pg_ctl start
postmaster successfully started
[postgres@web2 /]$ LOG: database system shutdown was interrupted at 2004-10-18 11:41:55 PDT
LOG: open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
LOG: invalid primary checkpoint record
LOG: open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
LOG: invalid secondary checkpoint record
PANIC: unable to locate a valid checkpoint record
LOG: startup process (pid 2803) was terminated by signal 6
LOG: aborting startup due to startup process failure
[postgres@web2 /]$
Re: How do I recover from >> pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
From: Tom Lane
peter Willis <peterw@borstad.com> writes:
> [postgres@web2 /]$ LOG: database system shutdown was interrupted at
> 2004-10-18 11:41:55 PDT
> LOG: open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
> file 0, segment 0) failed: No such file or directory
> LOG: invalid primary checkpoint record
> LOG: open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
> file 0, segment 0) failed: No such file or directory
> LOG: invalid secondary checkpoint record
> PANIC: unable to locate a valid checkpoint record
> LOG: startup process (pid 2803) was terminated by signal 6
> LOG: aborting startup due to startup process failure

pg_resetxlog would probably get you to a point where you could start the server, but you should not have any great illusions about the consistency of your database afterward.

How did you get into this state, anyway? And what PG version is it?

    regards, tom lane
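For anyone hitting this later: the sequence Tom suggests looks roughly like the following. The data directory path is taken from the log above; exact flags and behavior vary by PostgreSQL version, so treat this as a sketch, and dump everything as soon as the server comes up.

    # make sure no postmaster is running against this data directory
    pg_ctl -D /web2-disk1/grip/database stop -m fast

    # rewrite pg_control so startup stops looking for the lost WAL;
    # this can leave the database internally inconsistent
    pg_resetxlog /web2-disk1/grip/database

    # start up and salvage what you can right away
    pg_ctl -D /web2-disk1/grip/database start
    pg_dumpall > /tmp/salvage.sql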
Tom Lane wrote:
> pg_resetxlog would probably get you to a point where you could start
> the server, but you should not have any great illusions about the
> consistency of your database afterward.
>
> How did you get into this state, anyway? And what PG version is it?

The server was running with postgres on a terabyte FireWire 800 drive. A tech decided to 'hot-plug' another terabyte drive into the system without downing the server, unmounting the first drive, and then remounting both drives. Since the OHCI drivers tend to enumerate and mount without using the hardware ID of the drive, the poor kernel got confused and decided that the new drive was first in line... clang!

I had a database backup from the previous day. I just used that. I set up a cron job to pg_dump and gzip every hour and delete any backup gz files older than 1 week.

I love that 'date' command .. :)

    date +%F-%H%M%S

nice............ :)

Peter
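A sketch of the cron job Peter describes, in case it is useful to someone; the database name ('grip') and the backup directory are guesses, not taken from the thread.

    #!/bin/sh
    # hourly: dump the database and gzip it with a sortable timestamp
    STAMP=`date +%F-%H%M%S`
    pg_dump grip | gzip > /backups/grip-$STAMP.sql.gz

    # throw away compressed dumps older than a week
    find /backups -name 'grip-*.sql.gz' -mtime +7 -exec rm -f {} \;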
On Tue, Oct 19, 2004 at 03:49:04PM -0700, pw wrote:
> I set up a cron job to pg_dump and gzip every hour and
> delete any backup gz files older than 1 week.

Huh ... be sure to keep some older backups anyway! There was just someone on a list (this one?) whose last two weeks of backups contained no data (a guy with OpenACS or something).

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"The eagle never lost so much time, as when he submitted to learn of the crow." (William Blake)
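Following Alvaro's advice could be as simple as a second cron entry that pulls one dump per week out of the 7-day rotation; the directory layout here is an assumption.

    #!/bin/sh
    # weekly: copy the newest hourly dump somewhere the cleanup job ignores
    LATEST=`ls -t /backups/grip-*.sql.gz | head -1`
    cp "$LATEST" /backups/weekly/grip-`date +%F`.sql.gz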
> On Tue, Oct 19, 2004 at 03:49:04PM -0700, pw wrote:
> > I set up a cron job to pg_dump and gzip every hour and
> > delete any backup gz files older than 1 week.
>
> Huh ... be sure to keep some older backups anyway! There was just
> someone on a list (this one?) whose last two weeks of backups contained
> no data (a guy with OpenACS or something).

Also, if you don't routinely test your backups, how can you be sure they'll work when you NEED them to?
--
Mike Nolan
Mike Nolan wrote:
> Also, if you don't routinely test your backups, how can you be sure
> they'll work when you NEED them to?

Hello,

If vacuumdb and pg_dump don't work, then I have bigger problems than just a hardware burp. It's just like any other software (MS included): you have to trust it until it proves otherwise. I've seen Oracle go south because of hardware, etc. too. At least I'm not spending $30,000 for the adventure; I don't get any more satisfaction for the $30 grand than rebuilding from a backup anyway.

If I really felt paranoid about it, I could set up a test server and make a cron job that scps the current backup over and builds a database from it, then queries every table for the last updated record and compares it to the local server. A day's work, tops. I'm pretty sure the current backup method is OK, though. It can even move the database backup off site in case the place burns down.

In the case of the fellow with no data, it's difficult to say whether that's real or not. I moved a DB over to another machine and had to open the tar file that came from pg_dump, edit the 'restore.sql' in several places, and run the script manually so I could watch the error logging. All the data was there; it just wasn't going through the COPY command properly (path issues). Also, the procedural language I was using for a trigger needed to be installed by the 'postgres' user *first* before part of the script would work. It's pretty easy to forget all the schema stuff in a database over time.

Did that guy look in the '.dat' files to see if there was data?

Peter
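A rough sketch of the restore test Peter outlines above; the hostname, database name, table name, and dump filename are made-up placeholders.

    #!/bin/sh
    # copy the newest production dump to this test box
    scp web2:/backups/grip-latest.sql.gz /tmp/grip-latest.sql.gz

    # rebuild a scratch database from it
    dropdb grip_test 2>/dev/null
    createdb grip_test
    gunzip -c /tmp/grip-latest.sql.gz | psql grip_test

    # spot-check: does a known table have roughly the expected rows?
    psql -t grip_test -c "SELECT count(*) FROM some_table;"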