Thread: How do I recover from>> pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory

Hello,

Is there a way to recover from the following error?
I have (had) an existing database and wish not
to lose the data tables.

Thanks for any help,

Pete


[postgres@web2 /]$ pg_ctl start
postmaster successfully started
[postgres@web2 /]$ LOG:  database system shutdown was interrupted at
2004-10-18 11:41:55 PDT
LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
file 0, segment 0) failed: No such file or directory
LOG:  invalid primary checkpoint record
LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
file 0, segment 0) failed: No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  unable to locate a valid checkpoint record
LOG:  startup process (pid 2803) was terminated by signal 6
LOG:  aborting startup due to startup process failure

[postgres@web2 /]$



peter Willis <peterw@borstad.com> writes:
> [postgres@web2 /]$ LOG:  database system shutdown was interrupted at
> 2004-10-18 11:41:55 PDT
> LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
> file 0, segment 0) failed: No such file or directory
> LOG:  invalid primary checkpoint record
> LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
> file 0, segment 0) failed: No such file or directory
> LOG:  invalid secondary checkpoint record
> PANIC:  unable to locate a valid checkpoint record
> LOG:  startup process (pid 2803) was terminated by signal 6
> LOG:  aborting startup due to startup process failure

pg_resetxlog would probably get you to a point where you could start
the server, but you should not have any great illusions about the
consistency of your database afterward.
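
(For reference, a rough sketch of the invocation -- the data directory
path is just the one implied by the log messages above, and option
spellings vary a bit between versions, so check the pg_resetxlog man
page first.  Make sure the postmaster is stopped:

    pg_ctl -D /web2-disk1/grip/database stop

    # print the values pg_resetxlog would use, without changing anything
    pg_resetxlog -n /web2-disk1/grip/database

    # rewrite pg_control and create a fresh WAL segment; -f forces it
    # to proceed even though no valid checkpoint record can be found
    pg_resetxlog -f /web2-disk1/grip/database

    pg_ctl -D /web2-disk1/grip/database start

Then pg_dump everything and inspect it before trusting the cluster again.)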

How did you get into this state, anyway?  And what PG version is it?

            regards, tom lane

Re: How do I recover from>> pg_xlog/0000000000000000 (log

From: pw

Tom Lane wrote:

> peter Willis <peterw@borstad.com> writes:
>> [postgres@web2 /]$ LOG:  database system shutdown was interrupted at
>> 2004-10-18 11:41:55 PDT
>> LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
>> file 0, segment 0) failed: No such file or directory
>> LOG:  invalid primary checkpoint record
>> LOG:  open of /web2-disk1/grip/database/pg_xlog/0000000000000000 (log
>> file 0, segment 0) failed: No such file or directory
>> LOG:  invalid secondary checkpoint record
>> PANIC:  unable to locate a valid checkpoint record
>> LOG:  startup process (pid 2803) was terminated by signal 6
>> LOG:  aborting startup due to startup process failure
>
> pg_resetxlog would probably get you to a point where you could start
> the server, but you should not have any great illusions about the
> consistency of your database afterward.
>
> How did you get into this state, anyway?  And what PG version is it?
>
>             regards, tom lane

The server was running with PostgreSQL on a terabyte FireWire 800 drive.
A tech decided to 'hot-plug' another terabyte drive into the system
without downing the server, unmounting the first drive, and then
remounting both drives.
Since the OHCI drivers tend to enumerate and mount drives without using
the hardware ID, the poor kernel got confused and decided that the new
drive was first in line... clang!

I had a database backup from the previous day. I just used that.

I set up a cron job to pg_dump and gzip every hour and
dump any backup gz files older than 1 week.
I love that 'date' command .. :)

date +%F-%H%M%S

nice............ :)
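
Roughly like this (the database name and paths here are placeholders,
not the exact script):

    #!/bin/sh
    # hourly pg_dump, gzipped, with a timestamped file name
    BACKUPDIR=/web2-disk1/backups          # placeholder
    STAMP=`date +%F-%H%M%S`

    pg_dump grip | gzip > $BACKUPDIR/grip-$STAMP.sql.gz

    # throw away compressed dumps older than a week
    find $BACKUPDIR -name 'grip-*.sql.gz' -mtime +7 -exec rm {} \;

driven by a crontab entry like:

    0 * * * * /usr/local/bin/pg_hourly_backup.sh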

Peter




Re: How do I recover from>> pg_xlog/0000000000000000 (log

From: Alvaro Herrera

On Tue, Oct 19, 2004 at 03:49:04PM -0700, pw wrote:

> I set up a cron job to pg_dump and gzip every hour and
> dump any backup gz files older than 1 week.

Huh ... be sure to keep some older backup anyway!  There was just
someone on a list (this one?) whose last two weeks of backups contained
no data (a guy with OpenACS or something).

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)


Re: How do I recover from>> pg_xlog/0000000000000000 (log

From: Mike Nolan

> On Tue, Oct 19, 2004 at 03:49:04PM -0700, pw wrote:
>
> > I set up a cron job to pg_dump and gzip every hour and
> > dump any backup gz files older than 1 week.
>
> Huh ... be sure to keep some older backup anyway!  There was just
> someone on a list (this one?) whose last two weeks of backups contained
> no data (a guy with OpenACS or something).

Also, if you don't routinely test your backups every now and then,
how can you be sure they'll work when you NEED them to?
--
Mike Nolan


Re: How do I recover from>> pg_xlog/0000000000000000 (log

From: pw

Mike Nolan wrote:
>> On Tue, Oct 19, 2004 at 03:49:04PM -0700, pw wrote:
>>> I set up a cron job to pg_dump and gzip every hour and
>>> dump any backup gz files older than 1 week.
>>
>> Huh ... be sure to keep some older backup anyway!  There was just
>> someone on a list (this one?) whose last two weeks of backups contained
>> no data (a guy with OpenACS or something).
>
> Also, if you don't routinely test your backups every now and then,
> how can you be sure they'll work when you NEED them to?
> --
> Mike Nolan

Hello,

If vacuumdb and pg_dump don't work,
then I have bigger problems than just a hardware burp.

It's just like any other (MS included) software: you have to trust
it until it proves otherwise.
I've seen Oracle go south because of hardware, etc., too.
At least I'm not spending $30,000 for the adventure.
I don't get any more satisfaction out of the $30 grand
than I do from rebuilding from a backup anyway.


If I really felt paranoid about it I could set up a
test server and make a cron job that scps the current
backup over and builds a database from it, then queries
every table for the last updated record and compares it
to the live server. A day's work, tops.
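
Something along those lines (the test host, database name, table, and
column here are all placeholders):

    #!/bin/sh
    # copy the newest hourly dump to a test box and rebuild it there
    LATEST=`ls -t /web2-disk1/backups/grip-*.sql.gz | head -1`
    scp $LATEST testbox:/tmp/grip-latest.sql.gz

    ssh testbox 'dropdb grip_check; createdb grip_check;
                 gunzip -c /tmp/grip-latest.sql.gz | psql grip_check'

    # then compare the newest rows against the live server, e.g.:
    psql -h testbox -c "SELECT max(updated) FROM some_table" grip_check
    psql -c "SELECT max(updated) FROM some_table" grip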

I'm pretty sure the current backup method is OK though.
It can even move the database backup off site in case the
place burns down.

    In the case of the fellow with no data, it's difficult
to say whether that's real or not.
I moved a DB over to another machine and had to open the tar
file that came from pg_dump, edit the 'restore.sql' in several
places, and run the script manually so I could watch the error logging.
    All the data was there, it just wasn't going through the
COPY command properly (path issues). Also, the procedural language
that I was using for a trigger needed to be installed by the 'postgres'
user *first* before I could make that part of the script work.
    It's pretty easy to forget all the schema stuff in a
database over time.
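
    For what it's worth, the manual route boiled down to something
like the following (paths are placeholders, and plpgsql just stands
in for whatever language the trigger actually used):

    # unpack the tar-format dump so restore.sql and the .dat files
    # are visible
    mkdir /tmp/restore && tar xf grip.dump.tar -C /tmp/restore
    cd /tmp/restore

    # the procedural language has to be installed as the postgres
    # superuser before the trigger definitions will load
    createlang -U postgres plpgsql grip

    # edit restore.sql (fix the COPY ... FROM paths), then run it
    # and watch the error output
    psql -U postgres -f restore.sql grip 2>&1 | tee restore.log
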
    Did that guy look in the '.dat' files to see if there
was data?


Peter