Thread: [Fwd: zero_damaged_pages having no effect]

From: Dan Hrabarchuk
Date:

Hi

Can someone point me in the right direction on where to start?

I'm not getting the documented behaviour. If this is not the correct
list, I'll ask elsewhere.

Below is the entry quoted from the online manual for 7.3:

ZERO_DAMAGED_PAGES (boolean)

        Detection of a damaged page header normally causes PostgreSQL to
        report an error, aborting the current transaction. Setting
        zero_damaged_pages to true causes the system to instead report a
        warning, zero out the damaged page, and continue processing.
        This behavior will destroy data, namely all the rows on the
        damaged page. But it allows you to get past the error and
        retrieve rows from any undamaged pages that may be present in
        the table. So it is useful for recovering data if corruption has
        occurred due to hardware or software error. You should generally
        not set this true until you have given up hope of recovering
        data from the damaged page(s) of a table. The default setting is
        off, and it can only be changed by a superuser.

-----Forwarded Message-----
> From: dan@kwasar.biz
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] zero_damaged_pages having no effect
> Date: Mon, 11 Oct 2004 12:44:50 -0700
>
> Hello again.
>
> I have a database on 7.3.4, FreeBSD, with corruption. Dump & restore is
> definitely not an option. Hardware is fine: registered ECC memory on a SCSI
> hardware RAID5. I got the corruption when the file system ran out of space.
>
> I want to simply delete the damaged data; then I can reinsert it from its
> source.
>
> Again, all I want is to delete the damaged data. A dump & restore is not an
> option.
>
> So I set zero_damaged_pages to true in my postgresql.conf.
>
> sam=> select * from pg_settings where name ='zero_damaged_pages';
>         name        | setting
> --------------------+---------
>  zero_damaged_pages | on
> (1 row)
>
> When I run a query that hits the damaged area, I was expecting a log message
> saying that tuples were being zeroed, and a truncated result in my client.
>
> But I got the usual:
> sam=> SELECT * FROM channeldata where cd_id=6268 and tstamp<'2004-09-20' and
> tstamp >'2004-09-15';
> PANIC:  open of /psql/pg_clog/0C98 failed: No such file or directory
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> Is there some extra trick here? I just want to delete the bad tuples. I can
> restore the data from another source.
>
> I really don't care if I need to take the database server down at 2am and use a
> hex editor to clean this. I need my boss off my back. Please help me delete
> these bad tuples.
>
> Thanks again
>
> Dan Hrabarchuk


Re: [Fwd: zero_damaged_pages having no effect]

From: Tom Lane
Date:

Dan Hrabarchuk <dan@kwasar.biz> writes:
> I'm not getting the documented behaviour.

The documented behavior is that zero_damaged_pages changes the response
to (detectably) damaged page headers.  It is not a magic cure-all for
every form of data corruption.

            regards, tom lane
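
The PANIC quoted in this thread is raised while opening a transaction-status (pg_clog) segment file, not while checking a page header. As a rough sketch of what the missing file names imply, the following assumes PostgreSQL's default clog layout, which is not stated anywhere in this thread: 2 status bits per transaction (4 transactions per byte), stored in 256 kB segment files named by the segment number in hex. The function names are made up for illustration.

```python
# Assumed clog layout (PostgreSQL defaults, not stated in this thread):
# 2 status bits per transaction, so 4 transactions per byte, stored in
# 256 kB segment files named by the segment number as 4 hex digits.
XACTS_PER_BYTE = 4
SEGMENT_BYTES = 256 * 1024
XACTS_PER_SEGMENT = XACTS_PER_BYTE * SEGMENT_BYTES  # 0x100000 transactions


def clog_segment_xid_range(segment_name):
    """Return (first_xid, last_xid) covered by a pg_clog segment file name."""
    segment_number = int(segment_name, 16)
    first_xid = segment_number * XACTS_PER_SEGMENT
    return first_xid, first_xid + XACTS_PER_SEGMENT - 1


def clog_segment_for_xid(xid):
    """Return the pg_clog segment file name that would hold this xid's status."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


if __name__ == "__main__":
    # The two segment names that appear in the PANIC messages in this thread.
    for name in ("0C95", "0C98"):
        first, last = clog_segment_xid_range(name)
        print(name, first, last)
```

Under those assumptions, 0C98 would cover transaction IDs starting at 3380609024. If the cluster has not actually used that many transaction IDs, the value more likely comes from a damaged tuple than from a damaged page header, which would be consistent with zero_damaged_pages not applying here.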

Re: [Fwd: zero_damaged_pages having no effect]

From: Dan Hrabarchuk
Date:

Thank you Tom.

So in my case, where should I start? I want to be able to dump this
database to a new machine. Given my corruption, how should I go about
this? My current plan is to use Slony, but I'm worried that the
corruption will interfere with the transfer.

How can I get the db server to simply ignore errors? The data that is
lost can be recovered from the original source.

postgres[61834]: [184] PANIC:  open of /psql/pg_clog/0C95 failed: No such file or directory

What would happen if I touched this file as the postgres user? That
would leave a zero-length file with the proper ownership and
permissions.

Thanks once again.

Dan
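
On the question of touching the missing pg_clog file, here is a rough sketch of which bytes of a segment file a status lookup would need. It assumes the default layout (8 kB clog pages, 4 transactions per byte, 0x100000 transactions per segment), none of which is stated in this thread. The function names and the example transaction ID are hypothetical. The code only does arithmetic and does not touch any files.

```python
# Assumed defaults, not taken from this thread.
PAGE_BYTES = 8192                              # one clog page
XACTS_PER_BYTE = 4                             # 2 status bits per transaction
XACTS_PER_PAGE = PAGE_BYTES * XACTS_PER_BYTE   # 32768 transactions per page
XACTS_PER_SEGMENT = 0x100000                   # transactions per segment file


def page_byte_range(xid):
    """Return (start, end) byte offsets, within its segment file, of the
    clog page holding the status bits for xid. end is exclusive."""
    page_in_segment = (xid % XACTS_PER_SEGMENT) // XACTS_PER_PAGE
    start = page_in_segment * PAGE_BYTES
    return start, start + PAGE_BYTES


def file_covers_xid(file_size, xid):
    """True if a segment file of file_size bytes contains the whole page
    needed to look up xid."""
    _, end = page_byte_range(xid)
    return file_size >= end


if __name__ == "__main__":
    # Hypothetical xid: the first one that segment 0C95 would cover.
    example_xid = 0x0C95 * XACTS_PER_SEGMENT
    print(page_byte_range(example_xid))
    print(file_covers_xid(0, example_xid))           # zero-length file
    print(file_covers_xid(256 * 1024, example_xid))  # full-size segment
```

A zero-length file holds none of those bytes, so reading any page from it would come up short. How the 7.3 server reacts to that short read is not something this sketch can show.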