Thread: Recovery tools

Recovery tools

From
mlw
Date:
Are there any hard recovery tools for PostgreSQL?

Just suppose that all the log files are gone, and the only thing left is 
some of the files in the /data directory. Is there any way to scan this 
data and dump it to a file which could subsequently be used with a "COPY 
FROM STDIN" on a new database?



Re: Recovery tools

From
Tom Lane
Date:
mlw <pgsql@mohawksoft.com> writes:
> Just suppose that all the log files are gone, and the only thing left is 
> some of the files in the /data directory. Is there any way to scan this 
> data and dump it to a file which could subsequently be used with a "COPY 
> FROM STDIN" on a new database?

There aren't separate tools, and I'm not sure there could or should be.
What I'd do in that situation is:
    * pg_resetxlog to get a minimally valid xlog
    * if clog is missing, gin up files containing 0x55 everywhere
      (to make it look like every transaction has committed --- or
      put 00 everywhere if you'd rather assume that recent
      transactions all aborted)
    * start postmaster, look around, fix problems until I can pg_dump.
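A rough sketch of the clog step above, for illustration only: it assumes the
default 8 kB block size and 32-page clog segment size, and the data-directory
path is just a placeholder. In pg_clog each transaction status takes two bits,
so a byte of 0x55 marks all four transactions in it as committed and 0x00
marks them all as aborted:

    # Sketch: fabricate a replacement pg_clog segment in which every
    # transaction appears committed (0x55) or aborted (0x00).
    # SEGMENT_SIZE assumes the default 8 kB BLCKSZ and 32 pages per segment.
    SEGMENT_SIZE = 32 * 8192

    def write_clog_segment(path, committed=True):
        """Write one clog segment full of 'committed' or 'aborted' statuses."""
        fill = 0x55 if committed else 0x00
        with open(path, "wb") as f:
            f.write(bytes([fill]) * SEGMENT_SIZE)

    # e.g. recreate the first segment under a placeholder data directory
    write_clog_segment("/path/to/data/pg_clog/0000")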
 

AFAICS, you can make tools that work at the page/item level (like
pg_filedump, see http://sources.redhat.com/rhdb/), but there is hardly
any scope for doing anything intermediate between that and a full
postmaster.  There's no hope of decoding the contents of a tuple without
access to the table's tuple descriptor, which means you need most of the
system catalog mechanisms; plus you'd need the I/O routines for the
datatypes involved.  Might as well just use the postmaster as your data
inspection tool.
        regards, tom lane



Re: Recovery tools

From
Hannu Krosing
Date:
Tom Lane wrote on Mon, 14.04.2003 at 17:26:
> mlw <pgsql@mohawksoft.com> writes:
> > Just suppose that all the log files are gone, and the only thing left is 
> > some of the files in the /data directory. Is there any way to scan this 
> > data and dump it to a file which could subsequently be used with a "COPY 
> > FROM STDIN" on a new database?
> 
> There aren't separate tools, and I'm not sure there could or should be.
> What I'd do in that situation is:
>     * pg_resetxlog to get a minimally valid xlog
>     * if clog is missing, gin up files containing 0x55 everywhere
>       (to make it look like every transaction has committed --- or
>       put 00 everywhere if you'd rather assume that recent
>       transactions all aborted)
>     * start postmaster, look around, fix problems until I can pg_dump.
> 
> AFAICS, you can make tools that work at the page/item level (like
> pg_filedump, see http://sources.redhat.com/rhdb/), but there is hardly
> any scope for doing anything intermediate between that and a full
> postmaster.  There's no hope of decoding the contents of a tuple without
> access to the table's tuple descriptor, which means you need most of the
> system catalog mechanisms; plus you'd need the I/O routines for the
> datatypes involved.  Might as well just use the postmaster as your data
> inspection tool.

I have a small python script that does a dump at the page/tuple level and
lets people write their own tuple decoding routines (a rough sketch of the
idea follows below).

This could probably be used here too, as usually the original scripts
that created the database can be found and the tuples' internal structure
is not too complicated.

I have been sending this script out about once every two months for the last
2-3 years to people who have accidentally deleted some data and need to
get it out of the table files.
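A minimal skeleton of that page/item-level walk. The field offsets here
assume the page layout of recent PostgreSQL versions (24-byte page header,
4-byte line pointers) and the default 8 kB block size, and the heap file path
is only a placeholder; it is an illustration of the idea, not the full script:

    # Skeleton: walk the line pointers of one heap page and report where each
    # tuple starts and how long it is.  Decoding the tuples themselves is left
    # to a user-supplied routine, since it depends on the table's layout.
    import struct

    BLCKSZ = 8192  # default block size -- an assumption

    def page_items(heap_file, page_no):
        """Yield (offset, length, flags) for each line pointer on one page."""
        with open(heap_file, "rb") as f:
            f.seek(page_no * BLCKSZ)
            page = f.read(BLCKSZ)
        # pd_lower sits at bytes 12..13 of the page header and marks the end
        # of the line pointer array
        pd_lower, = struct.unpack_from("<H", page, 12)
        # line pointers start after the 24-byte header and run up to pd_lower
        for pos in range(24, pd_lower, 4):
            lp, = struct.unpack_from("<I", page, pos)
            lp_off = lp & 0x7FFF           # bits 0-14: tuple offset in the page
            lp_flags = (lp >> 15) & 0x3    # bits 15-16: unused/normal/redirect/dead
            lp_len = (lp >> 17) & 0x7FFF   # bits 17-31: tuple length
            yield lp_off, lp_len, lp_flags

    # placeholder path to a heap file under /data/base
    for off, length, flags in page_items("/path/to/data/base/16384/16385", 0):
        print(off, length, flags)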

The better solution for accidentally deleted data would of course be
time travel, as it was implemented in the early days of Postgres. I guess it
would not be too hard to put it back, at least if we don't specify
wall-clock times but transaction numbers as the time qualification.

----------------
Hannu



Re: Recovery tools

From
mlw
Date:

Hannu Krosing wrote:

> Tom Lane wrote on Mon, 14.04.2003 at 17:26:
> > mlw <pgsql@mohawksoft.com> writes:
> > > Just suppose that all the log files are gone, and the only thing left is
> > > some of the files in the /data directory. Is there any way to scan this
> > > data and dump it to a file which could subsequently be used with a "COPY
> > > FROM STDIN" on a new database?
> >
> > There aren't separate tools, and I'm not sure there could or should be.
> > What I'd do in that situation is:
> >     * pg_resetxlog to get a minimally valid xlog
> >     * if clog is missing, gin up files containing 0x55 everywhere
> >       (to make it look like every transaction has committed --- or
> >       put 00 everywhere if you'd rather assume that recent
> >       transactions all aborted)
> >     * start postmaster, look around, fix problems until I can pg_dump.
> >
> > AFAICS, you can make tools that work at the page/item level (like
> > pg_filedump, see http://sources.redhat.com/rhdb/), but there is hardly
> > any scope for doing anything intermediate between that and a full
> > postmaster.  There's no hope of decoding the contents of a tuple without
> > access to the table's tuple descriptor, which means you need most of the
> > system catalog mechanisms; plus you'd need the I/O routines for the
> > datatypes involved.  Might as well just use the postmaster as your data
> > inspection tool.
>
> I have a small python script that does a dump at the page/tuple level and
> lets people write their own tuple decoding routines.
>
> This could probably be used here too, as usually the original scripts
> that created the database can be found and the tuples' internal structure
> is not too complicated.
So, what if we added a PostgreSQL routine that automatically dumps the tuple
layout when a table is created, and puts it in a directory like
/data/layout/db/ffff.ext?

Then we could include a version of your program which reads the layout and
dumps a table. That would address Tom's concerns. It would certainly be an
important "when all else fails" tool that could save your ass.
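Something along these lines, just to make the idea concrete: pull the column
layout out of pg_attribute/pg_type and write it to a plain-text file that a
standalone decoder could read later. The psql invocation, the file format and
the paths are only placeholders for the sake of the example; nothing like this
exists in the backend today:

    # Illustration only: save a table's column layout to a text file so that a
    # standalone decoder could use it later, without a running postmaster.
    import subprocess

    def save_table_layout(dbname, table, outfile):
        """Write "attnum attname typname attlen attalign" lines for one table."""
        # Building SQL by string concatenation is fine for this sketch only;
        # don't do it with untrusted input.
        query = (
            "SELECT a.attnum, a.attname, t.typname, a.attlen, a.attalign "
            "FROM pg_attribute a JOIN pg_type t ON t.oid = a.atttypid "
            "WHERE a.attrelid = '" + table + "'::regclass AND a.attnum > 0 "
            "ORDER BY a.attnum"
        )
        rows = subprocess.run(
            ["psql", "-At", "-F", " ", "-d", dbname, "-c", query],
            capture_output=True, text=True, check=True,
        ).stdout
        with open(outfile, "w") as f:
            f.write(rows)

    # placeholder database, table and layout path
    save_table_layout("mydb", "mytable", "/path/to/data/layout/mydb/mytable.layout")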
