Thread: recover corrupted pg_controldata from WAL

recover corrupted pg_controldata from WAL

From
yuanjia lee
Date:

Hi, All

 

I am preparing to enhance pg_resetxlog to support recovering corrupted pg_control data from WAL. I have finished the code and am now testing it, but before I submit it for patch review, I want to discuss some issues here and get some advice.

 

Resetting the xlog works the same as before, except that the exact information is now extracted from WAL. I have also added a new option to recover only the pg_control file without touching the xlog files. My question is: should we separate the functionality of recovering the pg_control file from resetting the log? I also wonder whether we should pick a new name instead of pg_resetxlog; we could use a name like pg_xlog and put all the functionality (resetting the log, recovering the pg_control file, dumping the binary log) into the same tool.

 

The algorithm of searching the WAL is like this:

1.      Read the names of the segment files from the xlog directory and put them into a singly linked list, sorted in descending order by timeline, xlog id, and segment id. (Although the implementation uses only the latest file, the list can be used for features like dumping the log in the future.)

2.      Scan the records from the beginning of the latest segment file; whenever a checkpoint record is found, update the last-checkpoint information.

One concern with using only the last segment file is that, in some situations, the last checkpoint record may be not in the last segment file but in the one before it. This is a limitation: I could also search the previous segment file, but the current implementation uses only the last one.
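
The two steps above, together with the fallback to the previous segment file, could be sketched roughly as follows. This is only an illustration in Python, not the patch itself: it assumes segment file names are 24 hexadecimal digits (8 each for timeline, xlog id, and segment id), it orders segments by file name only, and the (kind, position) record tuples and all function names are hypothetical stand-ins for real WAL record decoding.

```python
import re
from typing import Iterable, List, Optional, Sequence, Tuple

# Assumption: a segment file name is 24 hex digits (timeline, xlog id, segment id).
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def segment_key(name: str) -> Optional[Tuple[int, int, int]]:
    """Split a segment file name into (timeline, xlog id, segment id)."""
    if SEGMENT_NAME.fullmatch(name) is None:
        return None  # not a segment file, e.g. a subdirectory or history file
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def segments_descending(names: Iterable[str]) -> List[str]:
    """Step 1: order segment names newest-first, judged by file name only."""
    keyed = []
    for name in names:
        key = segment_key(name)
        if key is not None:
            keyed.append((key, name))
    keyed.sort(reverse=True)
    return [name for _, name in keyed]


# Hypothetical stand-in for decoded WAL records: (kind, position) tuples.
Record = Tuple[str, int]


def last_checkpoint(segments_newest_first: Sequence[Sequence[Record]]) -> Optional[int]:
    """Step 2 plus fallback: scan each segment from its beginning and return
    the position of the last checkpoint in the newest segment that has one."""
    for records in segments_newest_first:
        found = None
        for kind, position in records:
            if kind == "checkpoint":
                found = position  # a later checkpoint overrides an earlier one
        if found is not None:
            return found
    return None
```

Passing only the first element of the ordered list to last_checkpoint corresponds to the current implementation; passing the whole list adds the search of the previous segment file.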

 

 

Regards

Yuanjia Lee



Re: recover corrupted pg_controldata from WAL

From
"Brusser, Michael"
Date:

I can’t contribute any technical advice, but as a user I have had to resort to using pg_resetxlog a few times.

I wonder if it is possible to make it more user-friendly. It was never clear to me whether it was sufficient to run it without any arguments and, if not, what those arguments should be.

 

Perhaps at a minimum the --help option could be extended?

Thank you,

Mike

 


From: yuanjia lee [mailto:yuanjia_pg@yahoo.com]
Sent: Thursday, July 21, 2005 7:09 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] recover corrupted pg_controldata from WAL

 

Hi, All

 

I am preparing to enhance pg_resetxlog to support recovering corrupted pg_control data from WAL. I have finished the code and am now testing it, but before I submit it for patch review, I want to discuss some issues here and get some advice.

… … … …

… .. … …

 

Regards

Yuanjia Lee


Re: recover corrupted pg_controldata from WAL

From
Tom Lane
Date:
yuanjia lee <yuanjia_pg@yahoo.com> writes:
> The algorithm of searching the WAL is like this:

> 1.      Read name of the segment files from xlog directory, and put all of their name into an one way list, the list is descending according to the time line, xlog id, segment id. (Although I use only the latest file in the implementation, but the list can be used for the feature like dump log in future.)
 

You do realize that in most situations, the segment files with the
newest-looking names have not been used yet, and contain older rather
than newer data?

When multiple timelines are present, I'm not sure I care for the
heuristic "use the highest timeline number", either.
        regards, tom lane


Re: recover corrupted pg_controldata from WAL

From
yuanjia lee
Date:
Hi Tom

I agree that it is wrong to rely on the information
in the file name itself. I will try to read
xlp_pageaddr from the segment header to figure out
which segment is the latest one.
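
A rough sketch of that idea in Python, for illustration only. The header layout used here (16-bit magic, 16-bit info, 32-bit timeline, then a page address made of a 32-bit xlogid and a 32-bit xrecoff, little-endian, no padding) is an assumption that has to be checked against xlog_internal.h for the server version and platform in question; the function names are hypothetical, and a real tool would also validate the magic value.

```python
import struct
from typing import Dict, NamedTuple, Optional

# ASSUMED layout of the page header at the start of a segment file:
# magic (uint16), info (uint16), timeline (uint32), xlogid (uint32), xrecoff (uint32).
PAGE_HEADER = struct.Struct("<HHIII")


class PageHeader(NamedTuple):
    magic: int
    info: int
    timeline: int
    xlogid: int   # high part of the assumed xlp_pageaddr
    xrecoff: int  # low part of the assumed xlp_pageaddr


def read_page_header(data: bytes) -> Optional[PageHeader]:
    """Decode the first page header of a segment, or None if it is too short."""
    if len(data) < PAGE_HEADER.size:
        return None
    return PageHeader(*PAGE_HEADER.unpack_from(data))


def newest_segment(first_pages: Dict[str, bytes]) -> Optional[str]:
    """Return the file name whose first page carries the highest page address,
    ignoring what the file name itself suggests."""
    best_name = None
    best_addr = None
    for name, data in first_pages.items():
        header = read_page_header(data)
        if header is None:
            continue
        addr = (header.xlogid, header.xrecoff)
        if best_addr is None or addr > best_addr:
            best_name, best_addr = name, addr
    return best_name
```

With this, a recycled segment that has a newer-looking name but still holds old pages is not mistaken for the latest one.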

In the multiple-timeline scenario, if the pg_control
file is corrupted, the current timeline information
is lost. Although we could let the user select
among the possible timelines, the implementation so
far uses the highest timeline number.
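
To make the two choices concrete, here is a minimal Python sketch of letting the user pick a timeline while falling back to the highest one. The function name and the requested parameter are hypothetical and only illustrate the behaviour described above; they are not taken from the patch.

```python
from typing import Iterable, Optional


def choose_timeline(available: Iterable[int], requested: Optional[int] = None) -> int:
    """Pick the timeline to recover from.

    With no request, fall back to the highest timeline found (the current
    behaviour); otherwise honour the user's choice if that timeline exists.
    """
    timelines = sorted(set(available))
    if not timelines:
        raise ValueError("no timelines found in the xlog directory")
    if requested is None:
        return timelines[-1]
    if requested not in timelines:
        raise ValueError(f"timeline {requested} not found; available: {timelines}")
    return requested
```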

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> yuanjia lee <yuanjia_pg@yahoo.com> writes:
> > The algorithm of searching the WAL is like this:
> 
> > 1.      Read name of the segment files from xlog
> directory, and put all of their name into an one way
> list, the list is descending according to the time
> line, xlog id, segment id. (Although I use only the
> latest file in the implementation, but the list can
> be used for the feature like dump log in future.)
> 
> You do realize that in most situations, the segment
> files with the
> newest-looking names have not been used yet, and
> contain older rather
> than newer data?
> 
> When multiple timelines are present, I'm not sure I
> care for the
> heuristic "use the highest timeline number", either.
> 
>             regards, tom lane
> 


    