recover corrupted pg_controldata from WAL

Started by yuanjia lee, over 20 years ago, 4 messages
#1 yuanjia lee
yuanjia_pg@yahoo.com

Hi, All

I am preparing to enhance pg_resetxlog so that it can recover a corrupted pg_control file from the WAL. I have finished the code and am testing it now, but before I submit it for patch review, I want to discuss some issues here and get some advice.

Resetting the xlog works the same as before, except that the exact values are now extracted from the WAL. I have also added a new option that recovers only the pg_control file and does not touch the xlog files. My question is: should we separate recovering the pg_control file from resetting the log? I also wonder whether we should pick a new name instead of pg_resetxlog. We could use a name like pg_xlog and put all of the functionality (resetting the log, recovering the pg_control file, dumping the binary log) into the same tool.

The algorithm for searching the WAL is like this:

1. Read the names of the segment files from the xlog directory and put them into a singly linked list, sorted in descending order by timeline, xlog id, and segment id. (Although the implementation uses only the latest file, the list can also serve future features such as dumping the log.)

2. Scan the records from the beginning of the latest segment file; whenever a checkpoint record is found, update the last-checkpoint information.
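
The ordering in step 1 can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch. It assumes the usual 24-hex-digit segment file name (8 digits each for timeline, xlog id, and segment id), and the function names `parse_segment_name` and `sort_segments_descending` are made up for the example.

```python
import re

# Assumed layout of a WAL segment file name: 8 hex digits each for
# timeline, xlog id, and segment id (24 hex digits in total).
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def parse_segment_name(name):
    """Return (timeline, xlog_id, seg_id), or None if not a segment file."""
    if not SEGMENT_NAME.match(name):
        return None
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def sort_segments_descending(names):
    """Keep only segment files, newest-looking name first."""
    segments = [n for n in names if parse_segment_name(n) is not None]
    return sorted(segments, key=parse_segment_name, reverse=True)
```

Calling `sort_segments_descending(os.listdir(xlog_dir))` would skip entries such as `archive_status` or timeline history files, because they do not match the assumed name pattern. The order reflects the names only and says nothing about what the files contain.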

One concern with using only the last segment file is that, in some situations, the last checkpoint record may not be in the last segment file but in the one before it. This is a limitation. I could also search the previous segment file, but the current implementation uses only the last one.
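
A fallback to the previous segment file could look something like the sketch below. It is deliberately abstract: `read_records` is a hypothetical callable that yields `(kind, location)` tuples for one segment, standing in for the real record decoding, and `find_last_checkpoint` and `max_segments` are names invented for this example.

```python
def find_last_checkpoint(segments_newest_first, read_records, max_segments=2):
    """Scan up to max_segments segments, newest first, for a checkpoint.

    read_records is a hypothetical callable that yields (kind, location)
    tuples for one segment, in the order they appear in the file.
    """
    for segment in segments_newest_first[:max_segments]:
        last_checkpoint = None
        for kind, location in read_records(segment):
            if kind == "checkpoint":
                # A later checkpoint replaces an earlier one, so the
                # last checkpoint in the segment wins.
                last_checkpoint = (segment, location)
        if last_checkpoint is not None:
            return last_checkpoint
    # No checkpoint found in any of the segments that were scanned.
    return None
```

With `max_segments=1` this behaves like the current implementation and returns `None` when the latest segment holds no checkpoint; with the default of 2 it also looks at the segment before it.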

Regards

Yuanjia Lee


#2 Brusser, Michael
Michael.Brusser@matrixone.com
In reply to: yuanjia lee (#1)
Re: recover corrupted pg_controldata from WAL

I can't contribute any technical advice, but as a user I have had to resort
to using pg_resetxlog a few times.

I wonder if it is possible to make it more user-friendly. It was never clear
to me whether it was sufficient to run it without any arguments, and if not,
what these arguments should be.

Perhaps at a minimum the -help option could be extended?

Thank you,

Mike

_____

From: yuanjia lee [mailto:yuanjia_pg@yahoo.com]
Sent: Thursday, July 21, 2005 7:09 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] recover corrupted pg_controldata from WAL

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: yuanjia lee (#1)
Re: recover corrupted pg_controldata from WAL

yuanjia lee <yuanjia_pg@yahoo.com> writes:

The algorithm for searching the WAL is like this:

1. Read the names of the segment files from the xlog directory and put them into a singly linked list, sorted in descending order by timeline, xlog id, and segment id. (Although the implementation uses only the latest file, the list can also serve future features such as dumping the log.)

You do realize that in most situations, the segment files with the
newest-looking names have not been used yet, and contain older rather
than newer data?

When multiple timelines are present, I'm not sure I care for the
heuristic "use the highest timeline number", either.

regards, tom lane

#4 yuanjia lee
yuanjia_pg@yahoo.com
In reply to: Tom Lane (#3)
Re: recover corrupted pg_controldata from WAL

Hi Tom

I agree that it is wrong to rely on the information in
the file name itself. I will try to read
xlp_pageaddr from the segment header to figure out
which segment is the latest one.
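
Choosing the latest segment by its header rather than by its name might look roughly like this. It is a sketch under a stated assumption: the page header is taken to start with xlp_magic (16 bits), xlp_info (16 bits), xlp_tli (32 bits), and xlp_pageaddr as two 32-bit values (xlogid, xrecoff), in native byte order. The real layout depends on the PostgreSQL version and should be checked against the source. The function names are invented for the example.

```python
import struct

# Assumed page header layout, native byte order without padding:
# xlp_magic (uint16), xlp_info (uint16), xlp_tli (uint32),
# xlp_pageaddr as two uint32 values (xlogid, xrecoff).
PAGE_HEADER = struct.Struct("=HHIII")


def read_first_page_addr(path):
    """Return (xlogid, xrecoff) from the first page header, or None."""
    with open(path, "rb") as f:
        raw = f.read(PAGE_HEADER.size)
    if len(raw) < PAGE_HEADER.size:
        # File is too short to contain a page header.
        return None
    _magic, _info, _tli, xlogid, xrecoff = PAGE_HEADER.unpack(raw)
    return xlogid, xrecoff


def latest_segment_by_header(paths):
    """Pick the path whose first page carries the highest address."""
    candidates = []
    for path in paths:
        addr = read_first_page_addr(path)
        if addr is not None:
            candidates.append((addr, path))
    return max(candidates)[1] if candidates else None
```

A recycled segment keeps its old page contents under a newer-looking name, so its header address compares lower than that of the segment actually in use. The sketch does not validate xlp_magic and ignores xlp_tli; a real tool would need to check the former and decide how to treat the latter.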

In the multiple-timeline scenario, if the pg_control
file is corrupted, the current timeline information
is lost. Although we could let the user select
among the possible timelines, the implementation so
far uses the highest timeline number.
