AW: WAL does not recover gracefully from out-of-disk-space

Started by Zeugswetter Andreas SB, almost 25 years ago, 3 messages
#1 Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at

Regardless of whether this particular behavior is fixable, this brings
up something that I think we *must* do before 7.1 release: create a
utility that blows away a corrupted logfile to allow the system to
restart with whatever is in the datafiles. Otherwise, there is no
recovery technique for WAL restart failures, short of initdb and
restore from last backup. I'd rather be able to get at data of
questionable up-to-dateness than not have any chance of recovery
at all.

It would imho be great if this utility also had a means to extend a
logfile that was not extended to the full 16 MB, and if we reverted
the change that writes the whole file in the init phase.
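
A minimal sketch of what "extending a logfile to the full 16 MB" could
look like, assuming the missing tail can simply be zero-filled. The
function name, the chunk size, and the zero-fill rule are hypothetical
and are not taken from the PostgreSQL sources:

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # the full 16 MB logfile size mentioned above


def extend_segment(path, size=SEGMENT_SIZE, chunk=8192):
    """Zero-pad the file at `path` up to `size` bytes.

    Returns the number of bytes appended. Hypothetical helper: it only
    illustrates the idea and knows nothing about the real WAL format.
    """
    current = os.path.getsize(path)
    if current > size:
        raise ValueError(f"{path} is {current} bytes, larger than {size}")
    missing = size - current
    zeros = bytes(chunk)
    # Append mode keeps the existing log contents untouched.
    with open(path, "ab") as f:
        remaining = missing
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the padding reaches the disk
    return missing
```

Such a helper would let a short file be brought up to full length on
demand, which is the case the request above is concerned with.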

Imho this write at logfile init time adds a substantial amount of I/O
that would be better avoided. If we really need this, it would imho
be better to preallocate N logfiles and reuse them after checkpoint.
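
The preallocate-and-reuse idea can be sketched roughly as follows. This
is only an illustration under stated assumptions: the function names,
the "spare.NNNN" file naming, and the use of a plain rename to recycle
a segment are all hypothetical, not the actual PostgreSQL behavior:

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # full logfile size, as discussed above


def preallocate(directory, count, size=SEGMENT_SIZE):
    """Create `count` zero-filled spare segments and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"spare.{i:04d}")  # hypothetical naming
        with open(path, "wb") as f:
            f.write(bytes(size))  # pay the full-size write once, up front
            f.flush()
            os.fsync(f.fileno())
        paths.append(path)
    return paths


def recycle(old_path, new_path):
    """Reuse a segment that is no longer needed after a checkpoint.

    The file keeps its full length, so no new disk space has to be
    allocated; later writes just overwrite the old contents.
    """
    os.rename(old_path, new_path)
    return new_path
```

The point of the design is that the expensive full-file write happens
N times at startup instead of every time a new logfile is needed, and
that a recycled file cannot run out of disk space while being filled.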

Andreas

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB (#1)
Re: AW: WAL does not recover gracefully from out-of-disk-space

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

Imho this write at logfile init time adds a substantial amount of I/O
that would be better avoided. If we really need this, it would imho
be better to preallocate N logfiles and reuse them after checkpoint.

Already done. See the WAL_FILES parameter.

regards, tom lane

#3 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Zeugswetter Andreas SB (#1)
Re: AW: WAL does not recover gracefully from out-of-disk-space


Regardless of whether this particular behavior is fixable, this brings
up something that I think we *must* do before 7.1 release: create a
utility that blows away a corrupted logfile to allow the system to
restart with whatever is in the datafiles. Otherwise, there is no
recovery technique for WAL restart failures, short of initdb and
restore from last backup. I'd rather be able to get at data of
questionable up-to-dateness than not have any chance of recovery
at all.

Update OPEN ITEMS list:

Source Code Changes
-------------------
Allow recovery from corrupted WAL file
Finalize commit_delay value
