RE: Re: Loading optimization

Started by Mikheev, Vadim, about 25 years ago (8 messages)
#1Mikheev, Vadim
vmikheev@SECTORBASE.COM

This is OK for table files, unless someone's broken the
code that will auto-initialize a zero page when it comes across one.

Hmmm, I don't see anything like auto-initialization in code -:(
Where did you put these changes?

I didn't put 'em in, it looked like your work to me: see vacuum.c,
lines 618-622 in current sources.

Oh, this code was there from 6.0 days.

Awhile back I did fix PageGetFreeSpace and some related macros to
deliver sane results when looking at an all-zero page header, so that
scans and inserts would ignore the page until vacuum fixes it.
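
As a rough illustration of the behaviour described here, the sketch below treats a page whose header is all zeros as offering no free space, so an inserter would skip it until something initializes it. The two-field header layout and the names `init_page`, `page_is_new` and `page_free_space` are assumptions made for this example (Python, not the backend's C macros). Only the idea that `PageGetFreeSpace` should give a sane answer on a zeroed header comes from the message itself.

```python
import struct

PAGE_SIZE = 8192
# Hypothetical header: two little-endian uint16 fields (pd_lower, pd_upper).
HEADER = struct.Struct("<HH")


def init_page() -> bytes:
    """Return an initialized, empty page: free space runs from header end to page end."""
    page = bytearray(PAGE_SIZE)
    HEADER.pack_into(page, 0, HEADER.size, PAGE_SIZE)
    return bytes(page)


def page_is_new(page: bytes) -> bool:
    """Treat a zero pd_upper as the mark of a never-initialized page."""
    _pd_lower, pd_upper = HEADER.unpack_from(page, 0)
    return pd_upper == 0


def page_free_space(page: bytes) -> int:
    """Bytes between pd_lower and pd_upper; 0 for a zeroed or inconsistent header."""
    pd_lower, pd_upper = HEADER.unpack_from(page, 0)
    if pd_upper == 0 or pd_upper < pd_lower:
        return 0
    return pd_upper - pd_lower


zero_page = bytes(PAGE_SIZE)
print("zero page is new:", page_is_new(zero_page))
print("zero page free space:", page_free_space(zero_page))
print("fresh page free space:", page_free_space(init_page()))
```

With this guard, a scan or insert that asks a zeroed page for room gets "none" back and moves on, which matches the "ignore the page until vacuum fixes it" behaviour.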

I see now - PageGetMaxOffsetNumber... Ok.

Perhaps WAL redo needs to be prepared to do PageInit as well?

It calls PageIsNew and uses a flag in the record to know when a page could
be uninitialized.

Actually, I'd expect the CRC check to catch an all-zeroes page (if
it fails to complain, then you misimplemented the CRC), so that would
be the place to deal with it now.

I've used standard CRC32 implementation you pointed me to -:)
But CRC is used in WAL records only.
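
The claim that a CRC check should reject an all-zeroes page can be tried out with a small sketch. It assumes a hypothetical block layout with the CRC stored in the last 4 bytes, and it uses Python's `zlib.crc32` as a stand-in for whatever CRC32 routine the backend uses. Because the standard CRC-32 presets and inverts its register, a run of zero bytes does not checksum to zero, so a zeroed block with a zeroed CRC field fails the comparison.

```python
import struct
import zlib

PAGE_SIZE = 8192
CRC_SIZE = 4  # hypothetical layout: CRC kept in the last 4 bytes of the block


def seal_page(payload: bytes) -> bytes:
    """Append the CRC-32 of the payload, producing a full block."""
    if len(payload) != PAGE_SIZE - CRC_SIZE:
        raise ValueError("payload must fill the page minus the CRC field")
    return payload + struct.pack("<I", zlib.crc32(payload))


def page_crc_ok(page: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored one."""
    payload = page[:-CRC_SIZE]
    (stored,) = struct.unpack("<I", page[-CRC_SIZE:])
    return zlib.crc32(payload) == stored


zero_page = bytes(PAGE_SIZE)                      # never written, CRC field is 0
sealed = seal_page(bytes(PAGE_SIZE - CRC_SIZE))   # zero payload, proper CRC
print("all-zero page passes:", page_crc_ok(zero_page))
print("sealed page passes:", page_crc_ok(sealed))
```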

Vadim

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mikheev, Vadim (#1)
CRCs (was Re: [GENERAL] Re: Loading optimization)

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

Actually, I'd expect the CRC check to catch an all-zeroes page (if
it fails to complain, then you misimplemented the CRC), so that would
be the place to deal with it now.

I've used standard CRC32 implementation you pointed me to -:)
But CRC is used in WAL records only.

Oh. I thought we'd agreed that a CRC on each stored disk block would
be a good idea as well. I take it you didn't do that.

Do we want to consider doing this (and forcing another initdb)?
Or shall we say "too late for 7.1"?

regards, tom lane

#3Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#2)
Re: CRCs (was Re: [GENERAL] Re: Loading optimization)

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

Actually, I'd expect the CRC check to catch an all-zeroes page (if
it fails to complain, then you misimplemented the CRC), so that would
be the place to deal with it now.

I've used standard CRC32 implementation you pointed me to -:)
But CRC is used in WAL records only.

Oh. I thought we'd agreed that a CRC on each stored disk block would
be a good idea as well. I take it you didn't do that.

No, I thought we agreed disk block CRC was way overkill. If the CRC on
the WAL log checks for errors that are not checked anywhere else, then
fine, but I thought disk CRC would just duplicate the I/O subsystem/disk
checks.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: CRCs (was Re: [GENERAL] Re: Loading optimization)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Oh. I thought we'd agreed that a CRC on each stored disk block would
be a good idea as well. I take it you didn't do that.

No, I thought we agreed disk block CRC was way overkill. If the CRC on
the WAL log checks for errors that are not checked anywhere else, then
fine, but I thought disk CRC would just duplicate the I/O subsystem/disk
checks.

A disk-block CRC would detect partially written blocks (ie, power drops
after disk has written M of the N sectors in a block). The disk's own
checks will NOT consider this condition a failure. I'm not convinced
that WAL will reliably detect it either (Vadim?). Certainly WAL will
not help for corruption caused by external agents, away from any updates
that are actually being performed/logged.
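
The partial-write case can be simulated in a few lines. This is a minimal sketch under stated assumptions: 512-byte sectors, 16 sectors per block, a CRC in the last 4 bytes of the block, and `zlib.crc32` standing in for the real checksum. The helper names are invented for the example. Every sector in the torn block is individually intact, which is why per-sector disk checks would stay silent, yet the block as a whole no longer matches its CRC.

```python
import struct
import zlib

SECTOR = 512
SECTORS_PER_BLOCK = 16
BLOCK = SECTOR * SECTORS_PER_BLOCK
CRC_SIZE = 4  # hypothetical layout: CRC in the last 4 bytes of the block


def seal(payload: bytes) -> bytes:
    """Append the CRC-32 of the payload, producing a full block."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def crc_ok(block: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored one."""
    (stored,) = struct.unpack("<I", block[-CRC_SIZE:])
    return zlib.crc32(block[:-CRC_SIZE]) == stored


def torn_write(old_block: bytes, new_block: bytes, sectors_written: int) -> bytes:
    """Simulate power loss after the first `sectors_written` sectors reached the disk."""
    cut = sectors_written * SECTOR
    return new_block[:cut] + old_block[cut:]


old_payload = bytearray(BLOCK - CRC_SIZE)
new_payload = bytearray(old_payload)
new_payload[0] = 0x01            # a change in the first sector
new_payload[BLOCK - 100] = 0x7F  # a change in the last sector

old_block = seal(bytes(old_payload))
new_block = seal(bytes(new_payload))
torn = torn_write(old_block, new_block, sectors_written=1)

print("old block ok:", crc_ok(old_block))
print("new block ok:", crc_ok(new_block))
print("torn block ok:", crc_ok(torn))
```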

regards, tom lane

#5Philip Warner
pjw@rhyme.com.au
In reply to: Tom Lane (#2)
Re: CRCs (was Re: [GENERAL] Re: Loading optimization)

At 21:55 11/01/01 -0500, Tom Lane wrote:

Oh. I thought we'd agreed that a CRC on each stored disk block would
be a good idea as well. I take it you didn't do that.

Do we want to consider doing this (and forcing another initdb)?
Or shall we say "too late for 7.1"?

I thought it was coming too. I'd like to see it - if it's not too hard in
this release.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd.
(A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request,
and from pgp5.ai.mit.edu:11371

#6Vadim Mikheev
vmikheev@sectorbase.com
In reply to: Mikheev, Vadim (#1)
Re: CRCs (was Re: [GENERAL] Re: Loading optimization)

But CRC is used in WAL records only.

Oh. I thought we'd agreed that a CRC on each stored disk block would
be a good idea as well. I take it you didn't do that.

Do we want to consider doing this (and forcing another initdb)?
Or shall we say "too late for 7.1"?

I personally never agreed to this. Reasons?

Vadim

#7Vadim Mikheev
vmikheev@sectorbase.com
In reply to: Bruce Momjian (#3)
Re: CRCs (was Re: [GENERAL] Re: Loading optimization)

No, I thought we agreed disk block CRC was way overkill. If the CRC on
the WAL log checks for errors that are not checked anywhere else, then
fine, but I thought disk CRC would just duplicate the I/O subsystem/disk
checks.

A disk-block CRC would detect partially written blocks (ie, power drops
after disk has written M of the N sectors in a block). The disk's own
checks will NOT consider this condition a failure. I'm not convinced
that WAL will reliably detect it either (Vadim?). Certainly WAL will

The "physical log" idea proposed by Andreas is implemented!
WAL now saves whole data blocks on their first modification after a
checkpoint. This way, on recovery, modified data blocks will first be
restored *as a whole*. Isn't that much better than just detecting
partial writes?

Only one type of modification isn't covered at the moment:
updates to t_infomask of heap tuples.
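
The "physical log" scheme can be sketched as a toy model. Everything here is an assumption made for illustration: the `TinyWal` class, its record shapes, and in particular the choice to log the block image as it was before the first change and replay later changes on top of it (the message does not say which image is saved). What it shows is only the property claimed above: a block torn by a crash is restored as a whole during recovery.

```python
class TinyWal:
    """Toy model: a page cache, a disk, and a log kept since the last checkpoint."""

    def __init__(self, n_pages: int, page_size: int = 64) -> None:
        self.page_size = page_size
        self.disk = [bytes(page_size) for _ in range(n_pages)]
        self.cache = [bytearray(page_size) for _ in range(n_pages)]
        self.log = []        # ("image", page_no, bytes) or ("delta", page_no, offset, bytes)
        self.imaged = set()  # pages already saved whole since the checkpoint

    def checkpoint(self) -> None:
        """Write every page out completely, then start a fresh log."""
        self.disk = [bytes(p) for p in self.cache]
        self.log.clear()
        self.imaged.clear()

    def modify(self, page_no: int, offset: int, data: bytes) -> None:
        """Log the change (plus a whole-block image on first touch), then apply it."""
        if offset + len(data) > self.page_size:
            raise ValueError("change does not fit in the page")
        if page_no not in self.imaged:
            # First change since the checkpoint: save the whole block first.
            self.log.append(("image", page_no, bytes(self.cache[page_no])))
            self.imaged.add(page_no)
        self.log.append(("delta", page_no, offset, bytes(data)))
        self.cache[page_no][offset:offset + len(data)] = data

    def flush(self, page_no: int, bytes_written=None) -> None:
        """Write a page to disk; a short `bytes_written` simulates a torn write."""
        n = self.page_size if bytes_written is None else bytes_written
        old = self.disk[page_no]
        self.disk[page_no] = bytes(self.cache[page_no][:n]) + old[n:]

    def recover(self) -> None:
        """After a crash: reload pages from disk, then replay the log in order."""
        self.cache = [bytearray(p) for p in self.disk]
        for rec in self.log:
            if rec[0] == "image":
                _, page_no, image = rec
                self.cache[page_no][:] = image  # restore the block as a whole
            else:
                _, page_no, offset, data = rec
                self.cache[page_no][offset:offset + len(data)] = data
        self.disk = [bytes(p) for p in self.cache]


wal = TinyWal(n_pages=1)
wal.checkpoint()
wal.modify(0, 0, b"AAAA")
wal.modify(0, 60, b"BBBB")
wal.flush(0, bytes_written=32)  # crash: only the first half of the page hit disk
print("before recovery:", wal.disk[0][:4], wal.disk[0][60:])
wal.recover()
print("after recovery: ", wal.disk[0][:4], wal.disk[0][60:])
```

Because recovery overwrites the whole block before replaying the later changes, it does not matter which part of the torn block reached the disk.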

not help for corruption caused by external agents, away from any updates
that are actually being performed/logged.

What do you mean by "external agents"?

Vadim

#8Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Vadim Mikheev (#7)
AW: CRCs (was Re: [GENERAL] Re: Loading optimization)

A disk-block CRC would detect partially written blocks (ie, power drops
after disk has written M of the N sectors in a block). The disk's own
checks will NOT consider this condition a failure.

But physical log recovery will rewrite every page that was changed
after the last checkpoint, so this is no longer an issue.

I'm not convinced
that WAL will reliably detect it either (Vadim?). Certainly WAL will
not help for corruption caused by external agents, away from any updates
that are actually being performed/logged.

The external agent (if malevolent) could write a correct CRC anyway.
If, on the other hand, the agent writes complete garbage, vacuum will notice.
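
The first point, that a CRC offers no protection against a deliberate writer, is easy to demonstrate. The sketch below assumes the same hypothetical layout as before (CRC in the last 4 bytes of the block, `zlib.crc32` as the checksum) and invented helper names. Anyone able to overwrite the block can recompute the CRC over the altered contents, and the check then passes.

```python
import struct
import zlib

PAGE_SIZE = 8192
CRC_SIZE = 4  # hypothetical layout: CRC in the last 4 bytes of the block


def seal(payload: bytes) -> bytes:
    """Append the CRC-32 of the payload, producing a full block."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def crc_ok(page: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored one."""
    (stored,) = struct.unpack("<I", page[-CRC_SIZE:])
    return zlib.crc32(page[:-CRC_SIZE]) == stored


def tamper(page: bytes, offset: int, data: bytes) -> bytes:
    """Overwrite part of the payload and recompute the CRC, as a deliberate agent could."""
    payload = bytearray(page[:-CRC_SIZE])
    payload[offset:offset + len(data)] = data
    return seal(bytes(payload))


good = seal(bytes(PAGE_SIZE - CRC_SIZE))
bad = tamper(good, 100, b"evil")
print("contents changed:", bad != good)
print("tampered page still passes:", crc_ok(bad))
```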

Andreas