Page Checksums

Started by David Fetter, over 14 years ago | 80 messages | hackers

david@fetter.org

over 14 years ago

Folks,

What:

Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.

How:
In order to ensure that the checksum actually matches the hint
bits, this makes a copy of the page, calculates the checksum, then
sends the checksum and copy to the kernel, which handles sending
it the rest of the way to persistent storage.

Why:
My employer, VMware, thinks it's a good thing, and has dedicated
engineering resources to it. Lots of people's data is already in
cosmic ray territory, and many others' data will be soon. And
it's a TODO :)

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are. As far as we've been able
to determine so far, it could expose on-disk corruption that wasn't
exposed before, but we see this as dealing with a previously
un-dealt-with failure rather than causing one.

Questions, comments and bug fixes are, of course, welcome.

Let the flames begin!

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: David Fetter (#1)

Re: Page Checksums

On 17.12.2011 23:33, David Fetter wrote:

What:

Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.

How:
In order to ensure that the checksum actually matches the hint
bits, this makes a copy of the page, calculates the checksum, then
sends the checksum and copy to the kernel, which handles sending
it the rest of the way to persistent storage.
...
If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.

Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

David Fetter

david@fetter.org

over 14 years ago

In reply to: Heikki Linnakangas (#2)

Re: Page Checksums

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

What:

Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.

How:
In order to ensure that the checksum actually matches the hint
bits, this makes a copy of the page, calculates the checksum, then
sends the checksum and copy to the kernel, which handles sending
it the rest of the way to persistent storage.
...
If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.

Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page. Instead, a copy of the page
has already hit storage before the torn write occurs.


Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: David Fetter (#3)

Re: Page Checksums

On 18.12.2011 10:54, David Fetter wrote:

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.

Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.

Doesn't help. Hint bit updates are not WAL-logged.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

David Fetter

david@fetter.org

over 14 years ago

In reply to: Heikki Linnakangas (#4)

Re: Page Checksums

On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:

On 18.12.2011 10:54, David Fetter wrote:

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.

Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.

Doesn't help. Hint bit updates are not WAL-logged.

What new failure modes are you envisioning for this case? Any way to
simulate them, even if it's by injecting faults into the source code?


Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: David Fetter (#5)

Re: Page Checksums

On 18.12.2011 20:44, David Fetter wrote:

On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:

On 18.12.2011 10:54, David Fetter wrote:

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.

Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.

Doesn't help. Hint bit updates are not WAL-logged.

What new failure modes are you envisioning for this case?

Umm, the one explained in the email I linked to... Let me try once more.
For the sake of keeping the example short, imagine that the PostgreSQL
block size is 8 bytes, and the OS block size is 4 bytes. The CRC is 1
byte, and is stored on the first byte of each page.

In the beginning, a page is in the buffer cache, and it looks like this:

AA 12 34 56 78 9A BC DE

AA is the checksum. Now a hint bit on the last byte is set, so that the
page in the shared buffer cache looks like this:

AA 12 34 56 78 9A BC DF

Now PostgreSQL wants to evict the page from the buffer cache, so it
recalculates the CRC. The page in the buffer cache now looks like this:

BB 12 34 56 78 9A BC DF

Now, PostgreSQL writes the page to the OS cache, with the write() system
call. It sits in the OS cache for a few seconds, and then the OS decides
to flush the first 4 bytes, ie. the first OS block, to disk. On disk,
you now have this:

BB 12 34 56 78 9A BC DE

If the server now crashes, before the OS has flushed the second half of
the PostgreSQL page to disk, you have a classic torn page. The updated
CRC made it to disk, but the hint bit did not. The CRC on disk is not
valid, for the rest of the contents of that page on disk.

Without CRCs, that's not a problem because the data is valid whether or
not the hint bit makes it to the disk. It's just a hint, after all. But
when you have a CRC on the page, the CRC is only valid if both the CRC
update *and* the hint bit update make it to disk, or neither.

So you've just turned an innocent torn page, which PostgreSQL tolerates
just fine, into a block with bad CRC.
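The walk-through above can be replayed in a few lines. This is only a sketch under stated assumptions: the 1-byte "checksum" is a plain byte sum modulo 256 rather than a real CRC, so the stored values differ from the AA/BB placeholders used in the example, and all function names are made up for illustration.

```python
OS_BLOCK = 4  # toy OS block size in bytes, as in the example above


def checksum(payload: bytes) -> int:
    """Toy 1-byte checksum: byte sum modulo 256 (an assumption, not a CRC)."""
    return sum(payload) % 256


def make_page(payload: bytes) -> bytes:
    """Build an 8-byte page: checksum in the first byte, then 7 payload bytes."""
    return bytes([checksum(payload)]) + payload


def is_valid(page: bytes) -> bool:
    """A page is valid when the stored checksum matches its payload."""
    return page[0] == checksum(page[1:])


# Page as it sits on disk before the hint bit is set.
old_page = make_page(bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE]))

# Set a hint bit in the last byte (DE -> DF) and recompute the checksum.
payload = bytearray(old_page[1:])
payload[-1] |= 0x01
new_page = make_page(bytes(payload))

# Crash after only the first OS block of the new page reached disk.
torn = new_page[:OS_BLOCK] + old_page[OS_BLOCK:]

print(is_valid(old_page), is_valid(new_page), is_valid(torn))
```

The torn page carries the new checksum with the old payload, so the check fails even though every tuple on the page is still correct.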

Any way to
simulate them, even if it's by injecting faults into the source code?

Hmm, it's hard to persuade the OS to suffer a torn page on purpose. What
you could do is split the write() call in mdwrite() into two. First
write the 1st half of the page, then the second. Then you can put a
breakpoint in between the writes, and kill the system before the 2nd
half is written.
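The split-write idea can be illustrated outside the server as well. The sketch below is not mdwrite(); it is a standalone Python toy that writes a page to a temporary file in two halves and raises a hypothetical SimulatedCrash exception where the breakpoint-and-kill would happen. The page contents reuse the AA/BB placeholder bytes from the example.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for killing the system between the two writes."""


def write_in_halves(fd: int, offset: int, page: bytes,
                    crash_between: bool = False) -> None:
    """Write a page as two separate write() calls instead of one."""
    half = len(page) // 2
    os.lseek(fd, offset, os.SEEK_SET)
    os.write(fd, page[:half])
    if crash_between:
        raise SimulatedCrash("stopped before the second half was written")
    os.write(fd, page[half:])


def demo() -> bytes:
    """Return the on-disk bytes after a crash in the middle of a page write."""
    old_page = bytes.fromhex("AA123456789ABCDE")
    new_page = bytes.fromhex("BB123456789ABCDF")
    fd, path = tempfile.mkstemp()
    try:
        write_in_halves(fd, 0, old_page)  # the page is fully on disk
        try:
            write_in_halves(fd, 0, new_page, crash_between=True)
        except SimulatedCrash:
            pass  # the "server" died here; only the first half was updated
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(old_page))
    finally:
        os.close(fd)
        os.remove(path)


torn = demo()
print(torn.hex(" ").upper())
```

What is left in the file is the first half of the new page followed by the second half of the old one, which is the torn state described above.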

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Peter Eisentraut

peter_e@gmx.net

over 14 years ago

In reply to: Heikki Linnakangas (#6)

Re: Page Checksums

On Sun, 2011-12-18 at 21:34 +0200, Heikki Linnakangas wrote:

On 18.12.2011 20:44, David Fetter wrote:

Any way to
simulate them, even if it's by injecting faults into the source code?

Hmm, it's hard to persuade the OS to suffer a torn page on purpose. What
you could do is split the write() call in mdwrite() into two. First
write the 1st half of the page, then the second. Then you can put a
breakpoint in between the writes, and kill the system before the 2nd
half is written.

Perhaps the Library-level Fault Injector (http://lfi.sf.net) could be
used to set up a test for this. (Not that I think you need one, but if
David wants to see it happen himself ...)

Jesper Krogh

jesper@krogh.cc

over 14 years ago

In reply to: Heikki Linnakangas (#4)

Re: Page Checksums

On 2011-12-18 11:19, Heikki Linnakangas wrote:

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.

Doesn't help. Hint bit updates are not WAL-logged.

I don't know if it would be seen as a "half baked feature".. or similar,
and I don't know if the hint bit problem is solvable at all, but I could
easily imagine checksumming just "skipping" the hint bits entirely.

It would still provide checksumming for the majority of the "data" sitting
underneath the system, and would still be extremely useful in my
eyes.

Jesper
--
Jesper

Bruce Momjian

bruce@momjian.us

over 14 years ago

In reply to: Jesper Krogh (#8)

Re: Page Checksums

On Sun, Dec 18, 2011 at 7:51 PM, Jesper Krogh <jesper@krogh.cc> wrote:

I don't know if it would be seen as a "half baked feature".. or similar,
and I don't know if the hint bit problem is solvable at all, but I could
easily imagine checksumming just "skipping" the hint bits entirely.

That was one approach discussed. The problem is that the hint bits are
currently in each heap tuple header which means the checksum code
would have to know a fair bit about the structure of the page format.
Also the closer people looked the more hint bits kept turning up
because the coding pattern had been copied to other places (the page
header has one, and index pointers have a hint bit indicating that the
target tuple is deleted, etc). And to make matters worse skipping
individual bits in varying places quickly becomes a big consumer of
cpu time since it means injecting logic into each iteration of the
checksum loop to mask out the bits.

So the general feeling was that we should move all the hint bits to a
dedicated part of the buffer so that they could all be skipped in a
simple way that doesn't depend on understanding the whole structure of
the page. That's not conceptually hard, it's just a fair amount of
work. I think that's where it was left off.
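A rough sketch of the two options described above, for illustration only. The page size, the offsets, the bit masks and the function names are all assumptions; nothing here reflects the real PostgreSQL page layout. CRC-32 from Python's zlib stands in for whatever checksum would actually be used.

```python
import zlib

PAGE_SIZE = 64          # toy page size (assumption)
HINT_REGION_START = 56  # hypothetical: the last 8 bytes hold all hint bits


def checksum_masking_scattered(page: bytes, hint_masks: dict) -> int:
    """Option 1: hint bits live all over the page.

    hint_masks maps byte offset -> bit mask of the hint bits in that byte.
    The checksum code has to know every such location and clear the bits
    before checksumming, which ties it to the page format.
    """
    scrubbed = bytearray(page)
    for offset, mask in hint_masks.items():
        scrubbed[offset] &= ~mask & 0xFF
    return zlib.crc32(bytes(scrubbed))


def checksum_skipping_region(page: bytes) -> int:
    """Option 2: all hint bits sit in one dedicated region.

    The checksum covers everything before that region, with no knowledge
    of tuple headers or index pointers.
    """
    return zlib.crc32(page[:HINT_REGION_START])


page = bytes(range(PAGE_SIZE))
masks = {5: 0x01, 20: 0x0F}  # hypothetical hint-bit locations

with_hint_flipped = bytearray(page)
with_hint_flipped[5] ^= 0x01
print(checksum_masking_scattered(page, masks)
      == checksum_masking_scattered(bytes(with_hint_flipped), masks))
```

In both variants a hint-bit change leaves the checksum untouched; the difference is how much the checksum code has to know about where the hint bits are.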

There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then but it might still be useful information. Right now
people don't really know if their system can experience torn pages or
not and having some way of detecting them could be useful. And if you
have other unexplained symptoms then having checksum errors might be
enough evidence that the investigation should start with the hardware
and get the sysadmin looking at hardware logs and running memtest
sooner.

--
greg

#10

Josh Berkus

josh@agliodbs.com

over 14 years ago

In reply to: Bruce Momjian (#9)

Re: Page Checksums

On 12/18/11 5:55 PM, Greg Stark wrote:

There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then but it might still be useful information. Right now
people don't really know if their system can experience torn pages or
not and having some way of detecting them could be useful. And if you
have other unexplained symptoms then having checksum errors might be
enough evidence that the investigation should start with the hardware
and get the sysadmin looking at hardware logs and running memtest
sooner.

Frankly, if I had torn pages, even if it was just hint bits missing, I
would want that to be logged. That's expected if you crash, but if you
start seeing bad CRC warnings when you haven't had a crash? That means
you have a HW problem.

As long as the CRC checks are by default warnings, then I don't see a
problem with this; it's certainly better than what we have now.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#11

Aidan Van Dyk

aidan@highrise.ca

over 14 years ago

In reply to: Josh Berkus (#10)

Re: Page Checksums

On Sun, Dec 18, 2011 at 11:21 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 12/18/11 5:55 PM, Greg Stark wrote:

There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then but it might still be useful information. Right now
people don't really know if their system can experience torn pages or
not and having some way of detecting them could be useful. And if you
have other unexplained symptoms then having checksum errors might be
enough evidence that the investigation should start with the hardware
and get the sysadmin looking at hardware logs and running memtest
sooner.

Frankly, if I had torn pages, even if it was just hint bits missing, I
would want that to be logged. That's expected if you crash, but if you
start seeing bad CRC warnings when you haven't had a crash? That means
you have a HW problem.

As long as the CRC checks are by default warnings, then I don't see a
problem with this; it's certainly better than what we have now.

But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a "non event" in
PostgreSQL *DESIGN*, on crash recovery, it doesn't do anything to try
and "scrub" every page in the database.

So you could have a crash, then a recovery, and a couple clean
shutdown-restart combinations before you happen to read the "needed"
page that was torn in the crash $X [ days | weeks | months ] ago.
It's specifically because PostgreSQL was *DESIGNED* to make torn pages
a non-event (because WAL/FPW fixes anything that's dangerous), that
the whole CRC issue is so complicated...

I'll throw out a few random thoughts (some repeated) that people who
really want the CRC can fight over:

1) Find a way to not bother writing out hint-bit-only-dirty pages....
I know people like Kevin keep recommending a vacuum freeze after a
big load to avoid later problems anyways and I think that's probably
common in big OLAP shops, and OLTP people are likely to have real
changes on the page anyways. Does anybody want to try and measure
what type of performance trade-offs we'd really have on a variety of
"normal" (ya, I know, what's normal) workloads? If the page has a
real change, it's got a WAL FPW, so we avoid the problem....

2) If the writer/checksummer knows it's a hint-bit-only-dirty page,
can it stuff a "cookie" checksum in it and not bother verifying?
Loses a bit of the CRC guarantee, especially around "crashes" which
is when we expect a torn page, but avoids the whole "scary! scary!
Your database is corrupt!" false-positives in the situation PostgreSQL
was specifically designed to make not scary.

3) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on reading a block with a wrong
checksum, if a warning is emitted, the timestamp could be looked at by
whoever is reading the warning and know that the block was written
shortly before the crash $X $PERIODS ago....
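Idea 2 above (the "cookie" checksum) could look roughly like the following. This is a sketch under assumptions: the sentinel value, the function names and the use of CRC-32 are all made up for illustration, and the caller is assumed to know whether a page is hint-bit-only dirty.

```python
import zlib

HINT_ONLY_COOKIE = 0xFFFFFFFF  # hypothetical sentinel meaning "do not verify"


def real_checksum(body: bytes) -> int:
    """CRC-32 of the page body, remapped so it never equals the cookie."""
    crc = zlib.crc32(body)
    return crc if crc != HINT_ONLY_COOKIE else 0


def checksum_for_write(body: bytes, hint_only_dirty: bool) -> int:
    """Value the writer stores alongside the page."""
    if hint_only_dirty:
        # No WAL full-page image protects this write, so a torn page could
        # not be told apart from corruption; store the cookie instead.
        return HINT_ONLY_COOKIE
    return real_checksum(body)


def verify(stored: int, body: bytes) -> str:
    """Reader-side check: 'ok', 'mismatch', or 'unverified' for cookie pages."""
    if stored == HINT_ONLY_COOKIE:
        return "unverified"
    return "ok" if stored == real_checksum(body) else "mismatch"


body = bytes.fromhex("123456789ABCDE")
print(verify(checksum_for_write(body, hint_only_dirty=False), body))
print(verify(checksum_for_write(body, hint_only_dirty=True), body))
```

The trade-off is visible in the "unverified" result: pages written with the cookie get no protection at all until a real change rewrites them.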

The whole "CRC is only a warning" because we "expect to get them if we
ever crashed" means that the time when we most want them, we have to
assume they are bogus... And to make matters worse, we don't even
know when the period of "they may be bogus" ends, unless we have a
way to methodically force PG through every buffer in the database after
the crash... And then that makes them very hard to consider
useful...

--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.

#12

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Josh Berkus (#10)

Re: Page Checksums

On Mon, Dec 19, 2011 at 4:21 AM, Josh Berkus <josh@agliodbs.com> wrote:

On 12/18/11 5:55 PM, Greg Stark wrote:

There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then but it might still be useful information. Right now
people don't really know if their system can experience torn pages or
not and having some way of detecting them could be useful. And if you
have other unexplained symptoms then having checksum errors might be
enough evidence that the investigation should start with the hardware
and get the sysadmin looking at hardware logs and running memtest
sooner.

Frankly, if I had torn pages, even if it was just hint bits missing, I
would want that to be logged. That's expected if you crash, but if you
start seeing bad CRC warnings when you haven't had a crash? That means
you have a HW problem.

As long as the CRC checks are by default warnings, then I don't see a
problem with this; it's certainly better than what we have now.

It is an important problem, and also a big one, hence why it still exists.

Throwing WARNINGs for normal events would not help anybody; thousands
of false positives would just make Postgres appear to be less robust
than it really is. That would be a credibility disaster. VMWare
already have their own distro, so if they like this patch they can use
it.

The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will take on the online upgrade
feature if others work on the page format issues, but none of this is
possible for 9.2, ISTM.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Simon Riggs (#12)

Re: Page Checksums

On Monday, December 19, 2011 12:10:11 PM Simon Riggs wrote:

The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will take on the online upgrade
feature if others work on the page format issues, but none of this is
possible for 9.2, ISTM.

Totally with you that it's not 9.2 material. But I think if somebody actually
wants to implement that, that person would need to start discussing and
implementing rather soon if it should be ready for 9.3. Just because it's not
geared towards the next release doesn't mean it's OT.

Andres

#14

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Simon Riggs (#12)

Re: Page Checksums

On Mon, Dec 19, 2011 at 6:10 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Throwing WARNINGs for normal events would not help anybody; thousands
of false positives would just make Postgres appear to be less robust
than it really is. That would be a credibility disaster. VMWare
already have their own distro, so if they like this patch they can use
it.

Agreed on all counts.

It seems to me that it would be possible to plug this hole by keeping
track of which pages in shared_buffers have had unlogged changes to
them since the last FPI. When you go to evict such a page, you write
some kind of WAL record for it - either an FPI, or maybe a partial
page image containing just the parts that might have been changed
(like all the tuple headers, or whatever). This would be expensive,
of course.
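The bookkeeping described above might be sketched like this. It is a toy model only: the class, its methods and the in-memory "WAL" list are hypothetical stand-ins, not PostgreSQL's buffer manager, and the record written at eviction is always a whole-page image rather than a partial one.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    page: bytearray
    # True once the page has changes that no WAL record covers
    # (for example hint bits set since the last full-page image).
    unlogged_since_fpi: bool = False


@dataclass
class ToyBufferPool:
    wal: list = field(default_factory=list)       # stand-in for the WAL stream
    disk: dict = field(default_factory=dict)      # block number -> bytes
    buffers: dict = field(default_factory=dict)   # block number -> Buffer

    def load(self, block: int, data: bytes) -> None:
        self.buffers[block] = Buffer(bytearray(data))

    def set_hint(self, block: int, offset: int, mask: int) -> None:
        """Set hint bits without WAL-logging, and remember that we did."""
        buf = self.buffers[block]
        buf.page[offset] |= mask
        buf.unlogged_since_fpi = True

    def evict(self, block: int) -> None:
        """Write the page out, logging a page image first if needed."""
        buf = self.buffers.pop(block)
        if buf.unlogged_since_fpi:
            # WAL goes first, so recovery can overwrite a torn data page.
            self.wal.append(("FPI", block, bytes(buf.page)))
        self.disk[block] = bytes(buf.page)


pool = ToyBufferPool()
pool.load(1, bytes(8))
pool.load(2, bytes(8))
pool.set_hint(1, 7, 0x01)
pool.evict(1)
pool.evict(2)
print(len(pool.wal))
```

Only block 1 produces a WAL record here, which shows where the extra cost comes from: every eviction of a page with unlogged hint-bit changes adds WAL volume.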

The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will take on the online upgrade
feature if others work on the page format issues, but none of this is
possible for 9.2, ISTM.

I'm not sure that I understand the dividing line you are drawing here.
However, with respect to the implementation of this particular
feature, it would be nice if we could arrange things so that space
cost of the feature need only be paid by people who are using it. I
think it would be regrettable if everyone had to give up 4 bytes per
page because some people want checksums. Maybe I'll feel differently
if it turns out that the overhead of turning on checksumming is
modest, but that's not what I'm expecting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: Aidan Van Dyk (#11)

Re: Page Checksums

* Aidan Van Dyk (aidan@highrise.ca) wrote:

But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a "non event" in
PostgreSQL *DESIGN*, on crash recovery, it doesn't do anything to try
and "scrub" every page in the database.

Fair enough, but, could we distinguish these two cases? In other words,
would it be possible to detect if a page was torn due to a 'traditional'
crash and not complain in that case, but complain if there's a CRC
failure and it *doesn't* look like a torn page?

Perhaps that's a stretch, but if we can figure out that a page is torn
already, then perhaps it's not so far fetched..

Thanks,

Stephen
(who is no expert on WAL/torn pages/etc)

#16

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: Aidan Van Dyk (#11)

Re: Page Checksums

* Aidan Van Dyk (aidan@highrise.ca) wrote:

3) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on reading a block with a wrong
checksum, if a warning is emitted, the timestamp could be looked at by
whoever is reading the warning and know that the block was written
shortly before the crash $X $PERIODS ago....

I do like the idea of putting the CRC info in a relation fork, if it can
be made to work decently, as we might be able to then support it on a
per-relation basis, and maybe even avoid the on-disk format change..

Of course, I'm sure there's all kinds of problems with that approach,
but it might be worth some thinking about.

Thanks,

Stephen

#17

Alvaro Herrera

alvherre@2ndquadrant.com

over 14 years ago

In reply to: Stephen Frost (#16)

Re: Page Checksums

Excerpts from Stephen Frost's message of Mon Dec 19 11:18:21 -0300 2011:

* Aidan Van Dyk (aidan@highrise.ca) wrote:

3) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on reading a block with a wrong
checksum, if a warning is emitted, the timestamp could be looked at by
whoever is reading the warning and know that the block was written
shortly before the crash $X $PERIODS ago....

I do like the idea of putting the CRC info in a relation fork, if it can
be made to work decently, as we might be able to then support it on a
per-relation basis, and maybe even avoid the on-disk format change..

Of course, I'm sure there's all kinds of problems with that approach,
but it might be worth some thinking about.

I think the main objection to that idea was that if you lose a single
page of CRCs you have hundreds of data pages which no longer have good
CRCs.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#18

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Stephen Frost (#15)

Re: Page Checksums

On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost <sfrost@snowman.net> wrote:

* Aidan Van Dyk (aidan@highrise.ca) wrote:

But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a "non event" in
PostgreSQL *DESIGN*, on crash recovery, it doesn't do anything to try
and "scrub" every page in the database.

Fair enough, but, could we distinguish these two cases? In other words,
would it be possible to detect if a page was torn due to a 'traditional'
crash and not complain in that case, but complain if there's a CRC
failure and it *doesn't* look like a torn page?

No.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19

David Fetter

david@fetter.org

over 14 years ago

In reply to: Robert Haas (#18)

Re: Page Checksums

On Mon, Dec 19, 2011 at 09:34:51AM -0500, Robert Haas wrote:

On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost <sfrost@snowman.net> wrote:

* Aidan Van Dyk (aidan@highrise.ca) wrote:

But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a "non event" in
PostgreSQL *DESIGN*, on crash recovery, it doesn't do anything to try
and "scrub" every page in the database.

Fair enough, but, could we distinguish these two cases? In other words,
would it be possible to detect if a page was torn due to a 'traditional'
crash and not complain in that case, but complain if there's a CRC
failure and it *doesn't* look like a torn page?

No.

Would you be so kind as to elucidate this a bit?


#20

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Alvaro Herrera (#17)

Re: Page Checksums

On Monday, December 19, 2011 03:33:22 PM Alvaro Herrera wrote:

Excerpts from Stephen Frost's message of Mon Dec 19 11:18:21 -0300 2011:

* Aidan Van Dyk (aidan@highrise.ca) wrote:

3) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on reading a block with a wrong
checksum, if a warning is emitted, the timestamp could be looked at by
whoever is reading the warning and know that the block was written
shortly before the crash $X $PERIODS ago....

I do like the idea of putting the CRC info in a relation fork, if it can
be made to work decently, as we might be able to then support it on a
per-relation basis, and maybe even avoid the on-disk format change..

Of course, I'm sure there's all kinds of problems with that approach,
but it might be worth some thinking about.

I think the main objection to that idea was that if you lose a single
page of CRCs you have hundreds of data pages which no longer have good
CRCs.

Which I find to be pretty much a non-argument, because there is lots of SPOF
data in a cluster (WAL, control record) anyway...
If recent data starts to fail you have to restore from backup anyway.

Andres

#21

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: David Fetter (#19)

#22

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: Andres Freund (#20)

#23

Greg Smith

gsmith@gregsmith.com

over 14 years ago

In reply to: Robert Haas (#14)

#24

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Greg Smith (#23)

#25

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: David Fetter (#19)

#26

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Kevin Grittner (#24)

#27

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Robert Haas (#26)

#28

Greg Smith

gsmith@gregsmith.com

over 14 years ago

In reply to: Kevin Grittner (#27)

#29

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Robert Haas (#25)

#30

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Greg Smith (#28)

#31

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Kevin Grittner (#27)

#32

Chris Browne

cbbrowne@acm.org

over 14 years ago

In reply to: Robert Haas (#31)

#33

Alvaro Herrera

alvherre@2ndquadrant.com

over 14 years ago

In reply to: Chris Browne (#32)

#34

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Robert Haas (#31)

#35

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Alvaro Herrera (#33)

#36

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Kevin Grittner (#35)

#37

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Simon Riggs (#12)

#38

Aidan Van Dyk

aidan@highrise.ca

over 14 years ago

In reply to: Kevin Grittner (#35)

#39

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Andres Freund (#36)

#40

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Tom Lane (#39)

#41

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Simon Riggs (#37)

#42

Jesper Krogh

jesper@krogh.cc

over 14 years ago

In reply to: Simon Riggs (#37)

#43

Jesper Krogh

jesper@krogh.cc

over 14 years ago

In reply to: Bruce Momjian (#9)

#44

Greg Smith

gsmith@gregsmith.com

over 14 years ago

In reply to: Kevin Grittner (#30)

#45

Leonardo Francalanci

m_lists@yahoo.it

over 14 years ago

In reply to: Kevin Grittner (#35)

#46

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: Leonardo Francalanci (#45)

#47

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Greg Smith (#44)

#48

Andres Freund

andres@anarazel.de

over 14 years ago

In reply to: Kevin Grittner (#47)

#49

Leonardo Francalanci

m_lists@yahoo.it

over 14 years ago

In reply to: Stephen Frost (#46)

#50

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#47)

#51

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Chris Browne (#32)

#52

Stephen Frost

sfrost@snowman.net

over 14 years ago

In reply to: Leonardo Francalanci (#49)

#53

Leonardo Francalanci

m_lists@yahoo.it

over 14 years ago

In reply to: Stephen Frost (#52)

#54

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Heikki Linnakangas (#50)

#55

Greg Smith

gsmith@gregsmith.com

over 14 years ago

In reply to: Stephen Frost (#52)

#56

Martijn van Oosterhout

kleptog@svana.org

over 14 years ago

In reply to: Leonardo Francalanci (#45)

#57

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Greg Smith (#55)

#58

Leonardo Francalanci

m_lists@yahoo.it

over 14 years ago

In reply to: Simon Riggs (#57)

#59

Bruce Momjian

bruce@momjian.us

over 14 years ago

In reply to: Kevin Grittner (#24)

#60

Jeff Davis

pgsql@j-davis.com

over 14 years ago

In reply to: Robert Haas (#14)

#61

Jeff Davis

pgsql@j-davis.com

over 14 years ago

In reply to: Bruce Momjian (#9)

#62

Jeff Davis

pgsql@j-davis.com

over 14 years ago

In reply to: Heikki Linnakangas (#29)

#63

Jeff Davis

pgsql@j-davis.com

over 14 years ago

In reply to: Bruce Momjian (#59)

#64

Robert Haas

robertmhaas@gmail.com

over 14 years ago

In reply to: Jeff Davis (#60)

#65

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Robert Haas (#64)

#66

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Robert Haas (#64)

#67

Jim Nasby

Jim.Nasby@BlueTreble.com

over 14 years ago

In reply to: Simon Riggs (#65)

#68

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Heikki Linnakangas (#29)

#69

Jim Nasby

Jim.Nasby@BlueTreble.com

over 14 years ago

In reply to: Simon Riggs (#68)

#70

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Jim Nasby (#69)

#71

Simon Riggs

simon@2ndQuadrant.com

over 14 years ago

In reply to: Heikki Linnakangas (#70)

#72

Benedikt Grundmann

bgrundmann@janestreet.com

over 14 years ago

In reply to: Simon Riggs (#71)

#73

Jim Nasby

Jim.Nasby@BlueTreble.com

about 14 years ago

In reply to: Simon Riggs (#71)

#74

Robert Treat

xzilla@users.sourceforge.net

about 14 years ago

In reply to: Jim Nasby (#73)

#75

Florian Weimer

fweimer@bfk.de

about 14 years ago

In reply to: Robert Treat (#74)

#76

Jesper Krogh

jesper@krogh.cc

about 14 years ago

In reply to: Florian Weimer (#75)

#77

Florian Weimer

fweimer@bfk.de

about 14 years ago

In reply to: Jesper Krogh (#76)

#78

Robert Treat

xzilla@users.sourceforge.net

about 14 years ago

In reply to: Jesper Krogh (#76)

#79

Simon Riggs

simon@2ndQuadrant.com

about 14 years ago

In reply to: Robert Treat (#78)

#80

Jim Nasby

Jim.Nasby@BlueTreble.com

about 14 years ago

In reply to: Simon Riggs (#79)
