How to determine a database is intact?
On a nightly basis, we shut the database down and do a file system backup.
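For reference, the nightly job amounts to roughly this (a sketch - the
paths and options here are illustrative, not our exact script):

    # stop the postmaster so the file copy is consistent
    pg_ctl stop -D /data/pgsql -m fast
    # archive the whole cluster directory
    tar czf /backup/pgsql-$(date +%Y%m%d).tar.gz -C /data pgsql
    # bring the database back up
    pg_ctl start -D /data/pgsql -l /data/pgsql/serverlog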
A short chronology of our database problem:
8/21 - count(*) of user tables succeeded (done once a week to get
statistics)
8/23 - A specific search on a specific value (one out of over 2 million)
caused postmaster to SEGV. I dropped the index in question and rebuilt it.
All appeared ok.
8/28 - count(*) failed - postmaster aborted on a SEGV. I attempted a
pg_dumpall. Postmaster aborted with SEGV about 240 MB into the dump (the
database is about 50 GB).
I reloaded the 8/25 database and attempted a pg_dumpall. Same failure.
I reloaded from the 8/21 file system dump. Pg_dumpall worked on this one.
I spent almost 48 hours dumping and reloading 240 million rows and
reprocessing the several million additions since 8/21. I didn't dare use
the 8/21 database without reloading because I didn't know if it was good or
not.
Question:
How can we tell that a database is intact? In the above example, pg_dumpall
worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23,
or was it already corrupt and got worse? Pg_dumpall tells you nothing about
the condition of indexes. Could a corrupt index corrupt data blocks?
I'm looking at doing a pg_dumpall on a weekly basis so that we have a point
in time where we know we have a recoverable database. When the database
reaches several hundred GB and over a billion rows, this isn't a great
solution, and doesn't address the overall database integrity.
Back to the original question... How can I verify the complete integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?
Wes
Hmm. I do a nightly dump of our production database, archive a copy
offsite, and verify the quality of the dump by running a little
verification script that is little more than a restore.
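Roughly this, with database and path names as placeholders:

    # dump, archive a copy, then verify by restoring into a scratch database
    pg_dump -Fc proddb > /archive/proddb-$(date +%Y%m%d).dump
    createdb verify_db
    pg_restore -d verify_db /archive/proddb-$(date +%Y%m%d).dump \
        && echo "restore OK"
    dropdb verify_db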
But if it would take you more than a day to do that, I'm not sure.
-tfo
On Sep 2, 2004, at 3:30 PM, Wes wrote:
Back to the original question... How can I verify the complete
integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?
Wes
Wes wrote:
On a nightly basis, we shut the database down and do a file system backup.
A short chronology of our database problem:
[snip]
Question:
How can we tell that a database is intact? In the above example, pg_dumpall
worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23,
or was it already corrupt and got worse? Pg_dumpall tells you nothing about
the condition of indexes. Could a corrupt index corrupt data blocks?
I'm looking at doing a pg_dumpall on a weekly basis so that we have a point
in time where we know we have a recoverable database. When the database
reaches several hundred GB and over a billion rows, this isn't a great
solution, and doesn't address the overall database integrity.
Back to the original question... How can I verify the complete integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?
You shouldn't have to verify anything. PG's job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can't guarantee it's not PG but it's
record of reliability is pretty good.
Steps I'd take:
1. Check your version number against the release notes and see if you
should upgrade. You don't mention your version, but it's always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5) - a one-line check;
see the sketch after this list.
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Set up Slony or some other replication so I can schedule my downtime.
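For step 1, the check itself is trivial (assuming psql can reach the
server):

    # report the running server version to compare against the release notes
    psql -At -c "SELECT version();" template1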
--
Richard Huxton
Archonet Ltd
On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:
You shouldn't have to verify anything. PG's job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can't guarantee it's not PG but its
record of reliability is pretty good.
I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
NULL or out of range). The problem is not occurring on a daily basis. The
database has been in service since December of last year. It's just that
the symptoms progressed from no apparent symptoms, to a clearly corrupt DB.
My guess is that some minor corruption fed upon itself until the DB couldn't
even be dumped.
Steps I'd take:
1. Check your version number against the release notes and see if you
should upgrade. You don't mention your version, but it's always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Set up Slony or some other replication so I can schedule my downtime.
I thought I mentioned the level in my original mail - 7.4.1. We are
planning on running some diagnostics.
Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
doesn't really matter to the original question. The database can become
corrupt. How can I tell that a database is fully intact at any given point
in time? If I reload from a system backup before the known corruption, how
can I be sure that the original corruption that precipitated the failure is
not still there and will again rear its ugly head?
Wes
Wes wrote:
On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:
You shouldn't have to verify anything. PG's job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can't guarantee it's not PG but its
record of reliability is pretty good.
I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
NULL or out of range). The problem is not occurring on a daily basis. The
database has been in service since December of last year. It's just that
the symptoms progressed from no apparent symptoms, to a clearly corrupt DB.
My guess is that some minor corruption fed upon itself until the DB couldn't
even be dumped.
Or it may simply be that the corrupt index block was never read until then.
Steps I'd take:
1. Check your version number against the release notes and see if you
should upgrade. You don't mention your version, but it's always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Set up Slony or some other replication so I can schedule my downtime.
I thought I mentioned the level in my original mail - 7.4.1. We are
planning on running some diagnostics.
Ah - first thing you can do is move to 7.4.5; that won't require a
dump/reload. Do read the release notes first though.
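Minor releases in the same 7.4.x line keep the on-disk format, so the
move is just a binary swap - a sketch, with paths depending on your
installation:

    # stop the 7.4.1 postmaster
    pg_ctl stop -D /data/pgsql -m fast
    # install the 7.4.5 binaries over the old ones (package manager, or
    # configure / make / make install), then restart on the same data dir
    pg_ctl start -D /data/pgsql -l /data/pgsql/serverlog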
Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
doesn't really matter to the original question. The database can become
corrupt. How can I tell that a database is fully intact at any given point
in time? If I reload from a system backup before the known corruption, how
can I be sure that the original corruption that precipitated the failure is
not still there and will again rear its ugly head?
Put bluntly, you can't. The only way to verify the database as a whole
is to check every single value in it. If actual values get corrupted
then you may never even notice (e.g. a text field with a single
character corrupted).
However, if you dump and restore then three things can be guaranteed:
1. All values are valid for their type
2. All indexes are rebuilt
3. Constraints will be satisfied on all data.
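And the check needn't touch the production box; a scratch cluster on any
machine with enough disk will do (a sketch - the paths are made up):

    # load the weekly pg_dumpall script into a throwaway cluster
    initdb -D /scratch/pgdata
    pg_ctl start -D /scratch/pgdata -l /scratch/pg.log
    psql -f /archive/weekly-dumpall.sql template1
    pg_ctl stop -D /scratch/pgdata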
Is that good enough in your case?
--
Richard Huxton
Archonet Ltd
On 9/3/04 10:14 AM, "Richard Huxton" <dev@archonet.com> wrote:
Put bluntly, you can't. The only way to verify the database as a whole
is to check every single value in it. If actual values get corrupted
then you may never even notice (e.g. a text field with a single
character corrupted).
However, if you dump and restore then three things can be guaranteed:
1. All values are valid for their type
2. All indexes are rebuilt
3. Constraints will be satisfied on all data.
Is that good enough in your case?
No, a dump/reload isn't feasible. Right now, it takes about 24 hours to do
the actual pg_dumpall and reload. When the database peaks in size and
records start being aged out, multiply that by 4 (at least). Obviously,
this isn't a check you can do weekly.
I was hoping there might be a utility to scan the entire database for
consistency (something akin to running VACUUM FULL ANALYZE). Obviously, that would
require accessing every row and every index value.
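Lacking such a tool, the closest in-place approximation I can think of is
forcing reads of every heap block and coaxing the planner through an
index - a sketch, with mydb, big_table and the indexed column id as
stand-ins:

    # read every heap block via a sequential scan
    psql -c "SELECT count(*) FROM big_table;" mydb
    # disable seqscans so an index on id gets walked instead (this only
    # exercises indexes the planner actually picks)
    psql -c "SET enable_seqscan = off; SELECT count(id) FROM big_table WHERE id > 0;" mydb

That still proves little beyond "nothing crashed", though.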
Wes
On 9/3/2004 10:59 AM, Wes wrote:
On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:
You shouldn't have to verify anything. PG's job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can't guarantee it's not PG but its
record of reliability is pretty good.
I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
NULL or out of range). The problem is not occurring on a daily basis. The
database has been in service since December of last year. It's just that
the symptoms progressed from no apparent symptoms, to a clearly corrupt DB.
My guess is that some minor corruption fed upon itself until the DB couldn't
even be dumped.
Right, that's what a SIGSEGV is. And the usual reason for the bad value
in that pointer is bad memory. What do you base your guess of a self
multiplying corruption on? Or is this pure handwaving in self-defense?
Steps I'd take:
1. Check your version number against the release notes and see if you
should upgrade. You don't mention your version, but it's always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Set up Slony or some other replication so I can schedule my downtime.
I thought I mentioned the level in my original mail - 7.4.1. We are
planning on running some diagnostics.
So you are running a release that has had 4 official bugfix releases from
the vendor, on hardware that is in an unknown condition? Is the server at
least configured with ECC RAM, or is the data not important enough to
justify quality hardware?
Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
doesn't really matter to the original question. The database can become
corrupt. How can I tell that a database is fully intact at any given point
in time? If I reload from a system backup before the known corruption, how
can I be sure that the original corruption that precipitated the failure is
not still there and will again rear its ugly head?
Dump and restore. You don't need to restore onto the same server. Any
test system with enough disk space would do.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
On 9/4/04 5:53 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
[snip]
Well, I had a big long response composed to your snide remarks, but decided
I'm not getting in a flame war with you on something that is irrelevant to
the question I posed.
Is the server at
least configured with ECC RAM, or is the data not important enough to
justify quality hardware?
As a matter of fact it does have ECC, and no errors have been reported by
the system or diagnostics. It's a Dell 6650. No disk errors have been
reported either (hardware RAID controller).
As it pertains to the question at hand, I don't care what caused the
corruption. It's totally irrelevant to the question. Perfect hardware and
perfect software don't exist. There will be hardware failures. There will
be software failures. The question was whether or not there is a feasible
way of determining at any given point in time that a database is fully
intact. We are now doing weekly pg_dumpalls. Doing a restore of
each of those simply isn't reasonable. If there is no such integrity
utility to scan the DB in place, then that's the way it is. That's the
answer to my question. But quit trying to blame crappy hardware.
Wes
On 9/4/2004 2:26 PM, Wes wrote:
On 9/4/04 5:53 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
intact. We are now doing weekly pg_dumpalls. Doing a restore of
each of those simply isn't reasonable. If there is no such integrity
Why isn't doing a restore of those reasonable?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
Why isn't doing a restore of those reasonable?
Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
etc). That is going to do nothing but increase. Extrapolating linearly from
the *current* load, it will take at least 4 days to load when the database size
peaks. But, based on past experience, the load will increase significantly
before then (probably by a factor of 2-4 to be conservative). When the
database gets that large, I have to consider that the reload time may not
increase linearly with the size. If we do a pg_dumpall once a week, it will
take longer to do a reload than the period between dumps. Just the pg_dump
alone could easily take close to a day. It also requires we have duplicate
fully configured hardware for each copy of the database we run just to
verify a pg_dumpall - if it takes that long to load, I can't share hardware.
Add to that the people time to monitor the systems and the cost of 2x the
hardware... In short, I need a solution that scales to huge databases
(hundreds of gigabytes to over a terabyte), not one that works just for
small to medium databases.
pg_dumpall is hopefully reliable and will presumably give me a snapshot that I
know I can restore from if the database becomes hopelessly corrupt. But I
can't individually test each one. I was hoping for a utility that would go
through and verify all indexes and data are consistent, and if not, attempt
to correct them.
As for your earlier question of cascading errors, consider a file system - a
type of database. If you get a file system error and correct it quickly,
you usually will lose nothing. If, however, you ignore that error, it is
likely to get worse over days or weeks. Other errors will crop up as a
result of the bad information in the first one. At some point, the file
system corruption may become so bad that it can't be recovered. Format and
reload. I have seen this on NTFS, UFS, HFS/HFS+, and even ReiserFS.
Journaling greatly reduces, but doesn't eliminate, this problem. There are
tools that will scan your file system and guarantee its integrity, or fix
the errors (or attempt to fix them) if they find any. I was looking for
something similar for a Postgres database.
Wes
Hi,
On Sat, 04.09.2004 at 22:51, Wes wrote:
On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
Why isn't doing a restore of those reasonable?
Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
etc). That is going to do nothing but increase. Extrapolating linearly from
the *current* load, it will take at least 4 days to load when the database size
peaks.
...
As for your earlier question of cascading errors, consider a file system - a
type of database. If you get a file system error and correct it quickly,
you usually will lose nothing. If, however, you ignore that error, it is
likely to get worse over days or weeks. Other errors will crop up as a
result of the bad information in the first one. At some point, the file
system corruption may become so bad that it can't be recovered. Format and
reload. I have seen this on NTFS, UFS, HFS/HFS+, and even ReiserFS.
Journaling greatly reduces, but doesn't eliminate, this problem. There are
tools that will scan your file system and guarantee its integrity, or fix
the errors (or attempt to fix them) if they find any. I was looking for
something similar for a Postgres database.
Well, with such a huge database you probably should consider
different backup strategies. A filesystem with snapshot
support (XFS?) could help: you can copy a state of the database
at any time, so you can back up the database cluster without
stopping the postmaster. Also, replication via Slony could be
an option.
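With LVM under the filesystem, for example, the pattern is roughly this
(volume and mount names invented; restoring such a snapshot makes
PostgreSQL replay WAL as after a crash, so data and xlog must sit on the
same snapshotted volume):

    # take an atomic snapshot of the volume holding the cluster
    lvcreate --snapshot --size 10G --name pgsnap /dev/vg0/pgdata
    mount -o ro /dev/vg0/pgsnap /mnt/pgsnap
    tar czf /backup/pgdata-$(date +%Y%m%d).tar.gz -C /mnt/pgsnap .
    umount /mnt/pgsnap
    lvremove -f /dev/vg0/pgsnap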
The best tool to verify the backup is probably the postmaster
itself. I really doubt any other program would be smaller and
faster :)
(Filesystems provide a tool because the actual filesystem code
is a kernel module)
Regards
Tino
On 9/4/04 5:28 PM, "Tino Wildenhain" <tino@wildenhain.de> wrote:
Well, with such a huge database you probably should consider
different backup strategies. A filesystem with snapshot
support (XFS?) could help: you can copy a state of the database
at any time, so you can back up the database cluster without
stopping the postmaster. Also, replication via Slony could be
an option.
Yes, we are looking into using file system snapshots. We are currently
using primarily file system backups (shut down the DB, back up the file
system). The problem we ran into was that we didn't have a specific point
in time where we knew with absolute certainty the backed up database was
good - snapshots would not help here.
I ended up starting with a recent backup, and working backwards until I
found one that wouldn't crash postmaster on a pg_dumpall. Rather than trust
that there was no corruption in that version (data blocks might be good, but
pg_dumpall doesn't test indexes), I did a pg_dumpall and reload.
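The probe itself is easy to script, with the caveat that the dump never
touches the indexes:

    # a dump that runs to completion has read every live heap tuple;
    # a backend crash partway through points at corrupt data blocks
    pg_dumpall > /dev/null && echo "dump OK" || echo "dump FAILED"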
The best tool to verify the backup is probably the postmaster
itself. I really doubt any other program would be smaller and
faster :)
Not really... Postmaster won't tell you if a structure is bad until it
stumbles on it and crashes (or politely reports an error). Just because
postmaster comes up doesn't mean your database is good. As far as I know,
there is no "verify database" command option on postmaster - postmaster
won't fsck your database.
Wes
Why isn't doing a restore of those reasonable?
Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
etc). That is going to do nothing but increase.
It seems to me then that you need a better primary solution.
--
Mike Nolan
On 9/4/2004 4:51 PM, Wes wrote:
On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
Why isn't doing a restore of those reasonable?
Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
24 hours to do what? The pg_dumpall, the restore or both? And what media
is used to save the dump? Same disk array, tape, separate disk array on
separate controller or separate archive server? What is the bottleneck?
Planning the backup solution out of the blue, I would have a separate
archive system. This system is connected via gigabit ethernet and has a
storage array large enough to hold at least 2 full dumps. It also has
the tape drive(s) to write off the dumps from the array for storage in
an alternate location. Preferably, I would have a little more time to
finish the still missing features in Slony-I and take the dumps off a
replica :-)
Since another purpose is to verify if a dump is consistent, the archive
system would have another simple stripe set that can hold the entire
database, plus a ramdisk holding the xlog.
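Getting the dump onto such a box is a one-liner over the gigabit link
(host name and path invented):

    # stream the dump across the wire; no local copy needed
    pg_dumpall | ssh archive-host "cat > /archive/prod-$(date +%Y%m%d).sql"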
PostgreSQL will most probably not have any kind of myisamchk or
whatever that utility is called. IIRC Red Hat has something that can check
offline for the integrity of a single heap. But I don't know if it is
still under development or if they realized that this is a dead end
because of the offline limitation.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
I was looking for
something similar for a Postgres database.
There was this:
<http://svana.org/kleptog/pgsql/pgfsck.html>
Never used it, can't comment on it, doesn't appear to have been updated for
7.4. But it might be a jumping-off point.
--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice
On 9/5/04 9:41 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote:
There was this:
<http://svana.org/kleptog/pgsql/pgfsck.html>
Never used it, can't comment on it, doesn't appear to have been updated for
7.4. But it might be a jumping-off point.
The author of this dropped me a line last night. It looks like the type of
thing I was looking for, but doesn't quite work with 7.4. It appears the
changes necessary are probably minor, but I haven't had a chance to dig into
it yet. It also does not walk the index trees, but that could be added. As
you say, it looks like it might be a good starting point.
Wes
Why isn't doing a restore of those reasonable?
Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
etc). That is going to do nothing but increase.
[snip]
I understand you restore to a test server... Can you configure your test
server with no fsync? pg will have only one connection, so you can bump
sort_mem and the like. Will the times be reasonable then?
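Concretely, something like this in the test server's postgresql.conf (7.4
parameter names; the values are guesses you'd scale to the machine's RAM):

    # throwaway restore box only - never production
    fsync = false               # skip WAL flushes; crash safety irrelevant here
    sort_mem = 262144           # kB per sort; big index builds go much faster
    checkpoint_segments = 32    # fewer checkpoints during the bulk load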
Doesn't a successful pg_dumpall prove the database is intact?
Doesn't a successful pg_dumpall prove the database is intact?
It is very difficult to prove a negative, that there are no errors.
What a successful pg_dumpall says is that pg_dumpall didn't find any
errors, not that there were none present.
Also, I thought part of the question was how to determine if the output
from the pg_dumpall was itself intact without doing a complete restore.
--
Mike Nolan
Pardon me for this, but it's Monday morning here and I'm in shallow
thinking mode.
I was just looking over Wes's questions about checking database
integrity, and the responses given, and I'm wondering if keeping a live
CRC on each record would make sense? Or is this already done, or maybe
an option? (Not that it would in any way solve Wes's company's current
problem, ...)
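What I'm picturing, per table, is something like this - a sketch with
made-up table and column names, assuming PL/pgSQL is installed (md5() is
built in from 7.4 on; NULL columns would need coalesce()):

    -- store a checksum of the payload columns on every insert/update
    ALTER TABLE msg ADD COLUMN row_md5 char(32);
    CREATE OR REPLACE FUNCTION msg_md5() RETURNS trigger AS '
    BEGIN
        NEW.row_md5 := md5(NEW.subject || NEW.body);
        RETURN NEW;
    END;
    ' LANGUAGE plpgsql;
    CREATE TRIGGER msg_md5_t BEFORE INSERT OR UPDATE ON msg
        FOR EACH ROW EXECUTE PROCEDURE msg_md5();
    -- a periodic job could then recompute and flag damaged rows:
    SELECT count(*) FROM msg WHERE md5(subject || body) <> row_md5;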
--
Joel <rees@ddcom.co.jp>
On 9/5/04 9:04 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
24 hours to do what? The pg_dumpall, the restore or both?
There are more than 250 million rows. If I remember right, it's ballpark 25%
data reload, 75% index/foreign constraint rebuild. Pg_dumpall is something
like 3 hours or so.
Wes