database corruption questions

Started by Heine Ferreira · 14 messages · general
#1 Heine Ferreira
heine.ferreira@gmail.com

Hi

Are there any best practices for avoiding database
corruption? I suppose the most obvious one is
to have a ups if it's a desktop machine.
How do you detect corruption in a Postgresql
database and are there any ways to fix it besides
restoring a backup?

Thanks

H.F.

#2 Leif B. Kristensen
leif@solumslekt.org
In reply to: Heine Ferreira (#1)
Re: database corruption questions

On Saturday 13 October 2012 at 23:53:03, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny (e.g. less than 3 years old), and go for some
kind of redundancy in a RAID configuration.

regards, Leif

#3 John R Pierce
pierce@hogranch.com
In reply to: Leif B. Kristensen (#2)
Re: database corruption questions

On 10/13/12 3:04 PM, Leif Biberg Kristensen wrote:

On Saturday 13 October 2012 at 23:53:03, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny (e.g. less than 3 years old), and go for some
kind of redundancy in a RAID configuration.

Also, use ECC RAM so creeping bit rot doesn't slip in from memory without
detection. If you use a RAID controller with a write-back cache, be
sure it has a BBU or flash-backed protection.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast

#4 Craig Ringer
craig@2ndquadrant.com
In reply to: Heine Ferreira (#1)
Re: database corruption questions

On 10/14/2012 05:53 AM, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then one
a month for the rest of the year, then one a year.
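As a rough sketch of the ageing logic (the retention scheme here is just one way to cut it, and the dates stand in for dump files named by date), the keep/discard decision can be expressed in a few lines of Python:

```python
from datetime import date, timedelta

def backups_to_keep(dates, today):
    """Given the dates of existing nightly backups, return the set to retain
    under a keep-7-daily / 4-weekly / 12-monthly / yearly rotation."""
    keep = set()
    for d in sorted(dates, reverse=True):
        age = (today - d).days
        if age < 7:                          # last 7 days: keep every daily
            keep.add(d)
        elif age < 28 and d.weekday() == 6:  # last 4 weeks: keep Sundays
            keep.add(d)
        elif age < 365 and d.day == 1:       # rest of the year: keep the 1st
            keep.add(d)
        elif d.day == 1 and d.month == 1:    # older than a year: keep Jan 1st
            keep.add(d)
    return keep

today = date(2012, 10, 14)
dates = [today - timedelta(days=n) for n in range(400)]
kept = backups_to_keep(dates, today)
print(len(kept), "of", len(dates), "backups retained")
```

In practice you'd map the kept set back onto the dump files and delete the rest.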

* Use warm standby with log shipping and/or replication to maintain a
live copy of the DB.

* If you want point-in-time recovery, keep a few days or weeks worth of
WAL archives and a basebackup around. That'll help you recover from
those "oops I meant DROP TABLE unimportant; not DROP TABLE
vital_financial_records;" issues.
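For reference, WAL archiving is driven by a couple of postgresql.conf settings; a minimal sketch for the 9.x series (the archive path is just an example):

```
wal_level = archive        # or 'hot_standby' if a standby also uses the WAL
archive_mode = on
archive_command = 'test ! -f /mnt/server/archive/%f && cp %p /mnt/server/archive/%f'
```

pg_basebackup (or a manual base backup) provides the starting copy that the archived WAL is replayed on top of.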

* Keep up to date with the latest PostgreSQL patch releases. Don't be
one of those people still running 9.0.0 when 9.0.10 is out.

* Plug-pull test your system when you're testing it before going live.
Put it under load with something like pgbench, then literally pull the
plug out. If your database doesn't come back up fine you have hardware,
OS or configuration problems.

* Don't `kill -9` the postmaster. It should be fine, but it's still not
smart.

* ABSOLUTELY NEVER DELETE postmaster.pid

* Use good quality hardware with proper cooling and a good quality power
supply. If possible, ECC RAM is a nice extra.

* Never, ever, ever use cheap SSDs. Use good quality hard drives or
(after proper testing) high end SSDs. Read the SSD reviews periodically
posted on this mailing list if considering using SSDs. Make sure the SSD
has a supercapacitor or other reliable option for flushing its write
cache on power loss. Always do repeated plug-pull testing when using SSDs.

* Use a solid, reliable file system. zfs-on-linux, btrfs, etc are not
the right choices for a database you care about. Never, ever, ever use
FAT32.

* If on Windows, do not run an anti-virus program on your
database server. Nobody should be using it for other things or running
programs on it anyway.

* Avoid RAID 5, mostly because the performance is terrible, but also
because I've seen corruption issues with rebuilds from parity on failing
disks.

* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.

* If you're going to have a UPS (you shouldn't need one, as your system
should be crash-safe), don't waste your money on a cheap one. Get a good
online double-conversion unit that does proper power filtering. Cheap
UPSs are just a battery with a fast switch; they provide no power
filtering, and what little surge protection they offer is done with a
component that wears out after absorbing a few surges, becoming totally
ineffective. Since your system should be crash-safe, a cheap UPS will do
nothing for corruption protection; it'll only help with uptime.

--
Craig Ringer

#5 John R Pierce
pierce@hogranch.com
In reply to: Craig Ringer (#4)
Re: database corruption questions

On 10/13/12 7:13 PM, Craig Ringer wrote:

* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.

A comment on this one... I have some test servers with lots of SAS
and/or SATA drives on controllers like the LSI Logic 9261-8i, with 512MB
or 1GB of battery-backed cache. I can configure the controller for JBOD
and use Linux mdraid RAID 10 and get the same performance as the
controller's native RAID 10, as long as the write-back cache is
enabled. Disable the write-back cache, and you might as well be using
SATA JBOD.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast

#6 Craig Ringer
craig@2ndquadrant.com
In reply to: John R Pierce (#5)
Re: database corruption questions

On 10/14/2012 11:00 AM, John R Pierce wrote:

On 10/13/12 7:13 PM, Craig Ringer wrote:

* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.

a comment on this one.... I have some test servers with lots of SAS
and/or SATA drives on controllers like LSI Logic 9261-8i, with 512MB or
1GB battery-backed cache. I can configure the controller for JBOD
and use linux mdraid raid10 and get the same performance as the
controllers native raid10, as long as the write-back cache is
enabled. disable the writeback cache, and you might as well be using
SATA JBOD.

Yeah, without the write-back cache you don't gain much. I run a couple
of DBs on plain old `md` RAID and I'm actually quite happy with it.

I've expanded this into a blog post and improved that section there.

http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Comments appreciated.

--
Craig Ringer

#7 Craig Ringer
craig@2ndquadrant.com
In reply to: Heine Ferreira (#1)
Re: database corruption questions

On 10/14/2012 05:53 AM, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

I forgot to mention, you should also read:

http://www.postgresql.org/docs/current/static/wal-reliability.html

--
Craig Ringer

#8 Chris Angelico
rosuav@gmail.com
In reply to: Craig Ringer (#4)
Re: database corruption questions

On Sun, Oct 14, 2012 at 1:13 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:

* Never, ever, ever use cheap SSDs. Use good quality hard drives or (after
proper testing) high end SSDs. Read the SSD reviews periodically posted on
this mailing list if considering using SSDs. Make sure the SSD has a
supercapacitor or other reliable option for flushing its write cache on
power loss. Always do repeated plug-pull testing when using SSDs.

Interesting. My boss just bought a set of SSDs for some test systems,
with the intention of using them for our next deployment. They're
giving really great performance under pgbench, but we haven't yet done
a plug-pull test on any of them. I'll make sure I do that next week.

Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as "fire up pgbench, kill the power, bring things
back up, and see if anything isn't working"?

ChrisA

#9 Craig Ringer
craig@2ndquadrant.com
In reply to: Chris Angelico (#8)
Re: database corruption questions

On 10/14/2012 12:02 PM, Chris Angelico wrote:

Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as "fire up pgbench, kill the power, bring things
back up, and see if anything isn't working"?

That's what I'd do and what I've always done in the past, but others
here are much more experienced with testing gear into production.

You can also use pg_test_fsync and diskchecker.pl . See:

http://www.postgresql.org/docs/current/static/wal-reliability.html

I do repeated plug-pull tests and make sure fsync is being honoured.
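As a very rough cross-check (no substitute for diskchecker.pl, which tests behaviour across an actual power cut), you can at least confirm that fsync takes a plausible amount of time on your storage; near-zero latencies usually mean a volatile cache is absorbing the flush. A quick Python sketch:

```python
import os
import tempfile
import time

def timed_fsync_writes(path, n=200):
    """Write and fsync n small blocks, returning per-write latencies in ms.
    Suspiciously fast times (well under ~1 ms on a spinning disk with no
    BBU cache) suggest fsync isn't reaching durable media."""
    latencies = []
    with open(path, "wb") as f:
        for _ in range(n):
            f.write(b"x" * 8192)
            start = time.perf_counter()
            f.flush()
            os.fsync(f.fileno())
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    target = tmp.name
lat = timed_fsync_writes(target, n=50)
os.unlink(target)
print("median fsync latency: %.3f ms" % sorted(lat)[len(lat) // 2])
```

Run it against the same filesystem the data directory lives on, not /tmp.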

--
Craig Ringer

#10 Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#6)
Re: database corruption questions

On Sun, Oct 14, 2012 at 11:26:40AM +0800, Craig Ringer wrote:

Yeah, without the write-back cache you don't gain much. I run a
couple of DBs on plain old `md` RAID and I'm actually quite happy
with it.

I've expanded this into a blog post and improved that section there.

http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Craig, that is a great post. Can you get it on Planet Postgres?

http://planet.postgresql.org/

I think you would have to submit your blog's RSS feed.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#11 Daniel Serodio
dserodio@mandic.com.br
In reply to: Craig Ringer (#4)
Re: database corruption questions

Craig Ringer wrote:

On 10/14/2012 05:53 AM, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year, then one a year.

What kind of "rolling backups"? From pg_basebackup?

* Use warm standby with log shipping and/or replication to maintain a
live copy of the DB.

* If you want point-in-time recovery, keep a few days or weeks worth
of WAL archives and a basebackup around. That'll help you recover from
those "oops I meant DROP TABLE unimportant; not DROP TABLE
vital_financial_records;" issues.

* Keep up to date with the latest PostgreSQL patch releases. Don't be
one of those people still running 9.0.0 when 9.0.10 is out.

The problem is that updating the database usually results in downtime.
Or can the downtime be avoided in a replication scenario?

Thanks,
Daniel Serodio

#13 Craig Ringer
craig@2ndquadrant.com
In reply to: Daniel Serodio (#11)
Re: database corruption questions

On 10/18/2012 01:06 AM, Daniel Serodio wrote:

Craig Ringer wrote:

On 10/14/2012 05:53 AM, Heine Ferreira wrote:

Hi

Are there any best practices for avoiding database
corruption?

* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year, then one a year.

What kind of "rolling backups"? From pg_basebackup?

I'd recommend good old `pg_dump`; that way you're not assuming that your
cluster's on-disk format is intact and happy. Regular dumps will also
help detect any damage that might've crept in from file system
corruption, HDD/RAID faults, etc. Not that that should happen, but we're
talking preventative action here.
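For example, a nightly dump can be as simple as a cron entry like this (database name and path are made up; note the `\%` escaping that cron requires, and `%u` rotating through seven day-of-week files):

```
0 2 * * * pg_dump -Fc -f /var/backups/pg/mydb-$(date +\%u).dump mydb
```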

I elaborated somewhat here:

http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Alternately, check out barman: http://www.pgbarman.org/ . I haven't
tried it yet, but it looks very promising. I'd still want to take
periodic dumps, as I'm reluctant to rely on `pg_basebackup` style
cluster copies alone.

* Keep up to date with the latest PostgreSQL patch releases. Don't be one of those people still running 9.0.0 when 9.0.10 is out.

The problem is that updating the database usually results in downtime. Or can the downtime be avoided in a replication scenario?

Patch updates within the same major release (e.g. 9.0.9 to 9.0.10) require extremely minimal downtime.

- Stop server
- Install new binaries
- Start server

How much downtime will a data corruption bug (yes, they've happened)
that's fixed in a new version cost you if you don't patch and it bites
you? Or a bug that causes a server crash and restart? Plan downtime, so
you don't have potentially much longer unplanned downtime at the worst
possible time.

You can do zero-downtime minor updates using hot standby and standby
promotion; see http://www.repmgr.org/ .

Updating to a new major release is a bigger job, but that's not what I'm
talking about.

BTW, please trim your replies to quote just the relevant context.

--
Craig Ringer

#14 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Craig Ringer (#13)
Re: database corruption questions

Heine Ferreira wrote:

Are there any best practices for avoiding database corruption?

First and foremost, do not turn off fsync or full_page_writes in your
configuration. After that the most common causes for database
corruption I've seen are bad RAM (ECC RAM is a requirement, not an
option for a serious database server in my book), failing hard drives
(use RAID and SMART monitoring, and have a sane hardware replacement
policy), or buggy device drivers (pay attention to releases which fix
bugs or security vulnerabilities which could affect you). It's
getting rare to see this now, but it used to be common for some
drives to lie to you about having written data when it was actually
just in cache -- a good BBU RAID controller will go out of its way to
keep the drives honest.
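Concretely, that means leaving these postgresql.conf settings at their defaults:

```
fsync = on               # actually flush WAL to durable storage
full_page_writes = on    # guard against torn pages after a crash
```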

-Kevin