Does PostgreSQL check database integrity at startup?
Recently I had a problem with a base file with size 0 in a standby server.
This raised one question: does PostgreSQL (9.6.6) check base integrity
at startup?
At least if there are 0 byte size files in base dir? Or CRC? Something?
Regards,
--
*Edson Carlos Ericksson Richter*
/SimKorp Ltda/
Fone: (51) 3366-7964
Embedded Image
/"A mente que se abre a uma nova ideia jamais voltará ao seu tamanho
original"/
- Albert Einstein
Attachments:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a standby server.
This raised one question: does PostgreSQL (9.6.6) check base integrity
at startup?At least if there are 0 byte size files in base dir? Or CRC? Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2017-12-26 14:44 GMT+01:00 Martin Marques <martin.marques@2ndquadrant.com>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a standby
server.
This raised one question: does PostgreSQL (9.6.6) check base integrity
at startup?At least if there are 0 byte size files in base dir? Or CRC? Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.
It has CRC check, but it is used in runtime - when data are necessary
So Postgres usually check nothing on start - few system tables and indexes
Regards
Pavel
Show quoted text
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Em 26/12/2017 12:25, Pavel Stehule escreveu:
2017-12-26 14:44 GMT+01:00 Martin Marques
<martin.marques@2ndquadrant.com <mailto:martin.marques@2ndquadrant.com>>:El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a
standby server.
This raised one question: does PostgreSQL (9.6.6) check base
integrity
at startup?
At least if there are 0 byte size files in base dir? Or CRC?
Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.It has CRC check, but it is used in runtime - when data are necessary
So Postgres usually check nothing on start - few system tables and indexes
Regards
Pavel
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before
effectively starting the database, what would be the recommendations?
One that seems obvious to me are empty data files (something like "find
-size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests
to check before startup.
Thanks,
Edson
2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <
richter@simkorp.com.br>:
Em 26/12/2017 12:25, Pavel Stehule escreveu:
2017-12-26 14:44 GMT+01:00 Martin Marques <martin.marques@2ndquadrant.com>
:El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a standby
server.
This raised one question: does PostgreSQL (9.6.6) check base integrity
at startup?At least if there are 0 byte size files in base dir? Or CRC? Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.It has CRC check, but it is used in runtime - when data are necessary
So Postgres usually check nothing on start - few system tables and indexes
Regards
Pavel
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & ServicesAny tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively
starting the database, what would be the recommendations?One that seems obvious to me are empty data files (something like "find
-size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to
check before startup.
I don't think so anybody does it. Reading 1TB database needs more then few
hours.
Regards
Show quoted text
Thanks,
Edson
Em 26/12/2017 13:40, Pavel Stehule escreveu:
2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter
<richter@simkorp.com.br <mailto:richter@simkorp.com.br>>:Em 26/12/2017 12:25, Pavel Stehule escreveu:
2017-12-26 14:44 GMT+01:00 Martin Marques
<martin.marques@2ndquadrant.com
<mailto:martin.marques@2ndquadrant.com>>:El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a
standby server.
This raised one question: does PostgreSQL (9.6.6) check
base integrity
at startup?
At least if there are 0 byte size files in base dir? Or
CRC? Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.It has CRC check, but it is used in runtime - when data are
necessarySo Postgres usually check nothing on start - few system tables
and indexesRegards
Pavel
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & ServicesAny tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before
effectively starting the database, what would be the recommendations?One that seems obvious to me are empty data files (something like
"find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more
tests to check before startup.I don't think so anybody does it. Reading 1TB database needs more then
few hours.Regards
Thanks,
Edson
I'm rebuilding the standby server for two days already, with 23% of
completion status...
If lost the database and backups because of that failure, it would be a
giant disaster.
Few hours checking integrity would be acceptable... Specially if I can
run it on standby only.
Regards,
Edson
2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter <
richter@simkorp.com.br>:
Em 26/12/2017 13:40, Pavel Stehule escreveu:
2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <
richter@simkorp.com.br>:Em 26/12/2017 12:25, Pavel Stehule escreveu:
2017-12-26 14:44 GMT+01:00 Martin Marques <martin.marques@2ndquadrant.com
:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
Recently I had a problem with a base file with size 0 in a standby
server.
This raised one question: does PostgreSQL (9.6.6) check base integrity
at startup?At least if there are 0 byte size files in base dir? Or CRC? Something?
Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.It has CRC check, but it is used in runtime - when data are necessary
So Postgres usually check nothing on start - few system tables and indexes
Regards
Pavel
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & ServicesAny tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively
starting the database, what would be the recommendations?One that seems obvious to me are empty data files (something like "find
-size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests
to check before startup.I don't think so anybody does it. Reading 1TB database needs more then few
hours.Regards
Thanks,
Edson
I'm rebuilding the standby server for two days already, with 23% of
completion status...
If lost the database and backups because of that failure, it would be a
giant disaster.
Few hours checking integrity would be acceptable... Specially if I can run
it on standby only.
very simple check
pgdumpall > /dev/null
but this doesn't check indexes.
Regards
Pavel
Show quoted text
Regards,
Edson
Em 26/12/2017 13:57, Pavel Stehule escreveu:
2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter
<richter@simkorp.com.br <mailto:richter@simkorp.com.br>>:Em 26/12/2017 13:40, Pavel Stehule escreveu:
2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter
<richter@simkorp.com.br <mailto:richter@simkorp.com.br>>:Em 26/12/2017 12:25, Pavel Stehule escreveu:
2017-12-26 14:44 GMT+01:00 Martin Marques
<martin.marques@2ndquadrant.com
<mailto:martin.marques@2ndquadrant.com>>:El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter
escribió:Recently I had a problem with a base file with size 0
in a standby server.
This raised one question: does PostgreSQL (9.6.6)
check base integrity
at startup?
At least if there are 0 byte size files in base dir?
Or CRC? Something?
Yes it has CRC check, but only if you initialize the
cluster with
--data-checksums, and there's a price to pay in performance.It has CRC check, but it is used in runtime - when data are
necessarySo Postgres usually check nothing on start - few system
tables and indexesRegards
Pavel
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & ServicesAny tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before
effectively starting the database, what would be the
recommendations?One that seems obvious to me are empty data files (something
like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have
more tests to check before startup.I don't think so anybody does it. Reading 1TB database needs more
then few hours.Regards
Thanks,
Edson
I'm rebuilding the standby server for two days already, with 23%
of completion status...
If lost the database and backups because of that failure, it would
be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I
can run it on standby only.very simple check
pgdumpall > /dev/null
but this doesn't check indexes.
Regards
Pavel
I'll check it and make some tests.
Thanks!
Edson
Edson Carlos Ericksson Richter wrote:
I'm rebuilding the standby server for two days already, with 23% of
completion status...
So how do you build your standbys, exactly? Maybe there's a bug in your
procedure, rather than a bug in the software.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Em 26/12/2017 15:13, Alvaro Herrera escreveu:
Edson Carlos Ericksson Richter wrote:
I'm rebuilding the standby server for two days already, with 23% of
completion status...So how do you build your standbys, exactly? Maybe there's a bug in your
procedure, rather than a bug in the software.
Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
Wiki.
Right now, I'm doing via pg_basebackup.
Nothing extraordinary.
When using rsync:
rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
/pgsql/9.6/master_data_folder/*
superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
postmaster.pid --exclude postgresql.conf --exclude pg_log
When using pg_basebackup:
/usr/pgsql-9.6/bin/pg_basebackup -h master_host -p master_port -D
/home/pgsql/9.6/master_data_folder -R -S slot_name -X stream -P -d "
user=superuser_name password=superuser_pass sslmode=require "
Of course, this is inside a bash script with variables at right places
to make script generic as needed to make it works with dozens of
databases...
Regards,
Edson
On 12/26/17 10:57 AM, Pavel Stehule wrote:
2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter
I'm rebuilding the standby server for two days already, with 23% of
completion status...
If lost the database and backups because of that failure, it would
be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I
can run it on standby only.very simple check
pgdumpall > /dev/null
but this doesn't check indexes.
pgBackRest will validate all page checksums (including indexes, etc.) in
the cluster during backup. Full backups check everything,
incr/differential backups check only the files that have changed.
--
-David
david@pgmasters.net
David Steele wrote:
pgBackRest will validate all page checksums (including indexes, etc.) in the
cluster during backup. Full backups check everything, incr/differential
backups check only the files that have changed.
If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right? I mean, there is
nothing externally indicating that the file ought to be of a different
size. Am I wrong? So Edson's situation here would not raise any red
flags.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello,
David Steele wrote:
pgBackRest will validate all page checksums (including indexes,
etc.) in the
cluster during backup. Full backups check everything,
incr/differential
backups check only the files that have changed.If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right? I mean, there is
nothing externally indicating that the file ought to be of a
different
size. Am I wrong? So Edson's situation here would not raise any red
flags.
Could the following occur:-
1) Your app. issues a BEGIN followed by an INSERT.
2) Postgres decides to open a new file in order to store the new row.
3) Your app. then does a ROLLBACK.
Wouldn't that leave you with a zero length file on disk?
There's no reason for Postgres to delete the file just because a
rollback was issued. All it has to do is clear the buffer in memory.
My 2 cents.
Rob
El 26/12/17 a las 14:46, Edson Carlos Ericksson Richter escribió:
Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
Wiki.
Right now, I'm doing via pg_basebackup.
Nothing extraordinary.When using rsync:
rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
/pgsql/9.6/master_data_folder/*
superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
postmaster.pid --exclude postgresql.conf --exclude pg_log
You don't say so anywhere, but I suspect you run the rsync between a
pg_start_backup() and pg_stop_backup(), right?
That is the way it's described in the wiki.
Regards,
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Em 26/12/2017 20:42, Martin Marques escreveu:
El 26/12/17 a las 14:46, Edson Carlos Ericksson Richter escribió:
Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
Wiki.
Right now, I'm doing via pg_basebackup.
Nothing extraordinary.When using rsync:
rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
/pgsql/9.6/master_data_folder/*
superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
postmaster.pid --exclude postgresql.conf --exclude pg_logYou don't say so anywhere, but I suspect you run the rsync between a
pg_start_backup() and pg_stop_backup(), right?That is the way it's described in the wiki.
Regards,
Yes, sure. When using wiki, I'm following all instructions. Process
revisited many times do check if something changed (since 9.0 days :-) )
Right now, I do prefer to use pg_basebackup instead - but sometimes
(when database is just too large), rsync seems more reliable (maybe it
is the cause of the problem).
Anyway, instead digging into rsync functionality (or bugs - I doubt, but
who knows?), I do prefer to have a script I can run to check if there is
obvious failures in standby servers.
Looking for empty files would be a start point.
I'm learning from experienced people from the list that are other points
I would like to check as well.
Regards,
Edson.
Em 26/12/2017 20:11, rob stone escreveu:
Hello,
On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello,
David Steele wrote:
pgBackRest will validate all page checksums (including indexes,
etc.) in the
cluster during backup. Full backups check everything,
incr/differential
backups check only the files that have changed.If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right? I mean, there is
nothing externally indicating that the file ought to be of a
different
size. Am I wrong? So Edson's situation here would not raise any red
flags.Could the following occur:-
1) Your app. issues a BEGIN followed by an INSERT.
2) Postgres decides to open a new file in order to store the new row.
3) Your app. then does a ROLLBACK.Wouldn't that leave you with a zero length file on disk?
There's no reason for Postgres to delete the file just because a
rollback was issued. All it has to do is clear the buffer in memory.My 2 cents.
Rob
You are right, in several databases I've looked there are many files
with 0 byte size in ./base folder.
This is not a good quest.
Regards,
Edson
Greetings,
* Edson Carlos Ericksson Richter (richter@simkorp.com.br) wrote:
Right now, I do prefer to use pg_basebackup instead - but sometimes
(when database is just too large), rsync seems more reliable (maybe
it is the cause of the problem).
I don't generally recommend rsync for various reasons, but if you make
sure to call pg_start/stop_backup(), keep track of all your WAL and
verify everything ends up written to disk then it should work. Don't
use it for incrementals though- there are cases where that can fail.
Anyway, instead digging into rsync functionality (or bugs - I doubt,
but who knows?), I do prefer to have a script I can run to check if
there is obvious failures in standby servers.
As mentioned, zero-byte files can be perfectly valid. PostgreSQL does
have page-level CRCs, if you initialized your database with them (which
I would strongly recommend). There are also backup tools which will
verify those checksums when performing a backup of the system. In
addition to that, you can do parallel backup and restore which can
reduce the downtime for doing restores quite a bit (though this will
depend on what you're bottleneck is, of course). I'd suggest you take a
look at pgBackRest, though I think that other solutions now also have
all of these features (though they all have their own features too).
Thanks!
Stephen
Stephen Frost wrote:
* Edson Carlos Ericksson Richter (richter@simkorp.com.br) wrote:
Anyway, instead digging into rsync functionality (or bugs - I doubt,
but who knows?), I do prefer to have a script I can run to check if
there is obvious failures in standby servers.As mentioned, zero-byte files can be perfectly valid. PostgreSQL does
have page-level CRCs, if you initialized your database with them (which
I would strongly recommend).
Page-level checksums would not detect the problem being complained in
this thread, however.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro,
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
Stephen Frost wrote:
* Edson Carlos Ericksson Richter (richter@simkorp.com.br) wrote:
Anyway, instead digging into rsync functionality (or bugs - I doubt,
but who knows?), I do prefer to have a script I can run to check if
there is obvious failures in standby servers.As mentioned, zero-byte files can be perfectly valid. PostgreSQL does
have page-level CRCs, if you initialized your database with them (which
I would strongly recommend).Page-level checksums would not detect the problem being complained in
this thread, however.
It's entirely unclear to me what the problem being complained about in
this thread actually is. The complaint so far was about zero-byte
files, but those are entirely valid, so that isn't a problem that anyone
can solve..
Given the thread subject, if someone actually wanted to do a database
integrity check before startup, they could use pgBackRest to perform a
backup with a CRC-enabled database and at least verify that all of the
checksums are valid.
We could possibly look into adding some set of additional checks for
files which can't actually be zero-byte, perhaps.. I know we have some
other one-off checks already.
Thanks!
Stephen