TAP tests for pg_verify_checksums

Started by Michael Paquierover 7 years ago7 messages
#1Michael Paquier
michael@paquier.xyz
1 attachment(s)

Hi all,

The topic of $subject has been discussed a bit times, resulting in a
couple of patches on the way:
/messages/by-id/20180830200258.GG15446@paquier.xyz
/messages/by-id/CABUevEzEKrwPeGK2o-OV4z2MjfT6Cu8cLfA-v1S1j4z8x7WJjg@mail.gmail.com

However nothing has actually happened. Based on the previous feedback,
attached is an updated patch to do the actual job.

Magnus Hagander and Michael Banck need to be credited as authors, as
this patch is roughly a merge of what they proposed.

Thoughts?
--
Michael

Attachments:

verify-checksums-tap-v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 5dc629fd5e..610887e5d4 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -8,7 +8,7 @@ use Fcntl ':mode';
 use File::stat qw{lstat};
 use PostgresNode;
 use TestLib;
-use Test::More tests => 18;
+use Test::More tests => 21;
 
 my $tempdir = TestLib::tempdir;
 my $xlogdir = "$tempdir/pgxlog";
@@ -58,6 +58,12 @@ mkdir $datadir;
 			"check PGDATA permissions");
 	}
 }
+
+# Control file should tell that data checksums are disabled by default.
+command_like(['pg_controldata', $datadir],
+			 qr/Data page checksum version:.*0/,
+			 'checksums are disabled in control file');
+
 command_ok([ 'initdb', '-S', $datadir ], 'sync only');
 command_fails([ 'initdb', $datadir ], 'existing data directory');
 
diff --git a/src/bin/pg_verify_checksums/.gitignore b/src/bin/pg_verify_checksums/.gitignore
index d1dcdaf0dd..0e5e569a54 100644
--- a/src/bin/pg_verify_checksums/.gitignore
+++ b/src/bin/pg_verify_checksums/.gitignore
@@ -1 +1,3 @@
 /pg_verify_checksums
+
+/tmp_check/
diff --git a/src/bin/pg_verify_checksums/Makefile b/src/bin/pg_verify_checksums/Makefile
index d16261571f..cfe4ab1b8b 100644
--- a/src/bin/pg_verify_checksums/Makefile
+++ b/src/bin/pg_verify_checksums/Makefile
@@ -34,3 +34,9 @@ uninstall:
 clean distclean maintainer-clean:
 	rm -f pg_verify_checksums$(X) $(OBJS)
 	rm -rf tmp_check
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_verify_checksums/t/001_basic.pl b/src/bin/pg_verify_checksums/t/001_basic.pl
new file mode 100644
index 0000000000..1fa2e12db2
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/001_basic.pl
@@ -0,0 +1,8 @@
+use strict;
+use warnings;
+use TestLib;
+use Test::More tests => 8;
+
+program_help_ok('pg_verify_checksums');
+program_version_ok('pg_verify_checksums');
+program_options_handling_ok('pg_verify_checksums');
diff --git a/src/bin/pg_verify_checksums/t/002_actions.pl b/src/bin/pg_verify_checksums/t/002_actions.pl
new file mode 100644
index 0000000000..7423595181
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/002_actions.pl
@@ -0,0 +1,52 @@
+# Do basic sanity checks supported by pg_verify_checksums using
+# an initialized cluster.
+
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+# Initialize node with checksums enabled.
+my $node = get_new_node('node_checksum');
+$node->init(extra => ['--data-checksums']);
+my $pgdata = $node->data_dir;
+
+# Control file should know that checksums are disabled.
+command_like(['pg_controldata', $pgdata],
+	     qr/Data page checksum version:.*1/,
+		 'checksums enabled in control file');
+
+# Checksums pass on a newly-created cluster
+command_ok(['pg_verify_checksums',  '-D', $pgdata],
+		   "checksum checks done and passing");
+
+# Checks cannot happen for an online cluster
+$node->start;
+command_fails(['pg_verify_checksums',  '-D', $pgdata],
+			  "checksum checks not done");
+
+# Create table to corrupt and get its relfilenode
+$node->safe_psql('postgres',
+	"SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a;
+	ALTER TABLE corrupt1 SET (autovacuum_enabled=false);");
+
+my $file_corrupted = $node->safe_psql('postgres',
+	"SELECT pg_relation_filepath('corrupt1')");
+
+# Set page header and block size
+my $pageheader_size = 24;
+my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
+$node->stop;
+
+# Time to create a corruption
+open my $file, '+<', "$pgdata/$file_corrupted";
+seek($file, $pageheader_size, 0);
+syswrite($file, '\0\0\0\0\0\0\0\0\0');
+close $file;
+
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata],
+						  1,
+						  [qr/Bad checksums:  1/s],
+						  [qr/checksum verification failed/s],
+						  'pg_checksums reports checksum mismatch');
#2Michael Banck
michael.banck@credativ.de
In reply to: Michael Paquier (#1)
Re: TAP tests for pg_verify_checksums

Hi,

On Fri, Oct 05, 2018 at 10:26:45AM +0900, Michael Paquier wrote:

The topic of $subject has been discussed a bit times, resulting in a
couple of patches on the way:
/messages/by-id/20180830200258.GG15446@paquier.xyz
/messages/by-id/CABUevEzEKrwPeGK2o-OV4z2MjfT6Cu8cLfA-v1S1j4z8x7WJjg@mail.gmail.com

However nothing has actually happened. Based on the previous feedback,
attached is an updated patch to do the actual job.

Thanks!

It's too late for v11 though at this point I guess?

I think it would be easy to also test the -r command-line option, as we
already create a table. Something like

|my $relfilenode_corrupted = $node->safe_psql('postgres',
| "SELECT relfilenode FROM pg_class WHERE relname = 'corrupt1'");

[...]

|# Checksums pass solely on that table
|command_ok(['pg_verify_checksums', '-D', $pgdata, '-r', $relfilenode_corrupted],
| "checksum checks for table relfilenode done and passing");

While making sure the $node->stop is between the two.

One nitpick:

diff --git a/src/bin/pg_verify_checksums/t/002_actions.pl b/src/bin/pg_verify_checksums/t/002_actions.pl
new file mode 100644
index 0000000000..7423595181
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/002_actions.pl

[...]

+# Control file should know that checksums are disabled.
+command_like(['pg_controldata', $pgdata],
+	     qr/Data page checksum version:.*1/,
+		 'checksums enabled in control file');

That comment should read 'that checksums are enabled', right?

Otherwise, LGTM and I've tested it without finding any problems.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB M�nchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 M�nchengladbach
Gesch�ftsf�hrung: Dr. Michael Meskes, J�rg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Banck (#2)
1 attachment(s)
Re: TAP tests for pg_verify_checksums

On Fri, Oct 05, 2018 at 01:38:05PM +0200, Michael Banck wrote:

It's too late for v11 though at this point I guess?

Unfortunately yes.

I think it would be easy to also test the -r command-line option, as we
already create a table.

Good idea. Let's add this test.

That comment should read 'that checksums are enabled', right?

Indeed. Fixed.

Otherwise, LGTM and I've tested it without finding any problems.

What do you think about the updated version attached?
--
Michael

Attachments:

verify-checksums-tap-v3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 5dc629fd5e..610887e5d4 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -8,7 +8,7 @@ use Fcntl ':mode';
 use File::stat qw{lstat};
 use PostgresNode;
 use TestLib;
-use Test::More tests => 18;
+use Test::More tests => 21;
 
 my $tempdir = TestLib::tempdir;
 my $xlogdir = "$tempdir/pgxlog";
@@ -58,6 +58,12 @@ mkdir $datadir;
 			"check PGDATA permissions");
 	}
 }
+
+# Control file should tell that data checksums are disabled by default.
+command_like(['pg_controldata', $datadir],
+			 qr/Data page checksum version:.*0/,
+			 'checksums are disabled in control file');
+
 command_ok([ 'initdb', '-S', $datadir ], 'sync only');
 command_fails([ 'initdb', $datadir ], 'existing data directory');
 
diff --git a/src/bin/pg_verify_checksums/.gitignore b/src/bin/pg_verify_checksums/.gitignore
index d1dcdaf0dd..0e5e569a54 100644
--- a/src/bin/pg_verify_checksums/.gitignore
+++ b/src/bin/pg_verify_checksums/.gitignore
@@ -1 +1,3 @@
 /pg_verify_checksums
+
+/tmp_check/
diff --git a/src/bin/pg_verify_checksums/Makefile b/src/bin/pg_verify_checksums/Makefile
index d16261571f..cfe4ab1b8b 100644
--- a/src/bin/pg_verify_checksums/Makefile
+++ b/src/bin/pg_verify_checksums/Makefile
@@ -34,3 +34,9 @@ uninstall:
 clean distclean maintainer-clean:
 	rm -f pg_verify_checksums$(X) $(OBJS)
 	rm -rf tmp_check
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_verify_checksums/t/001_basic.pl b/src/bin/pg_verify_checksums/t/001_basic.pl
new file mode 100644
index 0000000000..1fa2e12db2
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/001_basic.pl
@@ -0,0 +1,8 @@
+use strict;
+use warnings;
+use TestLib;
+use Test::More tests => 8;
+
+program_help_ok('pg_verify_checksums');
+program_version_ok('pg_verify_checksums');
+program_options_handling_ok('pg_verify_checksums');
diff --git a/src/bin/pg_verify_checksums/t/002_actions.pl b/src/bin/pg_verify_checksums/t/002_actions.pl
new file mode 100644
index 0000000000..2c329bbf8b
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/002_actions.pl
@@ -0,0 +1,69 @@
+# Do basic sanity checks supported by pg_verify_checksums using
+# an initialized cluster.
+
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 12;
+
+# Initialize node with checksums enabled.
+my $node = get_new_node('node_checksum');
+$node->init(extra => ['--data-checksums']);
+my $pgdata = $node->data_dir;
+
+# Control file should know that checksums are enabled.
+command_like(['pg_controldata', $pgdata],
+	     qr/Data page checksum version:.*1/,
+		 'checksums enabled in control file');
+
+# Checksums pass on a newly-created cluster
+command_ok(['pg_verify_checksums',  '-D', $pgdata],
+		   "checksum checks done and passing");
+
+# Checks cannot happen for an online cluster
+$node->start;
+command_fails(['pg_verify_checksums',  '-D', $pgdata],
+			  "checksum checks not done");
+
+# Create table to corrupt and get its relfilenode
+$node->safe_psql('postgres',
+	"SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a;
+	ALTER TABLE corrupt1 SET (autovacuum_enabled=false);");
+
+my $file_corrupted = $node->safe_psql('postgres',
+	"SELECT pg_relation_filepath('corrupt1')");
+my $relfilenode_corrupted =  $node->safe_psql('postgres',
+	"SELECT relfilenode FROM pg_class WHERE relname = 'corrupt1';");
+
+# Set page header and block size
+my $pageheader_size = 24;
+my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
+$node->stop;
+
+# Checksums pass for single relfilenode as the table is not corrupted
+# yet.
+command_ok(['pg_verify_checksums',  '-D', $pgdata,
+	'-r', $relfilenode_corrupted],
+	"checksum checks for table relfilenode done and passing");
+
+# Time to create a corruption
+open my $file, '+<', "$pgdata/$file_corrupted";
+seek($file, $pageheader_size, 0);
+syswrite($file, '\0\0\0\0\0\0\0\0\0');
+close $file;
+
+# Checksum checks on single relfilenode fail
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata, '-r',
+							$relfilenode_corrupted],
+						  1,
+						  [qr/Bad checksums:.*1/],
+						  [qr/checksum verification failed/],
+						  'pg_checksums reports checksum mismatch');
+
+# Global checksum checks fail
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata],
+						  1,
+						  [qr/Bad checksums:.*1/],
+						  [qr/checksum verification failed/],
+						  'pg_checksums reports checksum mismatch');
#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#3)
Re: TAP tests for pg_verify_checksums

On 06/10/2018 13:46, Michael Paquier wrote:

On Fri, Oct 05, 2018 at 01:38:05PM +0200, Michael Banck wrote:

It's too late for v11 though at this point I guess?

Unfortunately yes.

I think it would be easy to also test the -r command-line option, as we
already create a table.

Good idea. Let's add this test.

That comment should read 'that checksums are enabled', right?

Indeed. Fixed.

Otherwise, LGTM and I've tested it without finding any problems.

What do you think about the updated version attached?

Looks pretty good to me.

Let's make sure the test names are useful:

+# Checks cannot happen for an online cluster
+$node->start;
+command_fails(['pg_verify_checksums',  '-D', $pgdata],
+			  "checksum checks not done");

The test name should be something like "fails with online cluster".

I would also like to see a test that runs against a cluster without
checksums enabled.

Other than that, it appears to cover everything.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Michael Banck
michael.banck@credativ.de
In reply to: Peter Eisentraut (#4)
Re: TAP tests for pg_verify_checksums

Hi,

Am Dienstag, den 09.10.2018, 16:54 +0200 schrieb Peter Eisentraut:

On 06/10/2018 13:46, Michael Paquier wrote:

What do you think about the updated version attached?

+# Time to create a corruption

That looks a bit weird, maybe "some corupption"? Or maybe it's just me
not being a native speaker.

Other than that, it appears to cover everything.

One more thing we could check is the relfilenode after we corrupted it,
it should also catch the corruption then. Or is that too trivial?

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

#6Michael Paquier
michael@paquier.xyz
In reply to: Michael Banck (#5)
1 attachment(s)
Re: TAP tests for pg_verify_checksums

On Tue, Oct 09, 2018 at 05:14:50PM +0200, Michael Banck wrote:

Am Dienstag, den 09.10.2018, 16:54 +0200 schrieb Peter Eisentraut:

On 06/10/2018 13:46, Michael Paquier wrote:

+# Time to create a corruption

That looks a bit weird, maybe "some corruption"? Or maybe it's just me
not being a native speaker.

Okay, let's do with your suggestion.

I would also like to see a test that runs against a cluster without
checksums enabled.

OK. I have added a test within initdb to save one initdb run.

+# Checks cannot happen for an online cluster
+$node->start;
+command_fails(['pg_verify_checksums',  '-D', $pgdata],
+                         "checksum checks not done");

The test name should be something like "fails with online cluster".

Done. I have put more thoughts into those.

One more thing we could check is the relfilenode after we corrupted it,
it should also catch the corruption then. Or is that too trivial?

There is one as of v3:
+# Checksum checks on single relfilenode fail
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata, '-r',
+                           $relfilenode_corrupted],
+                         1,
+                         [qr/Bad checksums:.*1/],
+                         [qr/checksum verification failed/],
+                         '');

The resulting patch is attached. Does that look good?
--
Michael

Attachments:

verify-checksums-tap-v4.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 5dc629fd5e..759779adb2 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -8,7 +8,7 @@ use Fcntl ':mode';
 use File::stat qw{lstat};
 use PostgresNode;
 use TestLib;
-use Test::More tests => 18;
+use Test::More tests => 22;
 
 my $tempdir = TestLib::tempdir;
 my $xlogdir = "$tempdir/pgxlog";
@@ -58,6 +58,18 @@ mkdir $datadir;
 			"check PGDATA permissions");
 	}
 }
+
+# Control file should tell that data checksums are disabled by default.
+command_like(['pg_controldata', $datadir],
+			 qr/Data page checksum version:.*0/,
+			 'checksums are disabled in control file');
+# pg_verify_checksums fails with checksums disabled by default.  This is
+# not part of the tests included in pg_verify_checksums to save from
+# the creation of an extra instance.
+command_fails(
+	[ 'pg_verify_checksums', '-D', $datadir],
+	"pg_verify_checksums fails with data checksum disabled");
+
 command_ok([ 'initdb', '-S', $datadir ], 'sync only');
 command_fails([ 'initdb', $datadir ], 'existing data directory');
 
diff --git a/src/bin/pg_verify_checksums/.gitignore b/src/bin/pg_verify_checksums/.gitignore
index d1dcdaf0dd..0e5e569a54 100644
--- a/src/bin/pg_verify_checksums/.gitignore
+++ b/src/bin/pg_verify_checksums/.gitignore
@@ -1 +1,3 @@
 /pg_verify_checksums
+
+/tmp_check/
diff --git a/src/bin/pg_verify_checksums/Makefile b/src/bin/pg_verify_checksums/Makefile
index d16261571f..cfe4ab1b8b 100644
--- a/src/bin/pg_verify_checksums/Makefile
+++ b/src/bin/pg_verify_checksums/Makefile
@@ -34,3 +34,9 @@ uninstall:
 clean distclean maintainer-clean:
 	rm -f pg_verify_checksums$(X) $(OBJS)
 	rm -rf tmp_check
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_verify_checksums/t/001_basic.pl b/src/bin/pg_verify_checksums/t/001_basic.pl
new file mode 100644
index 0000000000..1fa2e12db2
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/001_basic.pl
@@ -0,0 +1,8 @@
+use strict;
+use warnings;
+use TestLib;
+use Test::More tests => 8;
+
+program_help_ok('pg_verify_checksums');
+program_version_ok('pg_verify_checksums');
+program_options_handling_ok('pg_verify_checksums');
diff --git a/src/bin/pg_verify_checksums/t/002_actions.pl b/src/bin/pg_verify_checksums/t/002_actions.pl
new file mode 100644
index 0000000000..dc29da09af
--- /dev/null
+++ b/src/bin/pg_verify_checksums/t/002_actions.pl
@@ -0,0 +1,69 @@
+# Do basic sanity checks supported by pg_verify_checksums using
+# an initialized cluster.
+
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 12;
+
+# Initialize node with checksums enabled.
+my $node = get_new_node('node_checksum');
+$node->init(extra => ['--data-checksums']);
+my $pgdata = $node->data_dir;
+
+# Control file should know that checksums are enabled.
+command_like(['pg_controldata', $pgdata],
+	     qr/Data page checksum version:.*1/,
+		 'checksums enabled in control file');
+
+# Checksums pass on a newly-created cluster
+command_ok(['pg_verify_checksums',  '-D', $pgdata],
+		   "succeeds with offline cluster");
+
+# Checks cannot happen with an online cluster
+$node->start;
+command_fails(['pg_verify_checksums',  '-D', $pgdata],
+			  "fails with online cluster");
+
+# Create table to corrupt and get its relfilenode
+$node->safe_psql('postgres',
+	"SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a;
+	ALTER TABLE corrupt1 SET (autovacuum_enabled=false);");
+
+my $file_corrupted = $node->safe_psql('postgres',
+	"SELECT pg_relation_filepath('corrupt1')");
+my $relfilenode_corrupted =  $node->safe_psql('postgres',
+	"SELECT relfilenode FROM pg_class WHERE relname = 'corrupt1';");
+
+# Set page header and block size
+my $pageheader_size = 24;
+my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
+$node->stop;
+
+# Checksums are correct for single relfilenode as the table is not
+# corrupted yet.
+command_ok(['pg_verify_checksums',  '-D', $pgdata,
+	'-r', $relfilenode_corrupted],
+	"succeeds for single relfilenode with offline cluster");
+
+# Time to create some corruption
+open my $file, '+<', "$pgdata/$file_corrupted";
+seek($file, $pageheader_size, 0);
+syswrite($file, '\0\0\0\0\0\0\0\0\0');
+close $file;
+
+# Global checksum checks fail
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata],
+						  1,
+						  [qr/Bad checksums:.*1/],
+						  [qr/checksum verification failed/],
+						  'fails with corrupted data');
+
+# Checksum checks on single relfilenode fail
+$node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata, '-r',
+							$relfilenode_corrupted],
+						  1,
+						  [qr/Bad checksums:.*1/],
+						  [qr/checksum verification failed/],
+						  'fails for corrupted data on single relfilenode');
#7Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#6)
Re: TAP tests for pg_verify_checksums

On Wed, Oct 10, 2018 at 10:50:02AM +0900, Michael Paquier wrote:

The resulting patch is attached. Does that look good?

And committed. Thanks all for taking the time to review.
--
Michael