verify_heapam for sequences?
Is there a reason why contrib/amcheck/verify_heapam.c does not want to
run on sequences? If I take out the checks, it appears to work. Is
this an oversight? Or if there is a reason, maybe it could be stated in
a comment, at least.
On Aug 26, 2021, at 3:03 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
Is there a reason why contrib/amcheck/verify_heapam.c does not want to run on sequences? If I take out the checks, it appears to work. Is this an oversight? Or if there is a reason, maybe it could be stated in a comment, at least.
Testing the corruption checking logic on all platforms is a bit arduous, because the data layout on disk changes with alignment difference, endianness, etc. The work I did with Tom's help finally got good test coverage across the entire buildfarm, but that test (contrib/amcheck/t/001_verify_heapam.pl) doesn't work for sequences even on my one platform (mac laptop).
I have added a modicum of test coverage for sequences in the attached WIP patch, which is enough to detect sequence corruption on my laptop. It would have to be tested across the buildfarm after being extended to cover more cases. As it stands now, it uses blunt force to corrupt the relation, and only verifies that verify_heapam() returns some corruption, not that it reports the right corruption.
I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.
Attachments:
v1-0001-WIP-patch-to-support-amcheck-of-sequences.patch.WIPapplication/octet-stream; name=v1-0001-WIP-patch-to-support-amcheck-of-sequences.patch.WIP; x-unix-mode=0644Download
From 6f45093233d79bf719b3c6d1df1d2eae971eead2 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Thu, 26 Aug 2021 11:52:50 -0700
Subject: [PATCH v1] WIP patch to support amcheck of sequences
---
contrib/amcheck/expected/check_heap.out | 6 +-
contrib/amcheck/t/001_verify_heapam.pl | 102 +++++++++++++++++++++++-
contrib/amcheck/verify_heapam.c | 10 ++-
3 files changed, 113 insertions(+), 5 deletions(-)
diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index ad3086d2aa..c010361025 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -180,8 +180,10 @@ CREATE SEQUENCE test_sequence;
SELECT * FROM verify_heapam('test_sequence',
startblock := NULL,
endblock := NULL);
-ERROR: cannot check relation "test_sequence"
-DETAIL: This operation is not supported for sequences.
+ blkno | offnum | attnum | msg
+-------+--------+--------+-----
+(0 rows)
+
-- Check that foreign tables are rejected
CREATE FOREIGN DATA WRAPPER dummy;
CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy;
diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl
index 4f720a7ed0..174a132264 100644
--- a/contrib/amcheck/t/001_verify_heapam.pl
+++ b/contrib/amcheck/t/001_verify_heapam.pl
@@ -8,7 +8,7 @@ use PostgresNode;
use TestLib;
use Fcntl qw(:seek);
-use Test::More tests => 80;
+use Test::More tests => 133;
my ($node, $result);
@@ -21,6 +21,30 @@ $node->append_conf('postgresql.conf', 'autovacuum=off');
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+#
+# Check a sequence with no corruption
+#
+fresh_test_sequence('test_seq');
+check_all_options_uncorrupted('test_seq', 'sequence');
+
+#
+# Check a corrupt sequence
+#
+corrupt_sequence_relfile('test_seq');
+detects_sequence_corruption("verify_heapam('test_seq')", "corrupted sequence");
+detects_sequence_corruption(
+ "verify_heapam('test_seq', skip := 'all-visible')",
+ "corrupted sequence skipping all-visible");
+detects_sequence_corruption(
+ "verify_heapam('test_seq', skip := 'all-frozen')",
+ "corrupted sequence skipping all-frozen");
+detects_sequence_corruption(
+ "verify_heapam('test_seq', check_toast := false)",
+ "corrupted sequence skipping toast");
+detects_sequence_corruption(
+ "verify_heapam('test_seq', startblock := 0, endblock := 0)",
+ "corrupted sequence checking only block zero");
+
#
# Check a table with data loaded but no corruption, freezing, etc.
#
@@ -110,6 +134,71 @@ sub fresh_test_table
));
}
+# (Re)create and populate a test sequence of the given name
+sub fresh_test_sequence
+{
+ my ($seqname) = @_;
+
+ return $node->safe_psql(
+ 'postgres', qq(
+ DROP SEQUENCE IF EXISTS $seqname CASCADE;
+ CREATE SEQUENCE $seqname
+ INCREMENT BY 13
+ MINVALUE 17
+ START WITH 23;
+ BEGIN;
+ SELECT nextval('$seqname');
+ COMMIT;
+ ALTER SEQUENCE $seqname RESTART;
+ SELECT setval('$seqname', 40);
+ SELECT nextval('$seqname');
+ SELECT nextval('$seqname');
+ SELECT nextval('$seqname');
+ ROLLBACK;
+ BEGIN;
+ SELECT setval('$seqname', 35);
+ SELECT nextval('$seqname');
+ COMMIT;
+ ));
+}
+
+# Stops the test node, corrupts the first page of the named sequence, and
+# restarts the node.
+sub corrupt_sequence_relfile
+{
+ my ($relname) = @_;
+ my $relpath = relation_filepath($relname);
+
+ $node->stop;
+
+ my $fh;
+ open($fh, '+<', $relpath)
+ or BAIL_OUT("open failed: $!");
+ binmode $fh;
+
+ my $size = (stat($fh))[7];
+
+ # Loop over 2k chunks, to be compatible with all supported page sizes
+ for (my $start = 0; $start < $size; $start += 2048)
+ {
+ # Don't touch the page header
+ seek($fh, $start + 32, SEEK_SET)
+ or BAIL_OUT("seek failed: $!");
+
+ # Corrupt the remainder of the 2k chunk
+ syswrite(
+ $fh,
+ pack("L*", (0xAAA15550, 0xAAA0D550, 0x00010000,
+ 0x00008000, 0x0000800F, 0x001e8000) x 84)
+ ) or BAIL_OUT("syswrite failed: $!");
+ }
+
+ close($fh)
+ or BAIL_OUT("close failed: $!");
+
+ $node->start;
+}
+
# Stops the test node, corrupts the first page of the named relation, and
# restarts the node.
sub corrupt_first_page
@@ -141,6 +230,17 @@ sub corrupt_first_page
$node->start;
}
+sub detects_sequence_corruption
+{
+ my ($function, $testname) = @_;
+
+ detects_corruption(
+ $function,
+ $testname,
+ qr/\w+/ # accept any corruption message
+ );
+}
+
sub detects_heap_corruption
{
my ($function, $testname) = @_;
diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index 226271923a..85832183e6 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -305,14 +305,20 @@ verify_heapam(PG_FUNCTION_ARGS)
*/
if (ctx.rel->rd_rel->relkind != RELKIND_RELATION &&
ctx.rel->rd_rel->relkind != RELKIND_MATVIEW &&
- ctx.rel->rd_rel->relkind != RELKIND_TOASTVALUE)
+ ctx.rel->rd_rel->relkind != RELKIND_TOASTVALUE &&
+ ctx.rel->rd_rel->relkind != RELKIND_SEQUENCE)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot check relation \"%s\"",
RelationGetRelationName(ctx.rel)),
errdetail_relkind_not_supported(ctx.rel->rd_rel->relkind)));
- if (ctx.rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+ /*
+ * Sequences always use heap AM, but they don't show that in the catalogs.
+ * Other relkinds might be using a different AM, so check.
+ */
+ if (ctx.rel->rd_rel->relkind != RELKIND_SEQUENCE &&
+ ctx.rel->rd_rel->relam != HEAP_TABLE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only heap AM is supported")));
--
2.21.1 (Apple Git-122.3)
On 26.08.21 21:02, Mark Dilger wrote:
I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.
Well, if we enabled verify_heapam to check sequences, and then someone
were to change the sequence storage, a test that currently reports no
corruption would probably report corruption then?
On Aug 30, 2021, at 1:22 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 26.08.21 21:02, Mark Dilger wrote:
I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.
Well, if we enabled verify_heapam to check sequences, and then someone were to change the sequence storage, a test that currently reports no corruption would probably report corruption then?
It might. More to the point, any corruption test we create now will be geared towards corrupting the page in a way that verify_heapam will detect, which will be detected whether or not the implementation of sequences changes. That kind of testing won't really do anything.
Perhaps the best we can do is to create a sequence, testing both before and after exercising it a bit. We can't properly guess which exercises (nextval, setval, etc.) will cause corruption testing to fail for some unknown future implementation change, so we just try all the obvious stuff.
The attached patch changes both contrib/amcheck/ and src/bin/pg_amcheck/ to allow checking sequences. In both cases, the changes required are fairly minor, though they both entail some documentation changes.
It seems fairly straightforward that if a user calls verify_heapam() on a sequence, then the new behavior is what they want. It is not quite so clear for pg_amcheck.
In pg_amcheck, the command-line arguments allow discriminating between tables and indexes with materialized views quietly treated as tables (which, of course, they are.) In v14, sequences were not treated as tables, nor checked at all. In this new patch, sequences are quietly treated the same way as tables. By "quietly", I mean there are no command-line switches to specifically filter them in or out separately from filtering ordinary tables.
This is a user-facing behavioral change, and the user might not be imagining sequences specifically when specifying a table name pattern that matches both tables and sequences. Do you see any problem with that? It was already true that materialized views matching a table name pattern would be checked, so this new behavior is not entirely out of line with the old behavior.
The new behavior is documented, and since I'm updating the docs, I made the behavior with respect to materialized views more explicit.
Attachments:
v2-0001-Support-amcheck-of-sequences.patchapplication/octet-stream; name=v2-0001-Support-amcheck-of-sequences.patch; x-unix-mode=0644Download
From 1f679759a5711b4fa55cd7e021d92872def1646e Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Thu, 26 Aug 2021 11:52:50 -0700
Subject: [PATCH v2] Support amcheck of sequences.
Sequences were left out of the list of relation kinds that
verify_heapam knew how to check, though it is fairly trivial to
allow them. Doing that, and while at it, updating pg_amcheck to
include sequences in relations matched by table and relation
patterns.
---
contrib/amcheck/expected/check_heap.out | 6 ++-
contrib/amcheck/t/001_verify_heapam.pl | 68 ++++++++++++++++++++++++-
contrib/amcheck/verify_heapam.c | 10 +++-
doc/src/sgml/amcheck.sgml | 8 +--
doc/src/sgml/ref/pg_amcheck.sgml | 15 +++---
src/bin/pg_amcheck/pg_amcheck.c | 6 +--
6 files changed, 95 insertions(+), 18 deletions(-)
diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index ad3086d2aa..c010361025 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -180,8 +180,10 @@ CREATE SEQUENCE test_sequence;
SELECT * FROM verify_heapam('test_sequence',
startblock := NULL,
endblock := NULL);
-ERROR: cannot check relation "test_sequence"
-DETAIL: This operation is not supported for sequences.
+ blkno | offnum | attnum | msg
+-------+--------+--------+-----
+(0 rows)
+
-- Check that foreign tables are rejected
CREATE FOREIGN DATA WRAPPER dummy;
CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy;
diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl
index 4f720a7ed0..ba40f64b58 100644
--- a/contrib/amcheck/t/001_verify_heapam.pl
+++ b/contrib/amcheck/t/001_verify_heapam.pl
@@ -8,7 +8,7 @@ use PostgresNode;
use TestLib;
use Fcntl qw(:seek);
-use Test::More tests => 80;
+use Test::More tests => 272;
my ($node, $result);
@@ -60,6 +60,22 @@ detects_no_corruption(
"verify_heapam('test', skip := 'all-frozen')",
"all-frozen corrupted table skipping all-frozen");
+#
+# Check a sequence with no corruption. The current implementation of sequences
+# doesn't require its own test setup, since sequences are really just heap
+# tables under-the-hood. To guard against future implementation changes made
+# without remembering to update verify_heapam, we create and exercise a
+# sequence, checking along the way that it passes corruption checks.
+#
+fresh_test_sequence('test_seq');
+check_all_options_uncorrupted('test_seq', 'plain');
+advance_test_sequence('test_seq');
+check_all_options_uncorrupted('test_seq', 'plain');
+set_test_sequence('test_seq');
+check_all_options_uncorrupted('test_seq', 'plain');
+reset_test_sequence('test_seq');
+check_all_options_uncorrupted('test_seq', 'plain');
+
# Returns the filesystem path for the named relation.
sub relation_filepath
{
@@ -110,6 +126,56 @@ sub fresh_test_table
));
}
+# Create a test sequence of the given name.
+sub fresh_test_sequence
+{
+ my ($seqname) = @_;
+
+ return $node->safe_psql(
+ 'postgres', qq(
+ DROP SEQUENCE IF EXISTS $seqname CASCADE;
+ CREATE SEQUENCE $seqname
+ INCREMENT BY 13
+ MINVALUE 17
+ START WITH 23;
+ SELECT nextval('$seqname');
+ SELECT setval('$seqname', currval('$seqname') + nextval('$seqname'));
+ ));
+}
+
+# Call SQL functions to increment the sequence
+sub advance_test_sequence
+{
+ my ($seqname) = @_;
+
+ return $node->safe_psql(
+ 'postgres', qq(
+ SELECT nextval('$seqname');
+ ));
+}
+
+# Call SQL functions to set the sequence
+sub set_test_sequence
+{
+ my ($seqname) = @_;
+
+ return $node->safe_psql(
+ 'postgres', qq(
+ SELECT setval('$seqname', 102);
+ ));
+}
+
+# Call SQL functions to reset the sequence
+sub reset_test_sequence
+{
+ my ($seqname) = @_;
+
+ return $node->safe_psql(
+ 'postgres', qq(
+ ALTER SEQUENCE $seqname RESTART WITH 51
+ ));
+}
+
# Stops the test node, corrupts the first page of the named relation, and
# restarts the node.
sub corrupt_first_page
diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index 226271923a..85832183e6 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -305,14 +305,20 @@ verify_heapam(PG_FUNCTION_ARGS)
*/
if (ctx.rel->rd_rel->relkind != RELKIND_RELATION &&
ctx.rel->rd_rel->relkind != RELKIND_MATVIEW &&
- ctx.rel->rd_rel->relkind != RELKIND_TOASTVALUE)
+ ctx.rel->rd_rel->relkind != RELKIND_TOASTVALUE &&
+ ctx.rel->rd_rel->relkind != RELKIND_SEQUENCE)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot check relation \"%s\"",
RelationGetRelationName(ctx.rel)),
errdetail_relkind_not_supported(ctx.rel->rd_rel->relkind)));
- if (ctx.rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+ /*
+ * Sequences always use heap AM, but they don't show that in the catalogs.
+ * Other relkinds might be using a different AM, so check.
+ */
+ if (ctx.rel->rd_rel->relkind != RELKIND_SEQUENCE &&
+ ctx.rel->rd_rel->relam != HEAP_TABLE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only heap AM is supported")));
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index c570690b59..11d1eb5af2 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -220,10 +220,10 @@ SET client_min_messages = DEBUG1;
</term>
<listitem>
<para>
- Checks a table for structural corruption, where pages in the relation
- contain data that is invalidly formatted, and for logical corruption,
- where pages are structurally valid but inconsistent with the rest of the
- database cluster.
+ Checks a table, sequence, or materialized view for structural corruption,
+ where pages in the relation contain data that is invalidly formatted, and
+ for logical corruption, where pages are structurally valid but
+ inconsistent with the rest of the database cluster.
</para>
<para>
The following optional arguments are recognized:
diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index d00c48d0e7..1fd0ecd911 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -41,8 +41,9 @@ PostgreSQL documentation
</para>
<para>
- Only table relations and btree indexes are currently supported. Other
- relation types are silently skipped.
+ Only ordinary and toast table relations, materialized views, sequences, and
+ btree indexes are currently supported. Other relation types are silently
+ skipped.
</para>
<para>
@@ -124,7 +125,7 @@ PostgreSQL documentation
</para>
<para>
This is similar to the <option>--relation</option> option, except that
- it applies only to indexes, not tables.
+ it applies only to indexes, not to other relation types.
</para>
</listitem>
</varlistentry>
@@ -140,7 +141,7 @@ PostgreSQL documentation
</para>
<para>
This is similar to the <option>--exclude-relation</option> option,
- except that it applies only to indexes, not tables.
+ except that it applies only to indexes, not other relation types.
</para>
</listitem>
</varlistentry>
@@ -236,7 +237,8 @@ PostgreSQL documentation
</para>
<para>
This is similar to the <option>--relation</option> option, except that
- it applies only to tables, not indexes.
+ it applies only to tables, materialized views, and sequences, not to
+ indexes.
</para>
</listitem>
</varlistentry>
@@ -252,7 +254,8 @@ PostgreSQL documentation
</para>
<para>
This is similar to the <option>--exclude-relation</option> option,
- except that it applies only to tables, not indexes.
+ except that it applies only to tables, materialized views, and
+ sequences, not to indexes.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index f4b1ef95e9..9d9c3392d9 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -1915,14 +1915,14 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
if (opts.allrel)
appendPQExpBuffer(&sql,
" AND c.relam = %u "
- "AND c.relkind IN ('r', 'm', 't') "
+ "AND c.relkind IN ('r', 'S', 'm', 't') "
"AND c.relnamespace != %u",
HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
else
appendPQExpBuffer(&sql,
" AND c.relam IN (%u, %u)"
- "AND c.relkind IN ('r', 'm', 't', 'i') "
- "AND ((c.relam = %u AND c.relkind IN ('r', 'm', 't')) OR "
+ "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
+ "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
"(c.relam = %u AND c.relkind = 'i'))",
HEAP_TABLE_AM_OID, BTREE_AM_OID,
HEAP_TABLE_AM_OID, BTREE_AM_OID);
--
2.21.1 (Apple Git-122.3)
On 30.08.21 21:00, Mark Dilger wrote:
The attached patch changes both contrib/amcheck/ and src/bin/pg_amcheck/ to allow checking sequences. In both cases, the changes required are fairly minor, though they both entail some documentation changes.
It seems fairly straightforward that if a user calls verify_heapam() on a sequence, then the new behavior is what they want. It is not quite so clear for pg_amcheck.
In pg_amcheck, the command-line arguments allow discriminating between tables and indexes with materialized views quietly treated as tables (which, of course, they are.) In v14, sequences were not treated as tables, nor checked at all. In this new patch, sequences are quietly treated the same way as tables. By "quietly", I mean there are no command-line switches to specifically filter them in or out separately from filtering ordinary tables.
This is a user-facing behavioral change, and the user might not be imagining sequences specifically when specifying a table name pattern that matches both tables and sequences. Do you see any problem with that? It was already true that materialized views matching a table name pattern would be checked, so this new behavior is not entirely out of line with the old behavior.
The new behavior is documented, and since I'm updating the docs, I made the behavior with respect to materialized views more explicit.
committed