verify_heapam for sequences?

Started by Peter Eisentrautover 4 years ago5 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

Is there a reason why contrib/amcheck/verify_heapam.c does not want to
run on sequences? If I take out the checks, it appears to work. Is
this an oversight? Or if there is a reason, maybe it could be stated in
a comment, at least.

#2Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Peter Eisentraut (#1)
Re: verify_heapam for sequences?

On Aug 26, 2021, at 3:03 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

Is there a reason why contrib/amcheck/verify_heapam.c does not want to run on sequences? If I take out the checks, it appears to work. Is this an oversight? Or if there is a reason, maybe it could be stated in a comment, at least.

Testing the corruption checking logic on all platforms is a bit arduous, because the data layout on disk changes with alignment difference, endianness, etc. The work I did with Tom's help finally got good test coverage across the entire buildfarm, but that test (contrib/amcheck/t/001_verify_heapam.pl) doesn't work for sequences even on my one platform (mac laptop).

I have added a modicum of test coverage for sequences in the attached WIP patch, which is enough to detect sequence corruption on my laptop. It would have to be tested across the buildfarm after being extended to cover more cases. As it stands now, it uses blunt force to corrupt the relation, and only verifies that verify_heapam() returns some corruption, not that it reports the right corruption.

I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.

Attachments:

v1-0001-WIP-patch-to-support-amcheck-of-sequences.patch.WIPapplication/octet-stream; name=v1-0001-WIP-patch-to-support-amcheck-of-sequences.patch.WIP; x-unix-mode=0644Download+113-6
#3Peter Eisentraut
peter_e@gmx.net
In reply to: Mark Dilger (#2)
Re: verify_heapam for sequences?

On 26.08.21 21:02, Mark Dilger wrote:

I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.

Well, if we enabled verify_heapam to check sequences, and then someone
were to change the sequence storage, a test that currently reports no
corruption would probably report corruption then?

#4Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Peter Eisentraut (#3)
Re: verify_heapam for sequences?

On Aug 30, 2021, at 1:22 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

On 26.08.21 21:02, Mark Dilger wrote:

I understand that sequences are really just heap tables, and since we already test corrupted heap tables, we could assume that we already have sufficient coverage. I'm not entirely comfortable with that, though, because future patch authors who modify how tables or sequences work are not necessarily going to think carefully about whether their modifications invalidate that assumption.

Well, if we enabled verify_heapam to check sequences, and then someone were to change the sequence storage, a test that currently reports no corruption would probably report corruption then?

It might. More to the point, any corruption test we create now will be geared towards corrupting the page in a way that verify_heapam will detect, which will be detected whether or not the implementation of sequences changes. That kind of testing won't really do anything.

Perhaps the best we can do is to create a sequence, testing both before and after exercising it a bit. We can't properly guess which exercises (nextval, setval, etc.) will cause corruption testing to fail for some unknown future implementation change, so we just try all the obvious stuff.

The attached patch changes both contrib/amcheck/ and src/bin/pg_amcheck/ to allow checking sequences. In both cases, the changes required are fairly minor, though they both entail some documentation changes.

It seems fairly straightforward that if a user calls verify_heapam() on a sequence, then the new behavior is what they want. It is not quite so clear for pg_amcheck.

In pg_amcheck, the command-line arguments allow discriminating between tables and indexes with materialized views quietly treated as tables (which, of course, they are.) In v14, sequences were not treated as tables, nor checked at all. In this new patch, sequences are quietly treated the same way as tables. By "quietly", I mean there are no command-line switches to specifically filter them in or out separately from filtering ordinary tables.

This is a user-facing behavioral change, and the user might not be imagining sequences specifically when specifying a table name pattern that matches both tables and sequences. Do you see any problem with that? It was already true that materialized views matching a table name pattern would be checked, so this new behavior is not entirely out of line with the old behavior.

The new behavior is documented, and since I'm updating the docs, I made the behavior with respect to materialized views more explicit.

Attachments:

v2-0001-Support-amcheck-of-sequences.patchapplication/octet-stream; name=v2-0001-Support-amcheck-of-sequences.patch; x-unix-mode=0644Download+95-19
#5Peter Eisentraut
peter_e@gmx.net
In reply to: Mark Dilger (#4)
Re: verify_heapam for sequences?

On 30.08.21 21:00, Mark Dilger wrote:

The attached patch changes both contrib/amcheck/ and src/bin/pg_amcheck/ to allow checking sequences. In both cases, the changes required are fairly minor, though they both entail some documentation changes.

It seems fairly straightforward that if a user calls verify_heapam() on a sequence, then the new behavior is what they want. It is not quite so clear for pg_amcheck.

In pg_amcheck, the command-line arguments allow discriminating between tables and indexes with materialized views quietly treated as tables (which, of course, they are.) In v14, sequences were not treated as tables, nor checked at all. In this new patch, sequences are quietly treated the same way as tables. By "quietly", I mean there are no command-line switches to specifically filter them in or out separately from filtering ordinary tables.

This is a user-facing behavioral change, and the user might not be imagining sequences specifically when specifying a table name pattern that matches both tables and sequences. Do you see any problem with that? It was already true that materialized views matching a table name pattern would be checked, so this new behavior is not entirely out of line with the old behavior.

The new behavior is documented, and since I'm updating the docs, I made the behavior with respect to materialized views more explicit.

committed