BUG #17212: pg_amcheck fails on checking temporary relations
The following bug has been logged on the website:
Bug reference: 17212
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 14.0
Operating system: Ubuntu 20.04
Description:
When pg_amcheck runs against a database containing temporary tables:
echo "
CREATE TEMP TABLE t(i int);
CREATE INDEX t_idx ON t(i);
INSERT INTO t VALUES (1);
SELECT pg_sleep(5);
" | psql &
pg_amcheck --install-missing -a --heapallindexed --parent-check
--rootdescend --progress || echo "FAIL"
it fails with the following errors:
btree index "regression.pg_temp_4.t_idx":0%)
ERROR: cannot access temporary tables of other sessions
DETAIL: Index "t_idx" is associated with temporary relation.
heap table "regression.pg_temp_4.t":
ERROR: cannot access temporary tables of other sessions
779/779 relations (100%), 2806/2806 pages (100%)
FAIL
Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
from the behaviour of pg_dump and friends, which skip such relations
silently.
On Sat, Oct 2, 2021 at 4:49 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
from the behaviour of pg_dump and friends, which skip such relations
silently.
I agree -- this behavior is a bug.
Can you propose a fix, Mark?
--
Peter Geoghegan
On Oct 2, 2021, at 10:32 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Sat, Oct 2, 2021 at 4:49 AM PG Bug reporting form
<noreply@postgresql.org> wrote:Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
from the behaviour of pg_dump and friends, which skip such relations
silently.I agree -- this behavior is a bug.
Can you propose a fix, Mark?
The attached patch includes a test case for this, which shows the problems against the current pg_amcheck.c, and a new version of pg_amcheck.c which fixes the bug. Could you review it?
Thanks for bringing this to my attention.
Attachments:
v1-0001-Bug-fix-do-not-check-temp-tables-from-pg_amcheck.patchapplication/octet-stream; name=v1-0001-Bug-fix-do-not-check-temp-tables-from-pg_amcheck.patch; x-unix-mode=0644Download+41-8
On Oct 3, 2021, at 10:04 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Oct 2, 2021, at 10:32 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Sat, Oct 2, 2021 at 4:49 AM PG Bug reporting form
<noreply@postgresql.org> wrote:Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
from the behaviour of pg_dump and friends, which skip such relations
silently.I agree -- this behavior is a bug.
Can you propose a fix, Mark?
The attached patch includes a test case for this, which shows the problems against the current pg_amcheck.c, and a new version of pg_amcheck.c which fixes the bug. Could you review it?
Thanks for bringing this to my attention.
Reposting to pgsql-hackers in preparation for making a commitfest entry.
Attachments:
v1-0001-Bug-fix-do-not-check-temp-tables-from-pg_amcheck.patchapplication/octet-stream; name=v1-0001-Bug-fix-do-not-check-temp-tables-from-pg_amcheck.patch; x-unix-mode=0644Download+41-8
Hello Mark,
04.10.2021 01:20, Mark Dilger wrote:
The attached patch includes a test case for this, which shows the problems against the current pg_amcheck.c, and a new version of pg_amcheck.c which fixes the bug. Could you review it?
Thanks for bringing this to my attention.
There is another issue, that maybe should be discussed separately (or
this thread could be renamed to "... on checking specific relations"),
but the solution could be similar to that.
pg_amcheck also fails on checking invalid indexes, that could be created
legitimately by the CREATE INDEX CONCURRENTLY command.
For example, consider the following script:
psql -c "CREATE TABLE t(i numeric); INSERT INTO t VALUES
(generate_series(1, 10000000));"
psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" &
pg_amcheck -a --install-missing --heapallindexed --rootdescend
--progress || echo "FAIL"
pg_amcheck fails with:
btree index "regression.public.t_idx":
��� ERROR:� cannot check index "t_idx"
��� DETAIL:� Index is not valid.
781/781 relations (100%), 2806/2806 pages (100%)
FAIL
When an index created without CONCURRENTLY, it runs successfully.
Beside that, it seems that pg_amcheck produces a deadlock in such a case:
2021-10-04 11:23:38.584 MSK [1451296] ERROR:� deadlock detected
2021-10-04 11:23:38.584 MSK [1451296] DETAIL:� Process 1451296 waits for
ShareLock on virtual transaction 5/542; blocked by process 1451314.
��� Process 1451314 waits for ShareLock on relation 16385 of database
16384; blocked by process 1451296.
��� Process 1451296: CREATE INDEX CONCURRENTLY t_idx ON t(i);
��� Process 1451314: SELECT * FROM
"pg_catalog".bt_index_parent_check(index := '16390'::regclass,
heapallindexed := true, rootdescend := true)
2021-10-04 11:23:38.584 MSK [1451296] HINT:� See server log for query
details.
2021-10-04 11:23:38.584 MSK [1451296] STATEMENT:� CREATE INDEX
CONCURRENTLY t_idx ON t(i);
I think that the deadlock is yet another issue, as invalid indexes could
appear in other circumstances too.
Best regards,
Alexander
On Oct 4, 2021, at 2:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
Thank you, Alexander, for these bug reports.
There is another issue, that maybe should be discussed separately (or
this thread could be renamed to "... on checking specific relations"),
but the solution could be similar to that.
pg_amcheck also fails on checking invalid indexes, that could be created
legitimately by the CREATE INDEX CONCURRENTLY command.
I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
For example, consider the following script:
psql -c "CREATE TABLE t(i numeric); INSERT INTO t VALUES
(generate_series(1, 10000000));"
psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" &
pg_amcheck -a --install-missing --heapallindexed --rootdescend
--progress || echo "FAIL"pg_amcheck fails with:
btree index "regression.public.t_idx":
ERROR: cannot check index "t_idx"
DETAIL: Index is not valid.
781/781 relations (100%), 2806/2806 pages (100%)
FAIL
Yes, I can reproduce this following your steps. (It's always appreciated to have steps to reproduce.)
I can also get this failure without pg_amcheck, going directly to the btree checking code. Having already built the table as you prescribe:
amcheck % psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);" & sleep 0.1 && psql -c "SELECT * FROM pg_catalog.bt_index_parent_check(index := 't_idx'::regclass, heapallindexed := true, rootdescend := true)"
[1]: + exit 1 psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);"
ERROR: deadlock detected
DETAIL: Process 9555 waits for ShareLock on virtual transaction 5/11; blocked by process 9558.
Process 9558 waits for ShareLock on relation 16406 of database 16384; blocked by process 9555.
HINT: See server log for query details.
ERROR: cannot check index "t_idx"
DETAIL: Index is not valid.
[1]: + exit 1 psql -c "CREATE INDEX CONCURRENTLY t_idx ON t(i);"
If Peter agrees that this is not pg_amcheck specific, then we should start a new thread to avoid confusing the commitfest tickets for these two items.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
There is another issue, that maybe should be discussed separately (or
this thread could be renamed to "... on checking specific relations"),
but the solution could be similar to that.
pg_amcheck also fails on checking invalid indexes, that could be created
legitimately by the CREATE INDEX CONCURRENTLY command.I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
Why do you say that? verify_nbtree.c will throw an error when called
with an invalid index -- which is what we actually see here. Obviously
that is the intended behavior, and always has been. This hasn't been a
problem before now, probably because the sample verification query in
the docs (under bt_index_check()) accounts for this directly.
Why shouldn't we expect pg_amcheck to do the same thing, at the SQL
level? It's practically the same thing as the temp table issue.
Indeed, verify_nbtree.c will throw an error on a temp table (at least
if it's from another session).
I can also get this failure without pg_amcheck, going directly to the btree checking code. Having already built the table as you prescribe:
ERROR: deadlock detected
DETAIL: Process 9555 waits for ShareLock on virtual transaction 5/11; blocked by process 9558.
Process 9558 waits for ShareLock on relation 16406 of database 16384; blocked by process 9555.
HINT: See server log for query details.
ERROR: cannot check index "t_idx"
DETAIL: Index is not valid.
I think that the deadlock is just another symptom of the same problem.
--
Peter Geoghegan
On Mon, Oct 4, 2021 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
There is another issue, that maybe should be discussed separately (or
this thread could be renamed to "... on checking specific relations"),
but the solution could be similar to that.
Thanks for the report!
I wonder if verify_heapam.c does the right thing with unlogged tables
when verification runs on a standby -- a brief glance at the code
leaves me with the impression that it's not handled there. Note that
verify_nbtree.c initially got it wrong. The issue was fixed by bugfix
commit 6754fe65. Before then nbtree verification could throw a nasty
low-level smgr error, just because we had an unlogged table in hot
standby mode.
Note that we deliberately skip indexes when this happens (we don't
error out), unlike the temp buffers (actually temp table) case. This
seems like the right set of behaviors. We really don't want to have to
throw an "invalid object type" style error just because verification
runs during recovery. Plus it just seems logical to assume that
unlogged indexes/tables don't have storage when we're in hot standby
mode, and so must simply have nothing for us to verify.
--
Peter Geoghegan
On Oct 4, 2021, at 10:58 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
There is another issue, that maybe should be discussed separately (or
this thread could be renamed to "... on checking specific relations"),
but the solution could be similar to that.
pg_amcheck also fails on checking invalid indexes, that could be created
legitimately by the CREATE INDEX CONCURRENTLY command.I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
Why do you say that?
Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to have lock upgrade hazards that are unrelated to pg_amcheck.
This hasn't been a
problem before now, probably because the sample verification query in
the docs (under bt_index_check()) accounts for this directly.
It doesn't say anything about deadlocks, but yes, it mentions errors will be raised unless the caller filters out indexes that are invalid or not ready.
On to pg_amcheck's behavior....
I see no evidence in the OP's complaint that pg_amcheck is misbehaving. It launches a worker to check each relation, prints for the user's benefit any errors those checks raise, and finally returns 0 if they all pass and 2 otherwise. Since not all relations could be checked, 2 is returned. Returning 0 would be misleading, as it implies everything was checked and passed, and it can't honestly say that. The return value 2 does not mean that anything failed. It means that not all checks passed. When a 2 is returned, the user is expected to read the output and decide what, if anything, they want to do about it. In this case, the user might decide to wait until the reindex finishes and check again, or they might decide they don't care.
It is true that pg_amcheck is calling bt_index_parent_check() on an invalid index, but so what? If it chose not to do so, it would still need to print a message about the index being unavailable for checking, and it would still have to return 2. It can't return 0, and it is unhelpful to leave the user in the dark about the fact that not all indexes are in the right state for checking. So it would still print the same error message and still return 2.
I think this bug report is really a feature request. The OP appears to want an option to toggle on/off the printing of such information, perhaps with not printing it as the default.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 3:36 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I believe this is a bug in amcheck's btree checking functions. Peter, can you take a look?
Why do you say that?
Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to have lock upgrade hazards that are unrelated to pg_amcheck.
The problem with that argument is that the bt_index_parent_check()
function isn't doing anything particularly special, apart from
dropping the lock. That has been its behavior for many years now.
On to pg_amcheck's behavior....
I see no evidence in the OP's complaint that pg_amcheck is misbehaving. It launches a worker to check each relation, prints for the user's benefit any errors those checks raise, and finally returns 0 if they all pass and 2 otherwise. Since not all relations could be checked, 2 is returned. Returning 0 would be misleading, as it implies everything was checked and passed, and it can't honestly say that. The return value 2 does not mean that anything failed. It means that not all checks passed. When a 2 is returned, the user is expected to read the output and decide what, if anything, they want to do about it. In this case, the user might decide to wait until the reindex finishes and check again, or they might decide they don't care.
It is true that pg_amcheck is calling bt_index_parent_check() on an invalid index, but so what? If it chose not to do so, it would still need to print a message about the index being unavailable for checking, and it would still have to return 2.
Why would it have to print such a message? You seem to be presenting
this as if there is some authoritative, precise, relevant definition
of "the relations that pg_amcheck sees". But to me the relevant
details look arbitrary at best.
It can't return 0, and it is unhelpful to leave the user in the dark about the fact that not all indexes are in the right state for checking.
Why is that unhelpful? More to the point, *why* would this alternative
behavior constitute "leaving the user in the dark"?
What about the case where the pg_class entry isn't visible to our MVCC
snapshot? Why is "skipping" such a relation not just as unhelpful?
I think this bug report is really a feature request. The OP appears to want an option to toggle on/off the printing of such information, perhaps with not printing it as the default.
And I don't understand why you think that clearly-accidental
implementation details (really just bugs) should be treated as
axiomatic truths about how pg_amcheck must work. Should we now "fix"
pg_dump so that it matches pg_amcheck?
All of the underlying errors are cases that were clearly intended to
catch user error -- every single one. But apparently pg_amcheck is
incapable of error, by definition. Like HAL 9000.
--
Peter Geoghegan
On Oct 4, 2021, at 4:10 PM, Peter Geoghegan <pg@bowt.ie> wrote:
And I don't understand why you think that clearly-accidental
implementation details (really just bugs) should be treated as
axiomatic truths about how pg_amcheck must work. Should we now "fix"
pg_dump so that it matches pg_amcheck?All of the underlying errors are cases that were clearly intended to
catch user error -- every single one. But apparently pg_amcheck is
incapable of error, by definition. Like HAL 9000.
On the contrary, I got all the way finished writing a patch to have pg_amcheck do as you suggest before it dawned on me to wonder if that was the right way to go. I certainly don't assume pg_amcheck is correct by definition. I already posted a patch for the temporary tables bug upthread having never argued that it was anything other than a bug. I also wrote a patch for verify_heapam to fix the problem with unlogged tables on standbys, and was developing a test for that, when I got your email. I'm not arguing against that being a bug, either. Hopefully, I can get that properly tested and post it before too long.
I am concerned about giving the user the false impression that an index (or table) was checked when it was not. I don't see the logic in
pg_amcheck -i idx1 -i idx2 -i idx3
skipping all three indexes and then reporting success. What if the user launches the pg_amcheck command precisely because they see error messages in the logs during a long running reindex command, and are curious if the index so generated is corrupt. You can't assume the user knows the index is still being reindexed. If the last message logged was some time ago, they might assume the process has finished. So something other than a silent success is needed to let them know what is going on.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Oct 4, 2021, at 4:28 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
pg_amcheck -i idx1 -i idx2 -i idx3
I forgot to mention: There's a continuum between `pg_amcheck -a` which checks everything in all databases of the cluster, and `pg_amcheck -i just_one_index`. There are any number of combinations of object names, schema names, database names, and patterns over the same, which select anything from an empty set to a huge set of things to check. I'm trying to keep the behavior the same for all of those, and that's why I'm trying to avoid having `pg_amcheck -a` silently skip indexes that are unavailable for checking while having `pg_amcheck -i just_one_index` give a report about the index. I wouldn't know where to draw the line between reporting the issue and not, and I doubt whatever line I choose will be intuitive to users.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 4:28 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I am concerned about giving the user the false impression that an index (or table) was checked when it was not. I don't see the logic in
pg_amcheck -i idx1 -i idx2 -i idx3
skipping all three indexes and then reporting success.
This is the first time that anybody mentioned the -i option on the
thread. I read your previous remarks as making a very broad statement,
about every single issue.
Anyway, the issue with -i doesn't seem like it changes that much. Why
not just behave as if there is no such "visible" index? That's the
same condition, for all practical purposes. If that approach doesn't
seem good enough, then the error message can be refined to make the
user aware of the specific issue.
What if the user launches the pg_amcheck command precisely because they see error messages in the logs during a long running reindex command, and are curious if the index so generated is corrupt.
I'm guessing that you meant REINDEX CONCURRENTLY.
Since you're talking about the case where it has an error, the whole
index build must have failed. So the user would get exactly what
they'd expect -- verification of the original index, without any
hindrance from the new/failed index.
(Thinks for a moment...)
Actually, I think that we'd only verify the original index, even
before the error with CONCURRENTLY (though I've not checked that point
myself).
--
Peter Geoghegan
On Oct 4, 2021, at 4:45 PM, Peter Geoghegan <pg@bowt.ie> wrote:
I'm guessing that you meant REINDEX CONCURRENTLY.
Yes.
Since you're talking about the case where it has an error
Sorry, I realized after hitting <send> that you might take it that way, but I mean the logs generally, not just postgres logs. If somebody runs "reindex concurrently" on all tables at midnight every night, and they see one morning in the (non-postgres) logs from the midnight hour weird error messages about their RAID controller, they may well want to check all their indexes to see if any of them were corrupted. This is a totally made-up example, but the idea that a user might want to check their indexes, tables, or both owing to errors of some kind is not far-fetched.
, the whole
index build must have failed. So the user would get exactly what
they'd expect -- verification of the original index, without any
hindrance from the new/failed index.
Right, in that case, but not if hardware errors corrupted the index, and generated logs, without happening to trip up the reindex concurrently statement itself.
(Thinks for a moment...)
Actually, I think that we'd only verify the original index, even
before the error with CONCURRENTLY (though I've not checked that point
myself).
To get back on track, let me say that I'm not taking the position that what pg_amcheck currently does is necessarily correct, but just that I'd like to be careful about what we change, if anything. There are three things that seem to irritate people:
1) A non-zero exit code means "not everything was checked and passed" rather than "at least one thing is definitely corrupt".
2) Skipping of indexes is reported to the user with the word 'ERROR' in the report rather than, say, 'NOTICE'.
3) Deadlocks can occur
I have resisted changing #1 on the theory that `pg_amcheck --all && ./post_all_checks_pass.sh` should only run the post_all_checks_pass.sh if indeed all checks have passed, and I'm interpreting skipping an index check as being contrary to that. But maybe that's wrong of me. I don't know. There is already sloppiness between the time that pg_amcheck resolves which database relations are matched by --all, --table, --index, etc. and the time that all the checks are started, and again between that time and when the last one is complete. Database objects could be created or dropped during those spans of time, in which case --all doesn't have quite so well defined a meaning. But the user running pg_amcheck might also *know* that they aren't running any such DDL, and therefore expect --all to actually result in everything being checked.
I find it strange that I should do anything about #2 in pg_amcheck, since it's the function in verify_nbtree that phrases the situation as an error. But I suppose I can just ignore that and have it print as a notice. I'm genuinely not trying to give you grief here -- I simply don't like that pg_amcheck is adding commentary atop what the checking functions are doing. I see a clean division between what pg_amcheck is doing and what amcheck is doing, and this feels to me to put that on the wrong side of the divide. If refusing to check the index because it is not in the requisite state is a notice, then why wouldn't verify_nbtree raise it as one and return early rather than raising an error?
I also find it strange that #3 is being attributed to pg_amcheck's choice of how to call the checking function, because I can't think of any other function where we require the SQL caller to do anything like what you are requiring here in order to prevent deadlocks, and also because the docs for the functions don't say that a deadlock is possible, merely that the function may return with an error. I was totally content to get an error back, since errors are how the verify_nbtree functions communicate everything else, and the handler for those functions is already prepared to deal with the error messages so returned. But it clearly annoys you that pg_amcheck is doing this, so I'll go forward with the patch that I already have written which does otherwise.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 5:32 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
Sorry, I realized after hitting <send> that you might take it that way, but I mean the logs generally, not just postgres logs. If somebody runs "reindex concurrently" on all tables at midnight every night, and they see one morning in the (non-postgres) logs from the midnight hour weird error messages about their RAID controller, they may well want to check all their indexes to see if any of them were corrupted.
I don't see what the point of this example is. Why is the REINDEX
CONCURRENTLY index special here? Presumably the user is using
pg_amcheck with its -i option in this scenario, since you've scoped it
that way. Where did they get that index name from? Presumably it's
just the original familiar index name? How did an error message that's
not from the Postgres logs (or something similar) contain any index
name?
If the overnight rebuild completed successfully then we'll verify the
newly rebuilt smgr relfilenode for the index. It if failed then we'll
just verify the original. In other words, if we treat the validity of
indexes as a "visibility concern", everything works out just fine.
If there is an orphaned index (because of the implementation issue
with CONCURRENTLY) then it is *definitely* "corrupt" -- but not in any
sense that pg_amcheck ought to concern itself with. Such an orphaned
index can never actually be used by anybody. (We should fix this wart
in the CONCURRENTLY implementation some day.)
To get back on track, let me say that I'm not taking the position that what pg_amcheck currently does is necessarily correct, but just that I'd like to be careful about what we change, if anything. There are three things that seem to irritate people:
1) A non-zero exit code means "not everything was checked and passed" rather than "at least one thing is definitely corrupt".
Right.
2) Skipping of indexes is reported to the user with the word 'ERROR' in the report rather than, say, 'NOTICE'.
Right -- but it's also the specifics of the error. These are errors
that only make sense when there was specific human error. Which is
clearly not the case at all, except perhaps in the narrow -i case.
3) Deadlocks can occur
Right.
I have resisted changing #1 on the theory that `pg_amcheck --all && ./post_all_checks_pass.sh` should only run the post_all_checks_pass.sh if indeed all checks have passed, and I'm interpreting skipping an index check as being contrary to that.
You're also interpreting it as "skipping". This is a subjective
interpretation. Which is fair enough - I can see why you'd put it that
way. But that's not how I see it. Again, consider that pg_dump cares
about the "indisready" status of indexes, for a variety of reasons.
Now, the pg_dump example doesn't necessarily mean that pg_amcheck
*must* do the same thing (though it certainly suggests as much). To me
the important point is that we are perfectly entitled to define "the
indexes that pg_amcheck can see" in whatever way seems to make most
sense overall, based on practical considerations.
But the user running pg_amcheck might also *know* that they aren't running any such DDL, and therefore expect --all to actually result in everything being checked.
The user would also have to know precisely how the system catalogs
work during DDL. They'd have to know that the pg_class entry might
become visible very early on, rather than at the end, in some cases.
They'd know all that, but still be surprised by the current pg_amcheck
behavior. Which is itself not consistent with pg_dump.
I find it strange that I should do anything about #2 in pg_amcheck, since it's the function in verify_nbtree that phrases the situation as an error.
I don't find it strange. It does that because it *is* an error. There
is simply no alternative.
The solution for amcheck is the same as it has always been: just write
the SQL query in a way that avoids it entirely.
I'm genuinely not trying to give you grief here -- I simply don't like that pg_amcheck is adding commentary atop what the checking functions are doing.
pg_amcheck would not be adding commentary if this was addressed in the
way that I have in mind. It would merely be dealing with the issue in
the way that the amcheck docs have recommended, for years. The problem
here (as I see it) is that pg_amcheck is already adding commentary, by
not just doing that.
I also find it strange that #3 is being attributed to pg_amcheck's choice of how to call the checking function, because I can't think of any other function where we require the SQL caller to do anything like what you are requiring here in order to prevent deadlocks, and also because the docs for the functions don't say that a deadlock is possible, merely that the function may return with an error.
I will need to study the deadlock issue separately.
--
Peter Geoghegan
On Oct 4, 2021, at 6:19 PM, Peter Geoghegan <pg@bowt.ie> wrote:
I don't see what the point of this example is.
It doesn't matter.
I am changing pg_amcheck to filter out indexes as you say. Since the btree check should no longer error in these cases, the issue of pg_amcheck exit(2) sorts itself out without further code changes.
I am changing verify_heapam to skip unlogged tables during recovery. In testing, checking such a table results in a simple notice:
NOTICE: cannot verify unlogged relation "u_tbl" during recovery, skipping
While testing, I also created an index on the unlogged table and checked that index using bt_index_parent_check, and was surprised that checking it using bt_index_parent_check raises an error:
ERROR: cannot acquire lock mode ShareLock on database objects while recovery is in progress
HINT: Only RowExclusiveLock or less can be acquired on database objects during recovery.
It doesn't get as far as btree_index_mainfork_expected(). So I am changing pg_amcheck to filter out indexes when pg_is_in_recovery() is true and relpersistence='u'. Does that sound right to you?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 4, 2021 at 8:19 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I am changing pg_amcheck to filter out indexes as you say. Since the btree check should no longer error in these cases, the issue of pg_amcheck exit(2) sorts itself out without further code changes.
Cool.
I am changing verify_heapam to skip unlogged tables during recovery. In testing, checking such a table results in a simple notice:
NOTICE: cannot verify unlogged relation "u_tbl" during recovery, skipping
That makes sense to me.
While testing, I also created an index on the unlogged table and checked that index using bt_index_parent_check, and was surprised that checking it using bt_index_parent_check raises an error:
ERROR: cannot acquire lock mode ShareLock on database objects while recovery is in progress
HINT: Only RowExclusiveLock or less can be acquired on database objects during recovery.
Calling bt_index_parent_check() in hot standby mode is kind of asking
for it to error-out. It requires a ShareLock on the relation, which is
inherently not possible during recovery. So I don't feel too badly
about letting it just happen.
So I am changing pg_amcheck to filter out indexes when pg_is_in_recovery() is true and relpersistence='u'. Does that sound right to you?
Yes, that all sounds right to me.
Thanks
--
Peter Geoghegan
On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote:
All of the underlying errors are cases that were clearly intended to
catch user error -- every single one. But apparently pg_amcheck is
incapable of error, by definition. Like HAL 9000.
After some thought, I agree with the idea that pg_amcheck ought to
skip relations that can't be expected to be valid -- which includes
both unlogged relations while in recovery, and also invalid indexes
left behind by failed index builds. Otherwise it can only find
non-problems, which we don't want to do.
But this comment seems like mockery to me, and I don't like that.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Oct 5, 2021 at 9:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote:
All of the underlying errors are cases that were clearly intended to
catch user error -- every single one. But apparently pg_amcheck is
incapable of error, by definition. Like HAL 9000.
But this comment seems like mockery to me, and I don't like that.
It was certainly not a constructive way of getting my point across.
I apologize to Mark.
--
Peter Geoghegan
On Oct 5, 2021, at 9:58 AM, Peter Geoghegan <pg@bowt.ie> wrote:
I apologize to Mark.
I took no offense. Actually, I owe you a thank-you for having put so much effort into debating the behavior with me. I think the patch to fix bug #17212 will be better for it.
(And thanks to Robert for the concern.)
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company