BUG #17959: amcheck fails to find a matching index tuple for an invisible heap tuple

Started by PG Bug reporting formalmost 3 years ago5 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17959
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16beta1
Operating system: Ubuntu 22.04
Description:

After some DDL/transactional operations (a reproducer to follow)
pg_amcheck detects an anomaly:
btree index "regress001.pg_catalog.pg_depend_reference_index":
ERROR: heap tuple (13,35) from table "pg_depend" lacks matching index
tuple within index "pg_depend_reference_index"

The corresponding table heap page contains:
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid |
t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |
t_attrs |
raw_flags |
combined_flags
35 | 6232 | 1 | 49 | 734 | 736 | 2 | (13,35) |
8199 | 1280 | 24 | | |
{"\\xeb040000","\\x38400000","\\x00000000","\\x370a0000","\\x36400000","\\x00000000","\\x6e"}
| {HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_KEYS_UPDATED}
| {}

pg_depend_reference_index contains:
itemoffset | ctid | itemlen | nulls | vars | data
| dead | htid |
157 | (13,35) | 24 | f | f | 37 0a 00 00 36 40 00 00
00 00 00 00 00 00 00 00 | t | (13,35) |

SELECT ctid, * FROM pg_depend WHERE refclassid = 0x0a37 AND refobjid =
0x4036 AND refobjsubid = 0
doesn't return any rows.

Shouldn't amcheck ignore invisible tuples?

#2Alexander Lakhin
exclusion@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17959: amcheck fails to find a matching index tuple for an invisible heap tuple

03.06.2023 22:00, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17959
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16beta1
Operating system: Ubuntu 22.04
Description:

After some DDL/transactional operations (a reproducer to follow)
pg_amcheck detects an anomaly:
btree index "regress001.pg_catalog.pg_depend_reference_index":
ERROR: heap tuple (13,35) from table "pg_depend" lacks matching index
tuple within index "pg_depend_reference_index"

The operations that trigger that anomaly are as follows:
my $bsession1 = $node->background_psql('regress001');
$bsession1->query_safe("create temp table t1(a int)");
my $bsession2 = $node->background_psql('regress002');
$bsession2->query_safe("begin transaction");
$bsession2->query_safe("prepare transaction 'pt1'");
$bsession1->quit;
my $bsession3 = $node->background_psql('regress001');
$bsession3->query_safe("create temp table t1(a int)");
$bsession3->query_safe("vacuum t1");
$bsession2->quit;
$bsession3->quit;

A complete TAP test to reproduce the issue is attached.
I put it in src/bin/pg_amcheck/t/, run (on master, e6a254c0d)
PROVE_TESTS=t/099_pdri_error.pl make -s check -C src/bin/pg_amcheck/
and get:

#   Failed test 'pg_amcheck after manipulations stdout /(?^:^$)/'
#   at t/099_pdri_error.pl line 60.
#                   'btree index "regress001.pg_catalog.pg_depend_reference_index":
#     ERROR:  heap tuple (13,35) from table "pg_depend" lacks matching index tuple within index "pg_depend_reference_index"
# '
#     doesn't match '(?^:^$)'

Best regards,
Alexander

Attachments:

099_pdri_error.plapplication/x-perl; name=099_pdri_error.plDownload
In reply to: PG Bug reporting form (#1)
Re: BUG #17959: amcheck fails to find a matching index tuple for an invisible heap tuple

On Mon, Jun 5, 2023 at 12:29 AM PG Bug reporting form
<noreply@postgresql.org> wrote:

SELECT ctid, * FROM pg_depend WHERE refclassid = 0x0a37 AND refobjid =
0x4036 AND refobjsubid = 0
doesn't return any rows.

Shouldn't amcheck ignore invisible tuples?

It should -- so there must be a bug. This is a system catalog index,
so I wonder if this issue is in any way related to this known issue:

/messages/by-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0=6J8K1FPSC9K0w@mail.gmail.com

(I've been meaning to get around to finally fixing it.)

Admittedly this is a fairly wild guess -- the details don't really
match. Even still, the fact that this is a system catalog index seems
very unlikely to be incidental to the problem. There are some
significant differences between how system indexes and other indexes
are built in heapam_index_build_range_scan(). Those differences seem
like they could easily be relevant.

--
Peter Geoghegan

In reply to: Alexander Lakhin (#2)
Re: BUG #17959: amcheck fails to find a matching index tuple for an invisible heap tuple

On Mon, Jun 5, 2023 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:

A complete TAP test to reproduce the issue is attached.
I put it in src/bin/pg_amcheck/t/, run (on master, e6a254c0d)
PROVE_TESTS=t/099_pdri_error.pl make -s check -C src/bin/pg_amcheck/
and get:

# Failed test 'pg_amcheck after manipulations stdout /(?^:^$)/'
# at t/099_pdri_error.pl line 60.
# 'btree index "regress001.pg_catalog.pg_depend_reference_index":
# ERROR: heap tuple (13,35) from table "pg_depend" lacks matching index tuple within index "pg_depend_reference_index"
# '
# doesn't match '(?^:^$)'

I can easily reproduce this result using your test case.

I notice that the test case will pass if I remove your use of the
"--rootdescend" option from your tap test script. This suggests that
the problem is somehow limited to heapallindexed verification when run
through the bt_index_parent_check() interface -- bt_index_check()
heapallindexed verification seems unaffected. The former works rather
like a CREATE INDEX internally (and so can just use SnapshotAny),
while the latter works more like CREATE INDEX CONCURRENTLY (and so
must use an MVCC snapshot).

--
Peter Geoghegan

#5Alexander Lakhin
exclusion@gmail.com
In reply to: Peter Geoghegan (#4)
Re: BUG #17959: amcheck fails to find a matching index tuple for an invisible heap tuple

Hello Peter,

Thanks for looking into this!

05.06.2023 19:27, Peter Geoghegan wrote:

On Mon, Jun 5, 2023 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:

A complete TAP test to reproduce the issue is attached.
I put it in src/bin/pg_amcheck/t/, run (on master, e6a254c0d)
PROVE_TESTS=t/099_pdri_error.pl make -s check -C src/bin/pg_amcheck/
and get:

# Failed test 'pg_amcheck after manipulations stdout /(?^:^$)/'
# at t/099_pdri_error.pl line 60.
# 'btree index "regress001.pg_catalog.pg_depend_reference_index":
# ERROR: heap tuple (13,35) from table "pg_depend" lacks matching index tuple within index "pg_depend_reference_index"
# '
# doesn't match '(?^:^$)'

I can easily reproduce this result using your test case.

I notice that the test case will pass if I remove your use of the
"--rootdescend" option from your tap test script. This suggests that
the problem is somehow limited to heapallindexed verification when run
through the bt_index_parent_check() interface -- bt_index_check()
heapallindexed verification seems unaffected. The former works rather
like a CREATE INDEX internally (and so can just use SnapshotAny),
while the latter works more like CREATE INDEX CONCURRENTLY (and so
must use an MVCC snapshot).

Yes, I can confirm that bt_index_parent_check() calls
bt_check_every_level(... readonly = true ...) and in this case snapshot = SnapshotAny
is used.
SELECT * FROM bt_index_parent_check('pg_catalog.pg_depend_reference_index'::regclass, true, false)
gives the same error (and it looks like the parameter rootdescend of
bt_index_parent_check() doesn't affect this).

BTW, with the DEBUG2 log level I get a message:
verifying that tuples from index "pg_depend_reference_index" are present in "pg_depend"
but doesn't the verification work the other way?

Best regards,
Alexander