Wrong check in pg_visibility?

Started by Konstantin Knizhnikabout 5 years ago2 messages
#1Konstantin Knizhnik
k.knizhnik@postgrespro.ru

Hi hackers!

Due to the error in PG-ProEE we have added the following test to
pg_visibility:

create table vacuum_test as select 42 i;
vacuum vacuum_test;
select count(*) > 0 from pg_check_visible('vacuum_test');
drop table vacuum_test;

Sometime (very rarely) this test failed: pg_visibility reports
"corrupted" tuples.
The same error can be reproduced not only with PG-Pro but also with
vanilla REL_11_STABLE, REL_12_STABLE and REL_13_STABLE.
It is not reproduced with master after Andres snapshot optimization -
commit dc7420c2.

It is not so easy to reproduce this error: it is necessary to repeat
this tests many times.
Probability increased with more aggressive autovacuum settings.
But even with such settings and thousands of iterations I was not able
to reproduce this error at my notebook - only at virtual machine.

The error is reported here:

            /*
             * If we're checking whether the page is all-visible, we expect
             * the tuple to be all-visible.
             */
            if (check_visible &&
                !tuple_all_visible(&tuple, OldestXmin, buffer))
            {
                TransactionId RecomputedOldestXmin;

                /*
                 * Time has passed since we computed OldestXmin, so it's
                 * possible that this tuple is all-visible in reality even
                 * though it doesn't appear so based on our
                 * previously-computed value.  Let's compute a new
value so we
                 * can be certain whether there is a problem.
                 *
                 * From a concurrency point of view, it sort of sucks to
                 * retake ProcArrayLock here while we're holding the buffer
                 * exclusively locked, but it should be safe against
                 * deadlocks, because surely GetOldestXmin() should
never take
                 * a buffer lock. And this shouldn't happen often, so it's
                 * worth being careful so as to avoid false positives.
                 */
                RecomputedOldestXmin = GetOldestXmin(NULL,
PROCARRAY_FLAGS_VACUUM);

                if (!TransactionIdPrecedes(OldestXmin,
RecomputedOldestXmin))
                    record_corrupt_item(items, &tuple.t_self);

I debugger I have checked that OldestXmin = RecomputedOldestXmin =
tuple.t_data->xmin
I wonder if this check in pg_visibility is really correct and it can not
happen that OldestXmin=tuple.t_data->xmin?
Please notice that tuple_all_visible returns false if
!TransactionIdPrecedes(xmin, OldestXmin)

Thanks in advance,
Konstantin

#2Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Konstantin Knizhnik (#1)
Re: Wrong check in pg_visibility?

On 06.12.2020 23:50, Konstantin Knizhnik wrote:

Hi hackers!

Due to the error in PG-ProEE we have added the following test to
pg_visibility:

create table vacuum_test as select 42 i;
vacuum vacuum_test;
select count(*) > 0 from pg_check_visible('vacuum_test');
drop table vacuum_test;

Sometime (very rarely) this test failed: pg_visibility reports
"corrupted" tuples.
The same error can be reproduced not only with PG-Pro but also with
vanilla REL_11_STABLE, REL_12_STABLE and REL_13_STABLE.
It is not reproduced with master after Andres snapshot optimization -
commit dc7420c2.

It is not so easy to reproduce this error: it is necessary to repeat
this tests many times.
Probability increased with more aggressive autovacuum settings.
But even with such settings and thousands of iterations I was not able
to reproduce this error at my notebook - only at virtual machine.

The error is reported here:

            /*
             * If we're checking whether the page is all-visible, we
expect
             * the tuple to be all-visible.
             */
            if (check_visible &&
                !tuple_all_visible(&tuple, OldestXmin, buffer))
            {
                TransactionId RecomputedOldestXmin;

                /*
                 * Time has passed since we computed OldestXmin, so it's
                 * possible that this tuple is all-visible in reality
even
                 * though it doesn't appear so based on our
                 * previously-computed value.  Let's compute a new
value so we
                 * can be certain whether there is a problem.
                 *
                 * From a concurrency point of view, it sort of sucks to
                 * retake ProcArrayLock here while we're holding the
buffer
                 * exclusively locked, but it should be safe against
                 * deadlocks, because surely GetOldestXmin() should
never take
                 * a buffer lock. And this shouldn't happen often, so
it's
                 * worth being careful so as to avoid false positives.
                 */
                RecomputedOldestXmin = GetOldestXmin(NULL,
PROCARRAY_FLAGS_VACUUM);

                if (!TransactionIdPrecedes(OldestXmin,
RecomputedOldestXmin))
                    record_corrupt_item(items, &tuple.t_self);

I debugger I have checked that OldestXmin = RecomputedOldestXmin =
tuple.t_data->xmin
I wonder if this check in pg_visibility is really correct and it can
not happen that OldestXmin=tuple.t_data->xmin?
Please notice that tuple_all_visible returns false if
!TransactionIdPrecedes(xmin, OldestXmin)

I did more investigations and have to say that this check in
pg_visibility.c is really not correct.
The process which is keeping oldest xmin is autovacuum.
It should be ignored by GetOldestXmin because of PROCARRAY_FLAGS_VACUUM
flags, but it is not actually skipped
because PROC_IN_VACUUM flag is not set yet. There is yet another flag -
PROC_IS_AUTOVACUUM
which is always set in autovacuum, but it can not be passed to
GetOldestXmin? because is cleared by PROCARRAY_PROC_FLAGS_MASK.

If we just repeat RecomputedOldestXmin = GetOldestXmin(NULL,
PROCARRAY_FLAGS_VACUUM);
several times, then finally we will get right xmin.

I wonder if such check should be excluded from pg_visibility or made in
more correct way?
Because nothing in documentation of pg_check_visible says that it may
return false positives:

|pg_check_visible(relation regclass, t_ctid OUT tid) returns setof tid|

    Returns the TIDs of non-all-visible tuples stored in pages
marked all-visible in the visibility map. If this function returns a
non-empty set of TIDs, the visibility map is corrupt.

||
And comment to this function is even morefrightening:

/*
 * Return the TIDs of not-all-visible tuples in pages marked all-visible
 * in the visibility map.  We hope no one will ever find any, but there
could
 * be bugs, database corruption, etc.
 */

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company