Incorrect assumption in heap_prepare_freeze_tuple

Started by Kuntal Ghoshover 5 years ago7 messages
#1Kuntal Ghosh
kuntalghosh.2007@gmail.com
1 attachment(s)

Hello hackers,

In heap_prepare_freeze_tuple, we make the following assumption:

* It is assumed that the caller has checked the tuple with
* HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
* (else we should be removing the tuple, not freezing it).

Thus, when we see a committed xmax that precedes the cutoff_xid, we throw
the following data corruption error:
errmsg_internal("cannot freeze committed xmax %u", xid)

However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may
return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a
transaction older than OldestXmin. And, if the tuple is HOT-updated, it
should only be removed by a hot-chain prune operation. So, we treat the
tuple as RECENTLY_DEAD and don't remove the tuple.

So, it may lead to an incorrect data corruption error. IIUC, following will
be the exact scenario where the error may happen,

An updated/deleted tuple whose xamx is in between cutoff_xid and
OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and
autovacuum_freeze_max_age, it'll not be encountered easily. But, I think
it can be reproduced with some xid burner patch.

I think the fix should be something like following:
            if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
-               TransactionIdDidCommit(xid))
+               TransactionIdDidCommit(xid) &&
+               !HeapTupleHeaderIsHotUpdated(tuple))
                ereport(ERROR,
                        (errcode(ERRCODE_DATA_CORRUPTED),
                         errmsg_internal("cannot freeze committed xmax %u",
                                         xid)));
-           freeze_xmax = true;
+
+           freeze_xmax = HeapTupleHeaderIsHotUpdated(tuple) ? false : true;

Attached a patch for the same. Thoughts?

--
Thanks & Regards,
Kuntal Ghosh

Attachments:

0001-Fix-sanity-check-for-HOT-updated-tuple-when-freezing.patchapplication/octet-stream; name=0001-Fix-sanity-check-for-HOT-updated-tuple-when-freezing.patchDownload
From 1c7bf416cf445c8f65f8d511e6054e02308fa126 Mon Sep 17 00:00:00 2001
From: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Date: Fri, 2 Oct 2020 23:03:18 +0530
Subject: [PATCH] Fix sanity check for HOT-updated tuple when freezing tuples

---
 src/backend/access/heap/heapam.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 1585861a02..127e6a84c4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6106,7 +6106,9 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
  *
  * It is assumed that the caller has checked the tuple with
  * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
- * (else we should be removing the tuple, not freezing it).
+ * (else we should be removing the tuple, not freezing it).  But, there is an
+ * exception - if a tuple is HOT-updated then it must only be removed by a prune
+ * operation.  So, we can neither remove the tuple nor freeze the tuple.
  *
  * NB: cutoff_xid *must* be <= the current global xmin, to ensure that any
  * XID older than it could neither be running nor seen as running by any
@@ -6245,15 +6247,18 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 			 * If we freeze xmax, make absolutely sure that it's not an XID
 			 * that is important.  (Note, a lock-only xmax can be removed
 			 * independent of committedness, since a committed lock holder has
-			 * released the lock).
+			 * released the lock.  Also, if the tuple is HOT-UPDATED, we can
+			 * neither remove it nor freeze it).
 			 */
 			if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
-				TransactionIdDidCommit(xid))
+				TransactionIdDidCommit(xid) &&
+				!HeapTupleHeaderIsHotUpdated(tuple))
 				ereport(ERROR,
 						(errcode(ERRCODE_DATA_CORRUPTED),
 						 errmsg_internal("cannot freeze committed xmax %u",
 										 xid)));
-			freeze_xmax = true;
+
+			freeze_xmax = HeapTupleHeaderIsHotUpdated(tuple) ? false : true;
 		}
 		else
 			freeze_xmax = false;
-- 
2.27.0

#2Andres Freund
andres@anarazel.de
In reply to: Kuntal Ghosh (#1)
Re: Incorrect assumption in heap_prepare_freeze_tuple

Hi,

On 2020-10-02 23:26:05 +0530, Kuntal Ghosh wrote:

In heap_prepare_freeze_tuple, we make the following assumption:

* It is assumed that the caller has checked the tuple with
* HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
* (else we should be removing the tuple, not freezing it).

Thus, when we see a committed xmax that precedes the cutoff_xid, we throw
the following data corruption error:
errmsg_internal("cannot freeze committed xmax %u", xid)

However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may
return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a
transaction older than OldestXmin. And, if the tuple is HOT-updated, it
should only be removed by a hot-chain prune operation. So, we treat the
tuple as RECENTLY_DEAD and don't remove the tuple.

This code is so terrible :(

We really should just merge the HOT pruning and lazy_scan_heap()
removal/freeze operations. That'd avoid this corner case and *also*
would significantly reduce the WAL volume of VACUUM. And safe a good bit
of CPU time.

So, it may lead to an incorrect data corruption error. IIUC, following will
be the exact scenario where the error may happen,

An updated/deleted tuple whose xamx is in between cutoff_xid and
OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and
autovacuum_freeze_max_age, it'll not be encountered easily. But, I think
it can be reproduced with some xid burner patch.

I don't think this case is possible (*). By definition, there cannot be any
transactions needing tuples from before OldestXmin. Which means that the
heap_page_prune() earlier in lazy_scan_heap() would have pruned away a
DEAD tuple version that is part of a hot chain.

The HEAPTUPLE_DEAD branch you're referring to really can only be hit for
tuples that are *newer* than OldestXmin but become DEAD (instead of
RECENTLY_DEAD) because the inserting transaction aborted.

(*) with the exception of temp tables due to some recent changes, I am
currently working on a fix for that.

I think the fix should be something like following:
if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
-               TransactionIdDidCommit(xid))
+               TransactionIdDidCommit(xid) &&
+               !HeapTupleHeaderIsHotUpdated(tuple))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed xmax %u",
xid)));
-           freeze_xmax = true;
+
+           freeze_xmax = HeapTupleHeaderIsHotUpdated(tuple) ? false : true;

I don't think that would be correct - we'd end up with an xmax that's
older than cutoff_xid left in the table. Breaking relfrozenxid /
creating wraparound and clog lookup dangers. This branch is only entered
when xmax precedes cutoff_xid - which is what we may set relfrozenxid
to.

What made you look at this? Did you hit the error?

Greetings,

Andres Freund

#3Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Andres Freund (#2)
Re: Incorrect assumption in heap_prepare_freeze_tuple

On Sat, Oct 3, 2020 at 12:05 AM Andres Freund <andres@anarazel.de> wrote:
Thank you for your quick response and detailed explanation.

* It is assumed that the caller has checked the tuple with
* HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
* (else we should be removing the tuple, not freezing it).

Thus, when we see a committed xmax that precedes the cutoff_xid, we throw
the following data corruption error:
errmsg_internal("cannot freeze committed xmax %u", xid)

However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may
return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a
transaction older than OldestXmin. And, if the tuple is HOT-updated, it
should only be removed by a hot-chain prune operation. So, we treat the
tuple as RECENTLY_DEAD and don't remove the tuple.

This code is so terrible :(

We really should just merge the HOT pruning and lazy_scan_heap()
removal/freeze operations. That'd avoid this corner case and *also*
would significantly reduce the WAL volume of VACUUM. And safe a good bit
of CPU time.

+1

So, it may lead to an incorrect data corruption error. IIUC, following will
be the exact scenario where the error may happen,

An updated/deleted tuple whose xamx is in between cutoff_xid and
OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and
autovacuum_freeze_max_age, it'll not be encountered easily. But, I think
it can be reproduced with some xid burner patch.

I don't think this case is possible (*). By definition, there cannot be any
transactions needing tuples from before OldestXmin. Which means that the
heap_page_prune() earlier in lazy_scan_heap() would have pruned away a
DEAD tuple version that is part of a hot chain.

The HEAPTUPLE_DEAD branch you're referring to really can only be hit for
tuples that are *newer* than OldestXmin but become DEAD (instead of
RECENTLY_DEAD) because the inserting transaction aborted.

(*) with the exception of temp tables due to some recent changes, I am
currently working on a fix for that.

IIUC, there can be a HOT-updated tuple which is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
we call HeapTupleSatisfiesVacuum and xmax becomes older than
OldestXmin). So, there can be deleted tuples with xmax older than
OldestXmin. But, you're right that any tuple older than cutoff_xid
(since it was set earlier) will be pruned by heap_page_prune and hence
we shouldn't encounter the error. I'll study the rewrite_heap_tuple
path as well.

What made you look at this? Did you hit the error?

Nope, I haven't encountered the error. Just trying to understand the code. :-)

--
Thanks & Regards,
Kuntal Ghosh

#4Andres Freund
andres@anarazel.de
In reply to: Kuntal Ghosh (#3)
Re: Incorrect assumption in heap_prepare_freeze_tuple

Hi,

On 2020-10-03 12:58:01 +0530, Kuntal Ghosh wrote:

IIUC, there can be a HOT-updated tuple which is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
we call HeapTupleSatisfiesVacuum and xmax becomes older than
OldestXmin).

Hm? OldestXmin is constant over the course of vacuum, no?

Greetings,

Andres Freund

#5Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Andres Freund (#4)
Re: Incorrect assumption in heap_prepare_freeze_tuple

On Sat, Oct 3, 2020 at 1:06 PM Andres Freund <andres@anarazel.de> wrote:

On 2020-10-03 12:58:01 +0530, Kuntal Ghosh wrote:

IIUC, there can be a HOT-updated tuple which is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
we call HeapTupleSatisfiesVacuum and xmax becomes older than
OldestXmin).

Hm? OldestXmin is constant over the course of vacuum, no?

Yeah, it's constant. And, my understanding was completely wrong.

Actually, I misunderstood the following commit message:

commit dd69597988859c51131e0cbff3e30432db4259e1
Date: Thu May 2 10:07:13 2019 -0400

Fix some problems with VACUUM (INDEX_CLEANUP FALSE)

...
Change the logic for the case where a tuple is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum.
.....
I thought the reason is OldestXmin will be updated in between, but
that's just a stupid assumption I've made. :-(

I still need to understand the scenario that the above commit is
referring to. If those kind of tuples can't have committed xmax older
than OldestXmin/cutoff_xid, then we're good. Otherwise, these kind of
tuples can create problems if we're performing vacuum with index
cleanup disabled. Because, if index cleanup is disabled, we set
tupgone as false although the status of the tuple is HEAPTUPLE_DEAD
and try to freeze those tuple later.

You've also mentioned that HEAPTUPLE_DEAD case I'm referring to can
only be hit for for tuples that are *newer* than OldestXmin but become
DEAD (instead of RECENTLY_DEAD) because the inserting transaction
aborted. But, I don't see that's the only case when
HeapTupleSatisfiesVacuum returns HEAPTUPLE_DEAD. If
HeapTupleSatisfiesVacuumHorizon returns HEAPTUPLE_RECENTLY_DEAD and if
tuple xmax(dead_after) precedes OlestXmin, we set it as
HEAPTUPLE_DEAD.

res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);

if (res == HEAPTUPLE_RECENTLY_DEAD)
{
Assert(TransactionIdIsValid(dead_after));

if (TransactionIdPrecedes(dead_after, OldestXmin))
res = HEAPTUPLE_DEAD;
}
else
Assert(!TransactionIdIsValid(dead_after));

Am I missing something here?

--
Thanks & Regards,
Kuntal Ghosh

#6Andres Freund
andres@anarazel.de
In reply to: Kuntal Ghosh (#5)
Re: Incorrect assumption in heap_prepare_freeze_tuple

Hi,

On 2020-10-03 19:57:03 +0530, Kuntal Ghosh wrote:

You've also mentioned that HEAPTUPLE_DEAD case I'm referring to can
only be hit for for tuples that are *newer* than OldestXmin but become
DEAD (instead of RECENTLY_DEAD) because the inserting transaction
aborted. But, I don't see that's the only case when
HeapTupleSatisfiesVacuum returns HEAPTUPLE_DEAD. If
HeapTupleSatisfiesVacuumHorizon returns HEAPTUPLE_RECENTLY_DEAD and if
tuple xmax(dead_after) precedes OlestXmin, we set it as
HEAPTUPLE_DEAD.

res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);

if (res == HEAPTUPLE_RECENTLY_DEAD)
{
Assert(TransactionIdIsValid(dead_after));

if (TransactionIdPrecedes(dead_after, OldestXmin))
res = HEAPTUPLE_DEAD;
}
else
Assert(!TransactionIdIsValid(dead_after));

Am I missing something here?

To get to this point heap_page_prune() has to have been called for the
page. That removes all tuple [versions] that are DEAD. But not
RECENTLY_DEAD. But RECENTLY_DEAD can only happen for tuples that are
newere than OldestXmin. Thus the only tuples that the HTSV() we're
talking about can return DEAD for are ones that were RECENTLY_DEAD
in heap_page_prune().

Greetings,

Andres Freund

#7Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Andres Freund (#6)
Re: Incorrect assumption in heap_prepare_freeze_tuple

On Sun, Oct 4, 2020 at 12:33 AM Andres Freund <andres@anarazel.de> wrote:

To get to this point heap_page_prune() has to have been called for the
page. That removes all tuple [versions] that are DEAD. But not
RECENTLY_DEAD. But RECENTLY_DEAD can only happen for tuples that are
newere than OldestXmin. Thus the only tuples that the HTSV() we're
talking about can return DEAD for are ones that were RECENTLY_DEAD
in heap_page_prune().

Got it. Thank you for the explanations. :-)

--
Thanks & Regards,
Kuntal Ghosh