pgsql: Prevent tuples to be marked as dead in subtransactions on standb
Prevent tuples to be marked as dead in subtransactions on standbys
Dead tuples are ignored and are not marked as dead during recovery, as
it can lead to MVCC issues on a standby because its xmin may not match
with the primary. This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.
Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent. This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.
This is broken since the introduction of hot standby in efc16ea52067, so
backpatch all the way down.
Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: /messages/by-id/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12
Branch
------
REL_15_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/f5d8f59cae3e0befda51a97cc3715fa6caf7105d
Modified Files
--------------
src/backend/access/transam/xact.c | 1 +
1 file changed, 1 insertion(+)
On Tue, Dec 12, 2023 at 11:06 AM Michael Paquier <michael@paquier.xyz> wrote:
Prevent tuples to be marked as dead in subtransactions on standbys
I don't think this is a good commit message. It's totally unclear what
it means, and when I opened up the diff to try to see what was
changed, it looked nothing like what I expected.
I think a better message would have been something like
"startedInRecovery flag must be propagated to subtransactions". And I
think there should have been some analysis in the commit message or
the comments within the commit itself of whether it was intended that
this be propagated to subtransactions or not. It's hard to understand
why the flag would have been placed in the TransactionState if it
applied globally to the transaction and all subtransactions, but maybe
that's what happened.
Instead of discussing that issue, your commit message focuses in the
user-visible consequences, but in a sort of baffling way. The
statement that "Dead tuples are ignored and are not marked as dead
during recovery," for example, is clearly false on its face. If
recovery didn't mark dead tuples as dead, it would be completely
broken.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Dec 19, 2023 at 03:15:42PM -0500, Robert Haas wrote:
I don't think this is a good commit message. It's totally unclear what
it means, and when I opened up the diff to try to see what was
changed, it looked nothing like what I expected.
(Not my intention to ignore you here, my head has been underwater for
some time.)
I think a better message would have been something like
"startedInRecovery flag must be propagated to subtransactions".
You are right, sorry about that. Something like "Propagate properly
startedInRecovery in subtransactions started during recovery" would
have been better than what I used.
And I
think there should have been some analysis in the commit message or
the comments within the commit itself of whether it was intended that
this be propagated to subtransactions or not. It's hard to understand
why the flag would have been placed in the TransactionState if it
applied globally to the transaction and all subtransactions, but maybe
that's what happened.
Alvaro has mentioned something like that on the original thread where
we could use comments when a transaction state is pushed to a
subtransaction to track better the fields used and/or not used.
Documenting more all that at the top of TransactionStateData is
something we should do.
Instead of discussing that issue, your commit message focuses in the
user-visible consequences, but in a sort of baffling way. The
statement that "Dead tuples are ignored and are not marked as dead
during recovery," for example, is clearly false on its face. If
recovery didn't mark dead tuples as dead, it would be completely
broken.
Rather than "dead" tuples, I implied "killed" tuples in this sentence.
Killed tuples are ignored during recovery.
--
Michael