Logical replication: lost updates/deletes and invalid log messages caused by SnapshotDirty + concurrent updates
Hello, everyone!
Initially this was discussed in ("[BUG?]
check_exclusion_or_unique_constraint false negative")[0]/messages/by-id/CANtu0oiktqQ2pwExoXqDpByXNCJa-KE5vQRodTRnmFHN_+qwHg@mail.gmail.com, but as Amit
recommended [1]/messages/by-id/CAA4eK1LZxzORgAoDhix9MWrOqYOsNZuZLW2sTfGsJFM99yRgrg@mail.gmail.com I decided to write a dedicated email about it, since
the original thread is too deep and has some unrelated discussion.
Commitfest entry [2]https://commitfest.postgresql.org/patch/5151/ contains reproducers for most of the issues
described below - both injection_point-based and pgbench-based (to
ensure such things may happen in real situations).
------
Source of the problem: any scan using SnapshotDirty/SnapshotSelf may
miss some logical tuple (as if it does not exist) in case of a
parallel update.
It happens like this:
* A SnapshotDirty heap scan starts
* Some page X is already processed by the scan, but page Y is not yet
* A logical row is updated by a concurrent transaction: the new
version is placed into page X (already visited), the old version
located in page Y (not-yet-visited) is marked as deleted
* Our scan finally visits page Y and skips the tuple because of the
xmax of the committed transaction
* Both the first and the second version of the tuple are skipped
So, in the MVCC view, the row exists all the time (just in different
versions), but from the perspective of SnapshotDirty it does not exist
at all at that moment.
The same may happen in a BTREE scan, even in the case of a single
index page. This is because the btree scan copies all TIDs from an
index page before starting to process them.
If, after taking such a copy, a new tuple version TID is inserted into
that page, the scan will not see it. But the "cached" TID may be
marked as deleted by a committed transaction, resulting in zero tuples
being detected.
------
Minor consequence: check_exclusion_or_unique_constraint efficiency
It may not find a record in a UNIQUE index during INSERT ON CONFLICT
UPDATE. However, this is just a minor performance issue - retry logic
takes care of it anyway.
------
Major consequence: weird behavior of logical replication in case of
parallel updates on the subscriber.
Caused by the fact that RelationFindReplTupleByIndex (or less likely
RelationFindReplTupleSeq) may miss a target row to perform an action
on, resulting in a skipped action.
---case 1: LOST DELETE---
Scenario: race of DELETE on publisher with UPDATE on subscriber
Setup:
On publisher:
CREATE TABLE t(a int PRIMARY key, data text);
INSERT INTO t(a, data) VALUES (1,'value_from_publisher');
On subscriber:
CREATE TABLE t(a int PRIMARY key, data text);
CREATE INDEX data_index ON t(data); -- index is required
to prevent HOT
Commands:
On publisher:
DELETE FROM t WHERE a=1;
On subscriber:
UPDATE t SET data = 'value_from_subscriber' WHERE (a=1);
Results:
Expected:
Tuple is deleted on both subscriber and publisher.
Both DELETE-before-UPDATE and UPDATE-before-DELETE result in
the absence of the row in the end.
Actual:
Either as expected, or:
Tuple is deleted on publisher, but 'value_from_subscriber'
remains on subscriber.
---case 2: LOST UPDATE---
Scenario: UPDATE on publisher vs subscriber-columns only UPDATE on subscriber
Setup:
On publisher:
CREATE TABLE t(a int PRIMARY key, data text);
INSERT INTO t(a, data) VALUES (1,'initial_data');
On subscriber:
-- note additional subscriber-only column
CREATE TABLE t(a int PRIMARY key, data text, sub_only
varchar(40) DEFAULT 'default_data');
CREATE INDEX sub_only_index ON t(sub_only);
Commands:
On publisher:
UPDATE t SET data = 'update_from_publisher' WHERE (a=1);
On subscriber:
UPDATE t SET sub_only = 'update_from_subscriber' WHERE (a=1);
Results:
Expected:
On subscriber: tuple (a=1, data='update_from_publisher',
sub_only='update_from_subscriber').
Update on subscriber only affects the subscriber-only column,
so I expect publisher-driven columns are replicated.
Actual:
Either as expected, or:
Publisher update is lost, leaving (a=1, data='initial_data',
sub_only='update_from_subscriber') on subscriber.
------
Normal consequence:
For the same reasons, logical replication may provide invalid log
messages (and stats) about conflicts:
* correct: update_origin_differs, in logs: update_missing
* correct: update_origin_differs, in logs: update_deleted (providing
invalid deleted tuple data)
* correct: delete_origin_differs, in logs: delete_missing
This is not a critical issue, but invalid logs are something no one
wants to deal with, especially logs made to debug conflicts.
Also, in case of any conflict-resolution mechanics in the future, it
may be a critical flaw.
------
Proposed fix:
The patch in [2]https://commitfest.postgresql.org/patch/5151/ addresses the issue by rewriting
RelationFindReplTupleByIndex/RelationFindReplTupleSeq to use
GetLatestSnapshot for each attempt to find the target row.
Since it calls GetLatestSnapshot before table_tuple_lock anyway, no
performance regression is expected.
As a result:
* MVCC scan will not miss updated tuples, while DirtyScan may
* MVCC scan cannot see not-yet-committed new rows, while DirtyScan
can, however in both variants, table_tuple_lock will wait for the
updating transaction to commit before retrying the whole scan
In any case I think we should at least document such behavior.
Best regards,
Mikhail.
[0]: /messages/by-id/CANtu0oiktqQ2pwExoXqDpByXNCJa-KE5vQRodTRnmFHN_+qwHg@mail.gmail.com
[1]: /messages/by-id/CAA4eK1LZxzORgAoDhix9MWrOqYOsNZuZLW2sTfGsJFM99yRgrg@mail.gmail.com
[2]: https://commitfest.postgresql.org/patch/5151/
Hello, everyone!
Andres, added you in CC because I am referring to your words in
personal conversation.
Amit, I asked Andres about that situation and Andres said it is a bug
if we execute DirtySnapshot without holding a real one.
And this is how it is done in RelationFindReplTupleByIndex and
RelationFindReplTupleSeq now.
Best regards,
Mikhail.
Hello!
Issue is known for almost a year, so I propose to document it at least.
This is a draft of a documentation update.
Best regards,
Mikhail.
Attachments:
v1-0002-update-for-pg19.patchtext/plain; charset=US-ASCII; name=v1-0002-update-for-pg19.patchDownload
From 08f4ccf2b5403b84b15b14ddb26b529cb3c11c27 Mon Sep 17 00:00:00 2001
From: Mikhail Nikalayeu <mihailnikalayeu@gmail.com>
Date: Thu, 13 Nov 2025 03:11:11 +0100
Subject: [PATCH v1 2/2] update for pg19:
documentation update to reflect issue described in https://commitfest.postgresql.org/patch/5151/ and https://www.postgresql.org/message-id/flat/CADzfLwXZVmbo11tFS_G2i+6TfFVwHU4VUUSeoqb+8UQfuoJs8A@mail.gmail.com
---
doc/src/sgml/logical-replication.sgml | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index e20c92d15f0..4c6ee9e9229 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2236,7 +2236,8 @@ DETAIL: <replaceable class="parameter">detailed_explanation</replaceable>.
Due to some restrictions in implementation of conflict detection,
logical replication may report incorrect conflict types in the server
log and statistics. An <literal>update_origin_differs</literal> conflict
- may be incorrectly reported as <literal>update_missing</literal>, and a
+ may be incorrectly reported as <literal>update_missing</literal> or
+ <literal>update_deleted</literal, and a
<literal>delete_origin_differs</literal> conflict may be incorrectly
reported as <literal>delete_missing</literal>.
In some cases such race condition may cause inconsistent behaviour, see
@@ -2448,8 +2449,8 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<para>
An <command>UPDATE</command> from the publisher not being applied, so
the publisher's changes are lost while the subscriber's local changes
- remain. In this case an <literal>update_missing</literal> conflict is
- logged instead of <literal>update_origin_differs</literal>. This can
+ remain. In this case an <literal>update_missing</literal> or <update_deleted>
+ conflict is logged instead of <literal>update_origin_differs</literal>. This can
occur even when the subscriber's <command>UPDATE</command> modifies only
columns that do not exist on the publisher, which might otherwise appear
to be a safe operation.
--
2.43.0
v1-0001-documentation-update-to-reflect-issue-described-i.patchtext/plain; charset=US-ASCII; name=v1-0001-documentation-update-to-reflect-issue-described-i.patchDownload
From befd4a21284166547da12f811876a1342114098e Mon Sep 17 00:00:00 2001
From: Mikhail Nikalayeu <mihailnikalayeu@gmail.com>
Date: Thu, 13 Nov 2025 03:07:17 +0100
Subject: [PATCH v1 1/2] documentation update to reflect issue described in
https://commitfest.postgresql.org/patch/5151/ and
https://www.postgresql.org/message-id/flat/CADzfLwXZVmbo11tFS_G2i+6TfFVwHU4VUUSeoqb+8UQfuoJs8A@mail.gmail.com
---
doc/src/sgml/logical-replication.sgml | 50 +++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 79ecd09614f..e20c92d15f0 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -93,6 +93,8 @@
application, there will be no conflicts from a single subscription. On the
other hand, if there are other writes done either by an application or by other
subscribers to the same set of tables, conflicts can arise.
+ See <xref linkend="logical-replication-conflicts"/> and
+ <xref linkend="logical-replication-restrictions"/>.
</para>
<sect1 id="logical-replication-publication">
@@ -2229,6 +2231,19 @@ DETAIL: <replaceable class="parameter">detailed_explanation</replaceable>.
</variablelist>
</para>
+ <warning>
+ <para>
+ Due to some restrictions in implementation of conflict detection,
+ logical replication may report incorrect conflict types in the server
+ log and statistics. An <literal>update_origin_differs</literal> conflict
+ may be incorrectly reported as <literal>update_missing</literal>, and a
+ <literal>delete_origin_differs</literal> conflict may be incorrectly
+ reported as <literal>delete_missing</literal>.
+ In some cases such race condition may cause inconsistent behaviour, see
+ <xref linkend="logical-replication-restrictions"/> for details.
+ </para>
+ </warning>
+
<para>
Logical replication operations are performed with the privileges of the role
which owns the subscription. Permissions failures on target tables will
@@ -2409,6 +2424,41 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
key or replica identity defined for it.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ Due to a restriction of the current conflict detection logic, concurrent
+ writes on the subscriber can lead to data inconsistencies. While applying
+ a <command>DELETE</command> or <command>UPDATE</command> from the publisher,
+ a concurrent <command>UPDATE</command> on the subscriber can cause the
+ logical replication worker to fail to find the target row, so the publisher's
+ change is skipped and an incorrect conflict type is reported.
+ This can result in:
+ <itemizedlist>
+ <listitem>
+ <para>
+ A <command>DELETE</command> from the publisher not being applied, leaving
+ the row on the subscriber with locally modified values. In this case a
+ <literal>delete_missing</literal> conflict is logged even though the row
+ still exists on the subscriber, whereas <literal>delete_origin_differs</literal>
+ would be expected.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ An <command>UPDATE</command> from the publisher not being applied, so
+ the publisher's changes are lost while the subscriber's local changes
+ remain. In this case an <literal>update_missing</literal> conflict is
+ logged instead of <literal>update_origin_differs</literal>. This can
+ occur even when the subscriber's <command>UPDATE</command> modifies only
+ columns that do not exist on the publisher, which might otherwise appear
+ to be a safe operation.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </listitem>
+
</itemizedlist>
</sect1>
--
2.43.0