Simplify code building the LR conflict messages
While reviewing another conflict-related thread, I noticed that the
existing conflict messages are currently getting built using
unexpected logic.
e.g
----------
if (tuple_value.len > 0)
{
appendStringInfoString(&tuple_value, "; ");
appendStringInfo(&tuple_value, _("existing local row %s"),
desc);
}
else
{
appendStringInfo(&tuple_value, _("Existing local row %s"),
desc);
}
----------
I couldn't think of a reason why the "; " string needed to be
separated from the rest of the message like that. And when you combine
the strings, the logic easily collapses into a single statement with
less code and greater readability.
There were several code fragments like this. Here is a patch to modify them.
Thoughts?
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v1-0001-Simplify-the-code-building-the-conflict-messages.patchapplication/octet-stream; name=v1-0001-Simplify-the-code-building-the-conflict-messages.patchDownload+8-32
Peter Smith <smithpb2250@gmail.com> writes:
I couldn't think of a reason why the "; " string needed to be
separated from the rest of the message like that. And when you combine
the strings, the logic easily collapses into a single statement with
less code and greater readability.
... and, probably, less ability of the compiler to verify that the
variadic arguments match the format string. I think you've taken
this a bit too far.
regards, tom lane
On Fri, Nov 28, 2025 at 1:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Smith <smithpb2250@gmail.com> writes:
I couldn't think of a reason why the "; " string needed to be
separated from the rest of the message like that. And when you combine
the strings, the logic easily collapses into a single statement with
less code and greater readability.... and, probably, less ability of the compiler to verify that the
variadic arguments match the format string. I think you've taken
this a bit too far.
Hi Tom,
Thank you for the feedback. Could you please clarify which aspect is
of concern so I can address it properly:
* Is the concern about having a StringInfo format string that starts
with a semicolon? If so, I noticed there are already multiple similar
examples in the codebase (see: `grep -r . -e 'StringInfo(.*";.*%'`),
so I understood this pattern to be acceptable.
* Or is it the use of the ternary operator to select the format
string? If that's the issue, please note that my patch only introduced
the ternary operator for the first two code fragments. The third
fragment already uses ternaries in the same way on master, so I
understood that to be an established pattern as well.
I'd like to make sure I understand your concern correctly so I can
revise the patch appropriately.
======
Kind Regards,
Peter Smith
Fujitsu Australia
Peter Smith <smithpb2250@gmail.com> writes:
On Fri, Nov 28, 2025 at 1:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
... and, probably, less ability of the compiler to verify that the
variadic arguments match the format string. I think you've taken
this a bit too far.
* Or is it the use of the ternary operator to select the format
string? If that's the issue, please note that my patch only introduced
the ternary operator for the first two code fragments. The third
fragment already uses ternaries in the same way on master, so I
understood that to be an established pattern as well.
Sadly, you were copying bad code that we need to fix.
I'd like to make sure I understand your concern correctly so I can
revise the patch appropriately.
My concern is that we follow a coding style that the compiler can
check. Most C compilers can only verify that printf-like format
strings match the actual arguments if the format string is a constant.
So if I fat-finger the format string in this example:
--- a/src/backend/replication/logical/conflict.c
+++ b/src/backend/replication/logical/conflict.c
@@ -383,7 +383,7 @@ build_tuple_value_details(EState *estate, ResultRelInfo *relinfo,
if (tuple_value.len > 0)
{
appendStringInfoString(&tuple_value, "; ");
- appendStringInfo(&tuple_value, _("existing local row %s"),
+ appendStringInfo(&tuple_value, _("existing local row %d"),
desc);
}
I'll hear about it:
conflict.c: In function 'build_tuple_value_details':
conflict.c:386:38: warning: format '%d' expects argument of type 'int', but argument 3 has type 'char *' [-Wformat=]
appendStringInfo(&tuple_value, _("existing local row %d"),
^~~~~~~~~~~~~~~~~~~~~~~
conflict.c:386:36: note: in expansion of macro '_'
appendStringInfo(&tuple_value, _("existing local row %d"),
^
But I don't trust the compiler to see through a ternary expression
and check (both!) format strings against the actuals.
In a quick test, the gcc version I have handy does seem to be able to
do that, but I don't think we should rely on that. Format strings
are something that is way too easy to get wrong, and most of us just
expect the compiler to cross-check them for us, so coding patterns
that might silently disable such compiler warnings are best avoided.
(There's also some magic going on here to allow the compiler to see
through gettext(), but it's quite magic. We don't need to assume
multiple layers of magic will work reliably.)
regards, tom lane
On Fri, Nov 28, 2025 at 3:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Smith <smithpb2250@gmail.com> writes:
On Fri, Nov 28, 2025 at 1:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
... and, probably, less ability of the compiler to verify that the
variadic arguments match the format string. I think you've taken
this a bit too far.* Or is it the use of the ternary operator to select the format
string? If that's the issue, please note that my patch only introduced
the ternary operator for the first two code fragments. The third
fragment already uses ternaries in the same way on master, so I
understood that to be an established pattern as well.Sadly, you were copying bad code that we need to fix.
I'd like to make sure I understand your concern correctly so I can
revise the patch appropriately.My concern is that we follow a coding style that the compiler can
check. Most C compilers can only verify that printf-like format
strings match the actual arguments if the format string is a constant.
So if I fat-finger the format string in this example:--- a/src/backend/replication/logical/conflict.c +++ b/src/backend/replication/logical/conflict.c @@ -383,7 +383,7 @@ build_tuple_value_details(EState *estate, ResultRelInfo *relinfo, if (tuple_value.len > 0) { appendStringInfoString(&tuple_value, "; "); - appendStringInfo(&tuple_value, _("existing local row %s"), + appendStringInfo(&tuple_value, _("existing local row %d"), desc); }I'll hear about it:
conflict.c: In function 'build_tuple_value_details':
conflict.c:386:38: warning: format '%d' expects argument of type 'int', but argument 3 has type 'char *' [-Wformat=]
appendStringInfo(&tuple_value, _("existing local row %d"),
^~~~~~~~~~~~~~~~~~~~~~~
conflict.c:386:36: note: in expansion of macro '_'
appendStringInfo(&tuple_value, _("existing local row %d"),
^But I don't trust the compiler to see through a ternary expression
and check (both!) format strings against the actuals.In a quick test, the gcc version I have handy does seem to be able to
do that, but I don't think we should rely on that. Format strings
are something that is way too easy to get wrong, and most of us just
expect the compiler to cross-check them for us, so coding patterns
that might silently disable such compiler warnings are best avoided.(There's also some magic going on here to allow the compiler to see
through gettext(), but it's quite magic. We don't need to assume
multiple layers of magic will work reliably.)
Got it. Thanks for the detailed reason.
Here is a revised patch v2 that removes all ternaries from the format
string decision making (including the ones that were already in
master).
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v2-0001-Simplify-format-strings-and-remove-ternaries.patchapplication/octet-stream; name=v2-0001-Simplify-format-strings-and-remove-ternaries.patchDownload+14-20
Hello,
So, what we're doing here is to append further row-identifying details
to an errdetail string that already contains some explanation of the
problem. That is, we have something like
DETAIL: The row to be updated was deleted.
and then we add whatever this function produces, after a newline. I
don't remember any other place in our code that does such a thing. The
result would look something like
DETAIL: The row to be updated was deleted.
Key: (1, 2); remote row (1,2,3); replica identity full (1,2,3).
or something like that, where each of the parts can begin with uppercase
or not, with a semicolon or not, depending on whether there are any
previous parts.
I wonder if it would make more sense to move this identifying
information to a CONTEXT line instead, where we won't have to care about
the uppercase.
We still have to worry about the semicolon though. Maybe we should make
that explicit in the string but given control to the translator by doing
something like
/* translator: first %s is either empty or a translated list
separator, second %s is a row descriptor */
appendStringInfo(&buf, "%sremote row %s", empty ? "" :
/* translator: this is used as a list separator */
_("; "), remote_row_description);
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Debido a que la velocidad de la luz es mucho mayor que la del sonido,
algunas personas nos parecen brillantes un minuto antes
de escuchar las pelotudeces que dicen." (Roberto Fontanarrosa)
On Fri, Nov 28, 2025 at 6:38 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
So, what we're doing here is to append further row-identifying details
to an errdetail string that already contains some explanation of the
problem. That is, we have something likeDETAIL: The row to be updated was deleted.
and then we add whatever this function produces, after a newline. I
don't remember any other place in our code that does such a thing. The
result would look something likeDETAIL: The row to be updated was deleted.
Key: (1, 2); remote row (1,2,3); replica identity full (1,2,3).or something like that, where each of the parts can begin with uppercase
or not, with a semicolon or not, depending on whether there are any
previous parts.I wonder if it would make more sense to move this identifying
information to a CONTEXT line instead, where we won't have to care about
the uppercase.
As per my understanding, we typically display the information in
context via errcontext() in error context callback functions which may
not be immediately available at the location where error happens. For
example, while applying the remote changes, the error can happen
anywhere in the heap or below layer, but we add additional apply
related information (like which apply_message was being processed when
error occurred) via apply_error_callback. So, it is not clear in the
case at hand whether we should display the information which is
available at error_site via context message.
--
With Regards,
Amit Kapila.
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
So, what we're doing here is to append further row-identifying details
to an errdetail string that already contains some explanation of the
problem. That is, we have something like
DETAIL: The row to be updated was deleted.
and then we add whatever this function produces, after a newline. I
don't remember any other place in our code that does such a thing. The
result would look something like
DETAIL: The row to be updated was deleted.
Key: (1, 2); remote row (1,2,3); replica identity full (1,2,3).
or something like that, where each of the parts can begin with uppercase
or not, with a semicolon or not, depending on whether there are any
previous parts.
I looked into the postmaster log for subscription/t/035_conflicts.pl
to find actual examples of what this code produces, and now I think
it's got worse problems than whether the C code style is good.
The output is frankly a mess, and IMO it violates our message style
guidelines in multiple ways. Here's a sampling:
2025-11-29 11:43:03.782 EST logical replication apply worker[659868] ERROR: conflict detected on relation "public.conf_tab": conflict=multiple_unique_conflicts
2025-11-29 11:43:03.782 EST logical replication apply worker[659868] DETAIL: Key already exists in unique index "conf_tab_pkey", modified in transaction 770.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified in transaction 770.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_c_key", modified in transaction 770.
Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
2025-11-29 11:43:04.654 EST logical replication apply worker[659907] LOG: conflict detected on relation "public.conf_tab": conflict=update_missing
2025-11-29 11:43:04.654 EST logical replication apply worker[659907] DETAIL: Could not find the row to be updated.
Remote row (6, 7, 8); replica identity (a)=(5).
2025-11-29 11:43:05.266 EST logical replication apply worker[659942] LOG: conflict detected on relation "public.tab": conflict=delete_origin_differs
2025-11-29 11:43:05.266 EST logical replication apply worker[659942] DETAIL: Deleting the row that was modified locally in transaction 790 at 2025-11-29 11:43:05.262807-05.
Existing local row (1, 3); replica identity (a)=(1).
Where to begin?
* "conflict detected" is already presuming way too much context for
someone who's looking into the postmaster log. Here we can guess
what it's about because somebody configured this test with %b in
log_line_prefix. But without the "logical replication apply worker"
cue, are users really going to know what these messages are about?
* The business about "conflict=some_kind_of_keyword" is certainly not
per our style guidelines, and I wonder how it translates. (Checks
code ... looks like we don't even try.)
* The DETAIL messages are not complete sentences or even
approximations to them, and frankly they are just about unintelligible.
* I follow "Key already exists in unique index "foo" but not "modified
in transaction 770" --- what was modified, which side's XID is this,
and why do we think an XID will be of any help whatever? The run-on
clause is very poor English in any case. "Deleting the row that was
modified locally in transaction 790 at 2025-11-29 11:43:05.262807-05."
is better in that a transaction timestamp is much more likely to be
useful than an XID, and it's marginally better English, but it's still
not a complete sentence and it reads like context not detail.
* What am I to make of this:
"Key (a)=(55); existing local row (55, 2, 3); remote row (55, 2, 3)."
From context I guess it's trying to show me the entire rows not just
the duplicated key value, but they're not labeled adequately.
Besides, I wonder how well this output scales to non-toy tables where
the rows are kilobytes or megabytes long. Index keys can be safely
assumed to not be enormously long, but it doesn't follow that it's
sane to emit the entire row. Much less several of them in one
error message.
* The three examples I show above look like the detail messages were
written by three different people who were not talking to each other.
The initial kind-of-a-sentence part is totally different style in
each case. And why do some of the outputs put the labeled key first
and others put it last?
So I think this area desperately needs significant editorial
attention, as well as some fundamental rethinking of just what
information we should show. Perhaps using errcontext would help,
but I'm not sure. I think a large part of the problem stems from
trying to cram multiple error conditions into one ereport ... is it
possible to avoid that?
regards, tom lane
I wrote:
So I think this area desperately needs significant editorial
attention, as well as some fundamental rethinking of just what
information we should show. Perhaps using errcontext would help,
but I'm not sure. I think a large part of the problem stems from
trying to cram multiple error conditions into one ereport ... is it
possible to avoid that?
After a little bit of thought, here's a sketch of a straw-man idea.
1. The longstanding output for unique constraint violations is like
ERROR: duplicate key value violates unique constraint "foo_f1_f2_key"
DETAIL: Key (f1, f2)=(1, 2) already exists.
We could do worse than to use exactly that output (perhaps even
sharing code) and add errcontext like
CONTEXT: replicated INSERT of row with replica identity (f1, f2)=(1, 2) in table "foo"
We should drop the full-row output for the reason I gave previously,
and I do not see the value of the XID printout either. This would
also serve (after s/INSERT/UPDATE/) for unique violations during
replicated updates.
2. For no-matching-row in UPDATE or DELETE, perhaps
ERROR: row to be updated[deleted] does not exist
CONTEXT: replicated UPDATE[DELETE] of row with replica identity (f1, f2)=(1, 2) in table "foo"
3. I don't understand what delete_origin_differs means or why
it's an error condition, so I won't dare to propose new text
for that. But the new text should make the reason clear,
and I think the same errcontext still works.
4. We need to make this a separate ereport for each problematic row.
Without having looked at the code, I suspect the reason it works the
way it does now is that the process will exit after ereport(ERROR).
I don't want to completely redesign how ereport works, but maybe
we could change replication apply so that the per-row reports are
emitted as ereport(LOG), and then when we get to a place where we
want to quit, end with
ERROR: stopping replication because of previously-logged errors
which would make the consequences clearer anyway.
regards, tom lane
On Sat, Nov 29, 2025 at 11:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
After a little bit of thought, here's a sketch of a straw-man idea.
1. The longstanding output for unique constraint violations is like
ERROR: duplicate key value violates unique constraint "foo_f1_f2_key"
DETAIL: Key (f1, f2)=(1, 2) already exists.We could do worse than to use exactly that output (perhaps even
sharing code) and add errcontext likeCONTEXT: replicated INSERT of row with replica identity (f1, f2)=(1, 2) in table "foo"
We should drop the full-row output for the reason I gave previously,
and I do not see the value of the XID printout either.
The reason for displaying in this style is that, in conflicts, users
may want to define their custom resolution strategies based on
conflict_type. Sometimes they need to resolve conflicts manually as
well. To make an informed decision on which version of the row is
"correct," the human reviewer needs full context.
Example: Imagine a table storing employee data.
Node A updates John Smith’s salary from $100k to $110k.
Node B simultaneously updates John Smith’s job title from "Senior Dev"
to "Lead Dev".
If the conflict report only showed:
Conflict on PK 123. Column 'Salary' differs. Column 'Job Title' differs.
The reviewer doesn't have enough information. However, if the report
shows the full rows:
Node A Version: ID:123, Name: John Smith, Salary: $110k, Title: Senior
Dev, Dept: Engineering, LastUpdatedBy: PayrollSys
Node B Version: ID:123, Name: John Smith, Salary: $100k, Title: Lead
Dev, Dept: Engineering, LastUpdatedBy: HR_Manager
Seeing the full row allows the reviewer to see who made the change and
what else is in the row. They might decide that the HR Manager's
promotion (Title change) should take precedence, or they might realize
they need to merge the two changes manually because both are valid.
Without the full row data (like LastUpdatedBy or Dept), this decision
is impossible.
Another case where we need row data is when a value in Column A might
only be valid based on the value in Column B within the same row. For
example, CHECK Constraints: Imagine a constraint where IF status =
'Active' THEN termination_date MUST BE NULL. If a conflict arises
involving these columns, you need to see both columns simultaneously
to understand why a specific resolution might violate database rules.
We do display the entire row as DETAIL for CHECK constraints even without apply.
CREATE TABLE products (
product_no integer,
name text,
price numeric CHECK (price > 0),
discounted_price numeric CHECK (discounted_price > 0),
CHECK (price > discounted_price)
);
postgres=# insert into products values (1, 'pen', 10, 20);
ERROR: new row for relation "products" violates check constraint
"products_check"
DETAIL: Failing row contains (1, pen, 10, 20).
Also, as per the docs [1]https://www.postgresql.org/docs/devel/logical-replication-conflicts.html, "The large column values are truncated to
64 bytes." while displaying conflict information which is same as what
we do while displaying the row during CHECK constraint violation.
Additionally, we checked some other systems where they display the
entire row information [2]https://docs.oracle.com/database/timesten-18.1/TTREP/conflict.htm#TTREP634 (See, "The format of a uniqueness conflict
record is").
Currently, we display the information for multiple_unique_conflicts,
when track_commit_timestamp is on as follows:
ERROR: conflict detected on relation "public.conf_tab":
conflict=multiple_unique_conflicts
DETAIL: Key already exists in unique index "conf_tab_pkey", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_c_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
CONTEXT: processing remote data for replication origin "pg_16394"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 759, finished at 0/017A7380
The idea to split it as per your suggestion, and assuming we agree
that additional row details are required for conflict/resolution based
on my above explanation.
LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_pkey", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
…
ERROR: stopping replication because of previously-logged errors
CONTEXT: processing remote data for replication origin "pg_16394"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 759, finished at 0/017A7380
OR
DETAIL: key (a)=(2) already exists in unique index "conf_tab_pkey",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30
existing local row (2, 2, 2)
remote row (2, 3, 4)
This is based on the below errdetail, we use somewhere else in the code.
Example:
ERROR: cannot drop table parent because other objects depend on it
DETAIL: constraint child1_parent_id_fkey on table child1 depends on
table parent
constraint child2_parent_id_fkey on table child2 depends on table parent
view parent_view depends on table parent
view parent_view2 depends on table parent
HINT: Use DROP ... CASCADE to drop the dependent objects too.
Steps:
CREATE TABLE parent ( id SERIAL PRIMARY KEY, name TEXT);
CREATE TABLE child1 ( id SERIAL PRIMARY KEY, parent_id INT
REFERENCES parent(id));
CREATE TABLE child2 ( id SERIAL PRIMARY KEY, parent_id INT
REFERENCES parent(id));
CREATE VIEW parent_view AS SELECT * FROM parent;
CREATE VIEW parent_view2 AS SELECT id, name FROM parent;
CREATE SEQUENCE parent_seq OWNED BY parent.id;
DROP TABLE parent;
2. For no-matching-row in UPDATE or DELETE, perhaps
ERROR: row to be updated[deleted] does not exist
CONTEXT: replicated UPDATE[DELETE] of row with replica identity (f1, f2)=(1, 2) in table "foo"3. I don't understand what delete_origin_differs means or why
it's an error condition, so I won't dare to propose new text
for that. But the new text should make the reason clear,
and I think the same errcontext still works.
You can see the information on delete_origin_differs and other
conflicts in the docs [3]https://www.postgresql.org/docs/devel/logical-replication-conflicts.html. We can discuss/decide on the message of
other conflict types, once we decide multiple_unique_conflicts.
[1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
[2]: https://docs.oracle.com/database/timesten-18.1/TTREP/conflict.htm#TTREP634
[3]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
--
With Regards,
Amit Kapila.
On 2025-Dec-01, Amit Kapila wrote:
The reason for displaying in this style is that, in conflicts, users
may want to define their custom resolution strategies based on
conflict_type. Sometimes they need to resolve conflicts manually as
well. To make an informed decision on which version of the row is
"correct," the human reviewer needs full context.
I think it's odd that conflict resolution depends on log entries. I
think it would be much more valuable if conflict reporting would save
the details of the conflict to some kind of conflict logging table.
How exactly are we expecting that users would bring the data from the
log file to a database row, when they are to be merged? What happens if
there are bytea columns in the table?
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Hello, everyone!
On Wed, Dec 3, 2025 at 3:41 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
I think it's odd that conflict resolution depends on log entries. I
think it would be much more valuable if conflict reporting would save
the details of the conflict to some kind of conflict logging table.
How exactly are we expecting that users would bring the data from the
log file to a database row, when they are to be merged? What happens if
there are bytea columns in the table?
Such a proposal exists already at [0]/messages/by-id/CAA4eK1+44b3vd_OWfiaVNtjf5Njb5cek09pmKRmttBByeg0NoA@mail.gmail.com.
Also, sorry for being noisy and a little-bit offtopic but I think it
is a good moment to highlight the fact that such messages are also not
so useful because sometimes they are just clearly incorrect...
You may find details and reproducers at [1]/messages/by-id/CADzfLwXZVmbo11tFS_G2i+6TfFVwHU4VUUSeoqb+8UQfuoJs8A@mail.gmail.com and [2]/messages/by-id/CADzfLwVj9ogzrf2P_H9Xb1z8vLEzBemBxp1nC9wCg4KaOFbvmw@mail.gmail.com.
Best regards,
Mikhail.
[0]: /messages/by-id/CAA4eK1+44b3vd_OWfiaVNtjf5Njb5cek09pmKRmttBByeg0NoA@mail.gmail.com
[1]: /messages/by-id/CADzfLwXZVmbo11tFS_G2i+6TfFVwHU4VUUSeoqb+8UQfuoJs8A@mail.gmail.com
[2]: /messages/by-id/CADzfLwVj9ogzrf2P_H9Xb1z8vLEzBemBxp1nC9wCg4KaOFbvmw@mail.gmail.com
On Wed, Dec 3, 2025 at 8:11 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Dec-01, Amit Kapila wrote:
The reason for displaying in this style is that, in conflicts, users
may want to define their custom resolution strategies based on
conflict_type. Sometimes they need to resolve conflicts manually as
well. To make an informed decision on which version of the row is
"correct," the human reviewer needs full context.I think it's odd that conflict resolution depends on log entries. I
think it would be much more valuable if conflict reporting would save
the details of the conflict to some kind of conflict logging table.
Yes, that is more valuable, and as mentioned by Mihail, we are already
discussing the same. However, that has its own challenges, like it
needs some form of manual management by the user to clean up the
conflict_history table. Also, one can say that because table insert is
a heavy operation and more prone to failures, it is better to LOG it.
Yes, the user would require some form of script to extract the
information from the LOG, so both methods have their pros and cons. I
think we will need a configurable knob to let the user decide which
way they prefer to log the conflict information.
--
With Regards,
Amit Kapila.
On Wed, Dec 3, 2025 at 11:34 PM Mihail Nikalayeu
<mihailnikalayeu@gmail.com> wrote:
Hello, everyone!
On Wed, Dec 3, 2025 at 3:41 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
I think it's odd that conflict resolution depends on log entries. I
think it would be much more valuable if conflict reporting would save
the details of the conflict to some kind of conflict logging table.
How exactly are we expecting that users would bring the data from the
log file to a database row, when they are to be merged? What happens if
there are bytea columns in the table?Such a proposal exists already at [0].
Thanks for providing the information.
Also, sorry for being noisy and a little-bit offtopic but I think it
is a good moment to highlight the fact that such messages are also not
so useful because sometimes they are just clearly incorrect...
You may find details and reproducers at [1] and [2].
Yes, I am aware of those, but IIRC, those are due to the traditional
design choice of using DirtySnapshot during the apply-phase of logical
replication. We need to discuss that separately and come to a
conclusion.
--
With Regards,
Amit Kapila.
On Sun, Nov 30, 2025 at 10:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Nov 29, 2025 at 11:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
After a little bit of thought, here's a sketch of a straw-man idea.
1. The longstanding output for unique constraint violations is like
ERROR: duplicate key value violates unique constraint "foo_f1_f2_key"
DETAIL: Key (f1, f2)=(1, 2) already exists.We could do worse than to use exactly that output (perhaps even
sharing code) and add errcontext likeCONTEXT: replicated INSERT of row with replica identity (f1, f2)=(1, 2) in table "foo"
We should drop the full-row output for the reason I gave previously,
and I do not see the value of the XID printout either.The reason for displaying in this style is that, in conflicts, users
may want to define their custom resolution strategies based on
conflict_type. Sometimes they need to resolve conflicts manually as
well. To make an informed decision on which version of the row is
"correct," the human reviewer needs full context.Example: Imagine a table storing employee data.
Node A updates John Smith’s salary from $100k to $110k.
Node B simultaneously updates John Smith’s job title from "Senior Dev"
to "Lead Dev".
If the conflict report only showed:
Conflict on PK 123. Column 'Salary' differs. Column 'Job Title' differs.The reviewer doesn't have enough information. However, if the report
shows the full rows:Node A Version: ID:123, Name: John Smith, Salary: $110k, Title: Senior
Dev, Dept: Engineering, LastUpdatedBy: PayrollSys
Node B Version: ID:123, Name: John Smith, Salary: $100k, Title: Lead
Dev, Dept: Engineering, LastUpdatedBy: HR_ManagerSeeing the full row allows the reviewer to see who made the change and
what else is in the row. They might decide that the HR Manager's
promotion (Title change) should take precedence, or they might realize
they need to merge the two changes manually because both are valid.
Without the full row data (like LastUpdatedBy or Dept), this decision
is impossible.Another case where we need row data is when a value in Column A might
only be valid based on the value in Column B within the same row. For
example, CHECK Constraints: Imagine a constraint where IF status =
'Active' THEN termination_date MUST BE NULL. If a conflict arises
involving these columns, you need to see both columns simultaneously
to understand why a specific resolution might violate database rules.We do display the entire row as DETAIL for CHECK constraints even without apply.
CREATE TABLE products (
product_no integer,
name text,
price numeric CHECK (price > 0),
discounted_price numeric CHECK (discounted_price > 0),
CHECK (price > discounted_price)
);postgres=# insert into products values (1, 'pen', 10, 20);
ERROR: new row for relation "products" violates check constraint
"products_check"
DETAIL: Failing row contains (1, pen, 10, 20).Also, as per the docs [1], "The large column values are truncated to
64 bytes." while displaying conflict information which is same as what
we do while displaying the row during CHECK constraint violation.Additionally, we checked some other systems where they display the
entire row information [2] (See, "The format of a uniqueness conflict
record is").
I find there is a value in showing the whole before and after row
image for better investigation. Users might want to write the row
images to a separate file to avoid the server logs from being bloated
and 64 bytes limit might not be enough in practice. It might also be a
good idea to show only the different column data of remote and local
rows. There is room for improvements and the conflict history table
patch would be a good start for that anyway.
Regarding the proposed message style:
The idea to split it as per your suggestion, and assuming we agree
that additional row details are required for conflict/resolution based
on my above explanation.LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_pkey", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
…
ERROR: stopping replication because of previously-logged errors
CONTEXT: processing remote data for replication origin "pg_16394"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 759, finished at 0/017A7380
The DETAIL message still seems to violate our message style guideline.
For instance, "Key (a)=(2); existing local row (2, 2, 2); remote row
(2, 3, 4)." is not a complete sentence.
I personally find that the current approach (i.e., having multiple
detail lines in DETAIL) is more understandable since it's clear for
users why the apply worker exited. With the proposed approach, after
checking the ERROR log, the user would need to collect all relevant
(potentially straggled) LOG messages from the server logs.
If we report multiple_unique_conflicts conflicts by writing a
combination of multiple LOGs and one ERROR, probably we might want to
approach this style also for other conflict types for better
consistency.
The conflict report for multiple_unique_conflicts looks redundant to
me as it shows the whole remote row image multiple times even though
it should be the same.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Dec 25, 2025 at 2:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Nov 30, 2025 at 10:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Regarding the proposed message style:
The idea to split it as per your suggestion, and assuming we agree
that additional row details are required for conflict/resolution based
on my above explanation.LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_pkey", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
LOG: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Key already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
…
ERROR: stopping replication because of previously-logged errors
CONTEXT: processing remote data for replication origin "pg_16394"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 759, finished at 0/017A7380The DETAIL message still seems to violate our message style guideline.
For instance, "Key (a)=(2); existing local row (2, 2, 2); remote row
(2, 3, 4)." is not a complete sentence.I personally find that the current approach (i.e., having multiple
detail lines in DETAIL) is more understandable since it's clear for
users why the apply worker exited. With the proposed approach, after
checking the ERROR log, the user would need to collect all relevant
(potentially straggled) LOG messages from the server logs.
Fair enough. Also, with the current approach, we don't need to repeat
the same LOG message (
conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab") again and again even though we do similar things at
other places[1][client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to type toast_test [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to type toast_test[] [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to default value for column id of table toast_test [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to constraint toast_test_id_not_null on table toast_test (the STATEMENT is repeated). If we have to follow your
advice then I can think of following formats:
Format-1:
DETAIL: Key (a)=(2) already exists in unique index "conf_tab_pkey",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: local row (2, 2, 2), remote row (2, 3, 4).
Key (b)=(3) already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30: local
row (3, 3, 3); remote row (2, 3, 4).
...
Format-2:
Key already exists in unique index "conf_tab_pkey",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: key (a)=(2), local row (2, 2, 2), remote row
(2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified
locally in transaction 764 at 2025-12-01 08:59:50.789608+05:30: key
(b)=(3), local row (3, 3, 3), remote row (2, 3, 4).
The difference between both formats is that in the first format key
values are displayed in beginning and in the second format, it is
displayed along with rows. Note, I have removed the existing word from
the existing local row as that seems implicit. BTW, do we need to use
full stop at the end of each line if we have multiple lines to display
as part of the DETAIL message. AFAICS, in similar messages [2]ERROR: role "regress_role_joe" cannot be dropped because some objects depend on it DETAIL: privileges for parameter test_oat_hooks.user_var1 privileges for parameter test_oat_hooks.super_var1 privileges for parameter test_oat_hooks.user_var2 privileges for parameter test_oat_hooks.super_var2 privileges for parameter none.such privileges for parameter another.bogus (in
regression tests), we don't use full stop.
The above proposal is somewhat related to existing message:
errdetail_internal("Registering more than maximum %u bytes allowed to
block %u: current %u bytes, adding %u bytes.",
[1]: [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to type toast_test [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to type toast_test[] [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to default value for column id of table toast_test [client backend] STATEMENT: drop table toast_test; [client backend] LOG: drop auto-cascades to constraint toast_test_id_not_null on table toast_test
[client backend] STATEMENT: drop table toast_test;
[client backend] LOG: drop auto-cascades to type toast_test
[client backend] STATEMENT: drop table toast_test;
[client backend] LOG: drop auto-cascades to type toast_test[]
[client backend] STATEMENT: drop table toast_test;
[client backend] LOG: drop auto-cascades to default value for column
id of table toast_test
[client backend] STATEMENT: drop table toast_test;
[client backend] LOG: drop auto-cascades to constraint
toast_test_id_not_null on table toast_test
[2]: ERROR: role "regress_role_joe" cannot be dropped because some objects depend on it DETAIL: privileges for parameter test_oat_hooks.user_var1 privileges for parameter test_oat_hooks.super_var1 privileges for parameter test_oat_hooks.user_var2 privileges for parameter test_oat_hooks.super_var2 privileges for parameter none.such privileges for parameter another.bogus
ERROR: role "regress_role_joe" cannot be dropped because some objects
depend on it
DETAIL: privileges for parameter test_oat_hooks.user_var1
privileges for parameter test_oat_hooks.super_var1
privileges for parameter test_oat_hooks.user_var2
privileges for parameter test_oat_hooks.super_var2
privileges for parameter none.such
privileges for parameter another.bogus
If we report multiple_unique_conflicts conflicts by writing a
combination of multiple LOGs and one ERROR, probably we might want to
approach this style also for other conflict types for better
consistency.
makes sense, if we have any other such reports but I think the
multiple detail lines is used only for multiple_unique_conflicts type
of conlict.
The conflict report for multiple_unique_conflicts looks redundant to
me as it shows the whole remote row image multiple times even though
it should be the same.
Yeah, I think it is mostly because apart from
multiple_unique_conflicts, it won't be repeated and that makes the
code easier to write in its current form. However, you have a valid
point and we should try to eliminate duplicate information for remote
row and replica identity. How about something like:
ERROR: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Remote row (2, 3, 4) could not be applied by using replica
identity (a)=(5).
Key (a)=(2) already exists in unique index "conf_tab_pkey",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: existing local row (2, 2, 2).
Key (b)=(3) already exists in unique index "conf_tab_b_key",
modified locally in transaction 799 at 2025-12-02
09:00:00.123456+05:30: existing local row (3, 3, 3).
Key (c)=(4) already exists in unique index "conf_tab_c_key",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: existing local row (4, 4, 4).
And when replica identity is not applicable then we can output like:
ERROR: conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab"
DETAIL: Remote row (2, 3, 4) could not be applied.
Key (a)=(2) already exists in unique index "conf_tab_pkey",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: existing local row (2, 2, 2).
Key (b)=(3) already exists in unique index "conf_tab_b_key",
modified locally in transaction 799 at 2025-12-02
09:00:00.123456+05:30: existing local row (3, 3, 3).
Key (c)=(4) already exists in unique index "conf_tab_c_key",
modified locally in transaction 764 at 2025-12-01
08:59:50.789608+05:30: existing local row (4, 4, 4).
--
With Regards,
Amit Kapila.
Dear Amit, Sawada-san,
Fair enough. Also, with the current approach, we don't need to repeat
the same LOG message (
conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab") again and again even though we do similar things at
other places[1] (the STATEMENT is repeated). If we have to follow your
advice then I can think of following formats:
...
Basically they look good to me, but I prefer to clarify the column name for each
tuples at least once per one output. Like:
```
DETAIL: Key (a)=(2) already ... local row (a, b, c) = (2, 2, 2), remote row [(a, b, c) = ](2, 3, 4).
```
Admin can easily follow what is the exact reason why it happened.
Also, the ordering of column can be different between instances, and it may cause
misunderstanding. Currently they would be re-ordered by the subscriber-side ones,
but reader may understand by the publisher-side ones.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Wed, Jan 7, 2026 at 3:37 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit, Sawada-san,
Fair enough. Also, with the current approach, we don't need to repeat
the same LOG message (
conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab") again and again even though we do similar things at
other places[1] (the STATEMENT is repeated). If we have to follow your
advice then I can think of following formats:...
Basically they look good to me, but I prefer to clarify the column name for each
tuples at least once per one output. Like:```
DETAIL: Key (a)=(2) already ... local row (a, b, c) = (2, 2, 2), remote row [(a, b, c) = ](2, 3, 4).
```Admin can easily follow what is the exact reason why it happened.
Also, the ordering of column can be different between instances, and it may cause
misunderstanding. Currently they would be re-ordered by the subscriber-side ones,
but reader may understand by the publisher-side ones.
As shown upthread, in existing places where we display the entire row,
we don't use columns, so doesn't see why we need to be different here.
I think but we can display for RI columns.
--
With Regards,
Amit Kapila.
Dear Amit,
Fair enough. Also, with the current approach, we don't need to repeat
the same LOG message (
conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab") again and again even though we do similar things at
other places[1] (the STATEMENT is repeated). If we have to follow your
advice then I can think of following formats:...
As shown upthread, in existing places where we display the entire row,
we don't use columns, so doesn't see why we need to be different here.
I think but we can display for RI columns.
Thanks for the suggestion. I've created the first draft based on the comment.
While considering and implementing, I found that worker sometimes miss to read
information for indexes and relations due to the missing permissions. Previous
style just appended key/row/replica identity information at the bottom thus it
had less impacts.
However, it needs some branches if we tried to integrate into complete statements
to avoid constructing sentences at run-time.
E.g., if we have complete information, the output can be like:
```
Key (a) = (1) already exists in unique index "tab_pkey", modified in transaction 777: local row (1, 1).
```
But if the worker cannot read the content of the index, the statement should be slightly different like:
```
Unique index "tab_pkey" rejects applying due to local row (1, 1), modified in transaction 77.
```
How do you feel? This patch may need idea to reduce lines.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-Fix-errdetail-for-logical-replication-conflict.patchapplication/octet-stream; name=0001-Fix-errdetail-for-logical-replication-conflict.patchDownload+467-193
On Fri, Jan 9, 2026 at 7:07 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
Fair enough. Also, with the current approach, we don't need to repeat
the same LOG message (
conflict (multiple_unique_conflicts) detected on relation
"public.conf_tab") again and again even though we do similar things at
other places[1] (the STATEMENT is repeated). If we have to follow your
advice then I can think of following formats:...
As shown upthread, in existing places where we display the entire row,
we don't use columns, so doesn't see why we need to be different here.
I think but we can display for RI columns.Thanks for the suggestion. I've created the first draft based on the comment.
While considering and implementing, I found that worker sometimes miss to read
information for indexes and relations due to the missing permissions. Previous
style just appended key/row/replica identity information at the bottom thus it
had less impacts.
However, it needs some branches if we tried to integrate into complete statements
to avoid constructing sentences at run-time.E.g., if we have complete information, the output can be like:
```
Key (a) = (1) already exists in unique index "tab_pkey", modified in transaction 777: local row (1, 1).
```But if the worker cannot read the content of the index, the statement should be slightly different like:
```
Unique index "tab_pkey" rejects applying due to local row (1, 1), modified in transaction 77.
```How do you feel? This patch may need idea to reduce lines.
I haven't looked at the code yet, but some initial thoughts while
looking at the output:
1)
DETAIL: Could not apply remote row (20, 10).
DETAIL: Could not apply remote row (40, 200) by using replica
identity (i)=(20).
We generally "apply" in terms of insert, update, delete etc and not
rows. Do you think we shall have:
'Could not apply remote change (20, 10)..'
The most informative will be to say below, but since operation-type is
already mentioned in Context, mentioning it here might not be needed.
So we can say 'remote change'.
DETAIL: Could not apply remote INSERT for row (30, 10).
DETAIL: Could not apply remote UPDATE for row (40, 200) using replica
identity (i)=(20).
2)
LOG: conflict update_origin_differs detected on relation "public.tab1"
DETAIL: Remote row (20, 200) was applied but previously modified by
different origin.
Local row (20, 100) detected by replica identity (i)=(20) is being
updated, but it was previously modified locally in transaction 804 at
2026-01-12 14:20:28.346397+05:30.
Row embedded in the sentence may make it difficult to read when the
count of columns is huge.
Shall we make it similar to what we had earlier:
DETAIL: Updating the row that was modified locally in transaction
<..> at <..>: local row (..), remote row (..), replica identity
(i)=(..).
Or close to newer version:
DETAIL: The row is being updated, but was previously modified locally
in transaction <..> at <..>: local row (..), remote row (..), replica
identity (i)=(..).
3)
Comment 2 is valid for 'delete_origin_differs' as well. Both should
have exactly the same format.
4)
LOG: conflict update_deleted detected on relation "public.tab1"
DETAIL: Could not find remote row (10, 100) by using replica identity (i)=(10).
The remote-row is embedded here too. Shall it be:
DETAIL: Could not find the row by using replica identity (i) = (10):
remote row (10, 100).
Please note that replica-identity when set to FULl can also be huge,
so both RI and remote-row at the end makes sense.
thanks
Shveta