Seeking Clarification on Logical Replication Start LSN

Started by Pradeep Kumaralmost 2 years ago2 messages
#1Pradeep Kumar
spradeepkumar29@gmail.com

Dear Postgres Community,

I hope this email finds you well. I am reaching out to seek clarification
on an issue I am encountering with logical replication in PostgreSQL.

My specific question pertains to determining the appropriate LSN (Log
Sequence Number) from which to start logical replication. Allow me to
provide detailed context for better understanding:

During the process of performing a parallel pg_basebackup, I concurrently
execute DML queries. As part of the pg_basebackup command, I utilize the
option create-slot to create a replication slot. Subsequently, upon
completion of the base backup, I initiate logical replication using the
restart_lsn obtained during the execution of the pg_basebackup command. My
intention is to ensure consistency between two database clusters.

However, I am encountering errors during this process. Specifically, I
receive the following error message on the source side:

"""
2024-02-27 16:20:09.271 IST [2838457] ERROR: duplicate key value violates
unique constraint "table_15_36_pkey"
2024-02-27 16:20:09.271 IST [2838457] DETAIL: Key (col_1, col_2)=(23,
2024-02-27 15:14:24.332557) already exists.
2024-02-27 16:20:09.272 IST [2834967] LOG: background worker "logical
replication worker" (PID 2838457) exited with exit code 1
Upon analysis, it appears that the errors stem from starting the logical
replication with an incorrect LSN, one that has already been applied to the
target side, leading to duplicate key conflicts.
"""

In light of this issue, I seek guidance on determining the appropriate LSN
from which to commence logical replication.

To further clarify my problem:

1)I have a source machine and a target machine.
2) I perform a backup from the source to the target using pg_basebackup.
3) Prior to initiating the base backup, I create logical replication slots
on the source machine.
4) During the execution of pg_basebackup, DML queries are executed, and I
aim to replicate this data on the target machine.
5) My dilemma lies in determining the correct LSN to begin the logical
replication process.
Your insights and guidance on this matter would be immensely appreciated.
Thank you for your time and assistance.

Warm regards,
Pradeep

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Pradeep Kumar (#1)
Re: Seeking Clarification on Logical Replication Start LSN

On Tue, Feb 27, 2024 at 5:56 PM Pradeep Kumar <spradeepkumar29@gmail.com> wrote:

Dear Postgres Community,

I hope this email finds you well. I am reaching out to seek clarification on an issue I am encountering with logical replication in PostgreSQL.

My specific question pertains to determining the appropriate LSN (Log Sequence Number) from which to start logical replication. Allow me to provide detailed context for better understanding:

During the process of performing a parallel pg_basebackup, I concurrently execute DML queries. As part of the pg_basebackup command, I utilize the option create-slot to create a replication slot. Subsequently, upon completion of the base backup, I initiate logical replication using the restart_lsn obtained during the execution of the pg_basebackup command. My intention is to ensure consistency between two database clusters.

However, I am encountering errors during this process. Specifically, I receive the following error message on the source side:

"""
2024-02-27 16:20:09.271 IST [2838457] ERROR: duplicate key value violates unique constraint "table_15_36_pkey"
2024-02-27 16:20:09.271 IST [2838457] DETAIL: Key (col_1, col_2)=(23, 2024-02-27 15:14:24.332557) already exists.
2024-02-27 16:20:09.272 IST [2834967] LOG: background worker "logical replication worker" (PID 2838457) exited with exit code 1
Upon analysis, it appears that the errors stem from starting the logical replication with an incorrect LSN, one that has already been applied to the target side, leading to duplicate key conflicts.
"""

In light of this issue, I seek guidance on determining the appropriate LSN from which to commence logical replication.

To further clarify my problem:

1)I have a source machine and a target machine.
2) I perform a backup from the source to the target using pg_basebackup.
3) Prior to initiating the base backup, I create logical replication slots on the source machine.
4) During the execution of pg_basebackup, DML queries are executed, and I aim to replicate this data on the target machine.
5) My dilemma lies in determining the correct LSN to begin the logical replication process.

I think the reason of the problem you are seeing is pg_basebackup also
includes the WAL generated during backup if you specify -X method. See
[1]: https://www.postgresql.org/docs/devel/app-pgbasebackup.html
data duplication is possible. I don't see a very straightforward way
but you might be able to achieve your desired purpose if somehow
identify the last WAL location copied in backup and use that as your
starting point for logical replication.

[1]: https://www.postgresql.org/docs/devel/app-pgbasebackup.html

--
With Regards,
Amit Kapila.