Replication lag in Postgres

Started by Mukesh Tanuku · over 1 year ago · 4 messages · general

mukesh.postgres@gmail.com

over 1 year ago

Hello everyone.
Firstly, thanks to the community members who answer all the queries
that are posted. The answers give us more insight into the issues and doubts
we have about Postgres.

I have a question about a Postgres HA setup.
We are setting up a 2-node Postgres cluster with async streaming
replication, and we want to define an RPO (recovery point objective) in case
of primary failure.

How can we best define the RPO in this setup? Since it's an async streaming
replication setup, there is a chance of data loss that is
proportional to the replication delay.

Is there any way to configure the delay duration, for example to
make sure the standby syncs with the primary at least every 10 minutes?

Thank you
Regards
Mukesh T

Laurenz Albe

laurenz.albe@cybertec.at

over 1 year ago

In reply to: Mukesh Tanuku (#1)

Re: Replication lag in Postgres

On Fri, 2024-07-12 at 20:41 +0530, Mukesh Tanuku wrote:

> I have a question about a Postgres HA setup.
> We are setting up a 2-node Postgres cluster with async streaming replication, and we want to
> define an RPO (recovery point objective) in case of primary failure.
>
> How can we best define the RPO in this setup? Since it's an async streaming replication
> setup, there is a chance of data loss that is proportional to the replication delay.
>
> Is there any way to configure the delay duration, for example to make sure the standby
> syncs with the primary at least every 10 minutes?

When there is a delay, it is usually because replay at the standby is delayed.
The WAL information is still replicated. You won't lose that information on
failover; it will just make the failover take longer.

Unless you have a network problem, you should never lose more than a fraction
of a second.

Yours,
Laurenz Albe
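
To put numbers on the distinction drawn above between WAL that has not yet reached the standby and WAL that has arrived but is not yet replayed, the positions reported by pg_stat_replication (flush_lsn, replay_lsn) can be compared with pg_current_wal_lsn() on the primary. The following is only a minimal sketch of that arithmetic in Python; the helper names and the LSN values are made up for illustration, and in practice the server-side function pg_wal_lsn_diff() does the same subtraction.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' into a byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits,
    a slash, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(ahead_lsn: str, behind_lsn: str) -> int:
    """Bytes of WAL between two positions."""
    return lsn_to_int(ahead_lsn) - lsn_to_int(behind_lsn)


# Illustrative values only, not taken from a real server.
current = "16/B374D848"   # pg_current_wal_lsn() on the primary
flushed = "16/B374D000"   # flush_lsn: WAL already on the standby's disk
replayed = "16/B3000000"  # replay_lsn: WAL already applied on the standby

# WAL the standby does not have yet: this is what a failover could lose.
not_yet_received = lag_bytes(current, flushed)

# WAL the standby has but still needs to apply: this only delays failover.
not_yet_replayed = lag_bytes(flushed, replayed)

print(f"at risk on failover: {not_yet_received} bytes")
print(f"waiting for replay:  {not_yet_replayed} bytes")
```

Only the first figure bears on the RPO; the second one affects how long the standby needs before it can take over.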

Mukesh Tanuku

mukesh.postgres@gmail.com

over 1 year ago

In reply to: Laurenz Albe (#2)

Re: Replication lag in Postgres

Thank you for the information, Laurenz Albe.


Muhammad Imtiaz

imtiaz.m@bitnine.net

over 1 year ago

In reply to: Mukesh Tanuku (#1)

Re: Replication lag in Postgres

Hi,

I recommend the following configurations/options in this case:

• wal_sender_timeout: Terminates replication connections that have been
inactive for longer than this interval, so the primary notices a crashed
standby or a network outage promptly.

• wal_keep_size: Ensures that enough WAL files are retained for the standby
to catch up if it falls behind.

• checkpoint_timeout: Sets the maximum time between automatic checkpoints.
Streaming replication sends WAL to the standby continuously rather than at
checkpoints, so this mainly affects how much WAL must be replayed after a crash.

• pg_receivewal: Use this tool to continuously archive WAL files to a safe
location. If streaming replication is delayed, you will still have a backup
of the WAL files.

Regards,
Muhammad Imtiaz
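
None of the settings listed above caps replication lag at a fixed duration, so a target such as the 10 minutes from the original question is probably easier to monitor than to configure. Below is a minimal sketch of such a check in Python. It assumes the lag has already been measured as a time interval, for example from the replay_lag column of pg_stat_replication on the primary or from now() - pg_last_xact_replay_timestamp() on the standby; the function name, the warning ratio, and the status strings are hypothetical.

```python
from datetime import timedelta

# The 10-minute figure from the original question, used as the RPO target.
RPO_TARGET = timedelta(minutes=10)


def rpo_status(lag: timedelta,
               target: timedelta = RPO_TARGET,
               warn_ratio: float = 0.5) -> str:
    """Classify a measured replication lag against an RPO target.

    Returns "violated" once the lag reaches the target, "warning" once it
    reaches warn_ratio of the target, and "ok" otherwise.
    """
    if lag >= target:
        return "violated"
    if lag >= target * warn_ratio:
        return "warning"
    return "ok"


# Example readings (illustrative, not measured on a real cluster).
for reading in (timedelta(seconds=0.3), timedelta(minutes=6), timedelta(minutes=12)):
    print(reading, rpo_status(reading))
```

One caveat with the timestamp-based measurement: on an idle primary, now() - pg_last_xact_replay_timestamp() keeps growing even though no data is at risk, so a check like this is more trustworthy when combined with a comparison of WAL positions.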
