Heartbeat between Primary and Standby replicas

Started by fazool mein over 15 years ago, 6 messages
#1 fazool mein
fazoolmein@gmail.com

Hello everyone,

I am designing a heartbeat system between replicas to know when a replica
goes down so that necessary measures can be taken. As I see, there are two
ways of doing it:

1) Creating a separate heartbeat process on replicas.
2) Creating a heartbeat message, and sending it over the connection that is
already established between walsender and walreceiver.

With 2, sending heartbeat from walsender to walreceiver seems trivial.
Sending a heartbeat from walreceiver to walsender seems tricky. Going
through the code, it seems that the walreceiver is always in the
PGASYNC_COPY_OUT mode (except in the beginning when handshaking is done).

Can you recommend the right way of doing this?

Thank you.

Regards

---------------------------
Postgres version = 9.0 beta-4

#2 Fujii Masao
masao.fujii@gmail.com
In reply to: fazool mein (#1)
Re: Heartbeat between Primary and Standby replicas

On Fri, Sep 17, 2010 at 6:49 AM, fazool mein <fazoolmein@gmail.com> wrote:

> I am designing a heartbeat system between replicas to know when a replica
> goes down so that necessary measures can be taken. As I see, there are two
> ways of doing it:
>
> 1) Creating a separate heartbeat process on replicas.
> 2) Creating a heartbeat message, and sending it over the connection that is
> already established between walsender and walreceiver.
>
> With 2, sending heartbeat from walsender to walreceiver seems trivial.
> Sending a heartbeat from walreceiver to walsender seems tricky. Going
> through the code, it seems that the walreceiver is always in the
> PGASYNC_COPY_OUT mode (except in the beginning when handshaking is done).
>
> Can you recommend the right way of doing this?

The existing keepalive feature doesn't help?
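
For context, the keepalive feature referred to here is TCP keepalive: the kernel probes an idle connection and reports it as dead after a number of unanswered probes. A minimal Python sketch of the socket options involved follows. It is not PostgreSQL's own code; the function name and the default values are illustrative assumptions, and the TCP_KEEP* constants are platform-dependent, so the sketch only sets the ones that exist.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for sock and return the options actually set.

    idle, interval and count are illustrative defaults, not PostgreSQL's.
    """
    applied = {}
    # SO_KEEPALIVE switches keepalive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied["SO_KEEPALIVE"] = 1

    # The tuning knobs are not available on every platform, so look each
    # constant up by name and skip the ones this platform lacks.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the peer is declared dead
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s))
    s.close()
```

With settings like these, a peer that disappears without closing the connection is noticed after roughly idle + interval * count seconds of silence.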

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3 fazool mein
fazoolmein@gmail.com
In reply to: Fujii Masao (#2)
Re: Heartbeat between Primary and Standby replicas

Apologies. I'm new to Postgres and I didn't see that feature. It satisfies
what I want to do.

Thanks.

#4 fazool mein
fazoolmein@gmail.com
In reply to: fazool mein (#3)
Re: Heartbeat between Primary and Standby replicas

Hello again,

I checked the code for the keepalive feature. It seems that the socket
options are only set on the primary's socket connection. The TCP connection
created on the secondary for walreceiver does not use the keepalive
parameters from the configuration.

Am I correct? Is this intended, or is it a bug?

Thanks.

#5 Fujii Masao
masao.fujii@gmail.com
In reply to: fazool mein (#4)
Re: Heartbeat between Primary and Standby replicas

On Mon, Sep 27, 2010 at 7:46 AM, fazool mein <fazoolmein@gmail.com> wrote:

> I checked the code for the keepalive feature. It seems that the socket
> options are only set on the primary's socket connection. The TCP connection
> created on the secondary for walreceiver does not use the keepalive
> parameters from the configuration.

You can use libpq keepalive parameters for walreceiver.

keepalives_idle
keepalives_interval
keepalives_count
http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html

Those can be set in primary_conninfo in recovery.conf.
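
As a rough illustration of what such a setting could look like, the Python sketch below assembles a recovery.conf line for the connection-string setting (named primary_conninfo in PostgreSQL 9.0) with the libpq keepalive parameters listed above. The helper name, host address, user name and timing values are all hypothetical, and the sketch assumes simple values that contain no spaces or quotes.

```python
def primary_conninfo_line(host, port=5432, user="replication",
                          idle=60, interval=10, count=5):
    """Build a recovery.conf line carrying libpq keepalive parameters.

    All defaults are illustrative; values must not contain spaces or quotes,
    because this sketch does no escaping.
    """
    params = {
        "host": host,
        "port": port,
        "user": user,
        "keepalives": 1,                  # enable client-side TCP keepalive
        "keepalives_idle": idle,          # idle seconds before the first probe
        "keepalives_interval": interval,  # seconds between probes
        "keepalives_count": count,        # lost probes before giving up
    }
    conninfo = " ".join(f"{key}={value}" for key, value in params.items())
    return f"primary_conninfo = '{conninfo}'"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    print(primary_conninfo_line("192.0.2.10"))
```

The printed line can be pasted into recovery.conf on the standby; walreceiver then opens its connection to the primary with these keepalive settings.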

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#6 fazool mein
fazoolmein@gmail.com
In reply to: Fujii Masao (#5)
Re: Heartbeat between Primary and Standby replicas

Ah, great. I missed looking there.
Thanks.
