pg_stat_replication when standby is unreachable

Started by Abhishek Rai over 12 years ago, 6 messages

abhishekrai@gmail.com

over 12 years ago

Hello Postgres gurus,

I'm writing a thin clustering layer on top of Postgres using the
synchronous replication feature. The goal is to enable HA and survive
permanent loss of a single node. Using an external coordinator
(ZooKeeper), one of the nodes is elected as the primary. The primary node
then picks another healthy node as its standby and starts serving.
Thereafter, the cluster monitors the primary and the standby, and triggers
a re-election if either the primary or its standby goes down.

Detecting primary health is easy. But what is the best way to know if the
standby is live? Since this is not a hot-standby, I cannot send queries to
it. Currently, I'm sending the following query to the primary:

SELECT * FROM pg_stat_replication;

I've noticed that when I terminate the standby (cleanly or through kill
-9), the result of the above query goes from 1 row to zero rows. The result
comes back to 1 row when the standby restarts and reconnects. I was
wondering if there is any kind of guarantee about the results of
pg_stat_replication as the standby suffers a network partition, and/or
restarts and reconnects with the primary. Are there any parameters that
control this behavior?
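
A minimal sketch of the polling query described above, in Python. It assumes a DB-API 2.0 cursor connected to the primary (the client library is not named in the post, so any driver is an assumption). pg_stat_replication is a view, so it is selected from without parentheses. The helper name `connected_standbys` and the choice of columns are hypothetical, picked only for illustration.

```python
# Columns read from the pg_stat_replication view on the primary.
COLUMNS = ("application_name", "client_addr", "state", "sync_state")
QUERY = "SELECT " + ", ".join(COLUMNS) + " FROM pg_stat_replication"


def connected_standbys(cursor):
    """Return one dict per WAL sender the primary currently lists.

    `cursor` is assumed to be a DB-API 2.0 cursor opened on the primary.
    An empty list means no standby is connected right now.
    """
    cursor.execute(QUERY)
    return [dict(zip(COLUMNS, row)) for row in cursor.fetchall()]
```

An empty result only says that no standby is connected at the moment of the query; it carries no history of standbys that were connected earlier.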

I tried looking at src/backend/replication/walsender.c/WalSndLoop() but am
still not clear on the expected behavior.

Thanks for your time,
Abhishek

Dimitri Fontaine

dimitri@2ndQuadrant.fr

over 12 years ago

In reply to: Abhishek Rai (#1)

Re: pg_stat_replication when standby is unreachable

Abhishek Rai <abhishekrai@gmail.com> writes:

> SELECT * FROM pg_stat_replication;
>
> I've noticed that when I terminate the standby (cleanly or through kill
> -9), the result of the above query goes from 1 row to zero rows. The result
> comes back to 1 row when the standby restarts and reconnects. I was
> wondering if there is any kind of guarantee about the results of
> pg_stat_replication as the standby suffers a network partition, and/or
> restarts and reconnects with the primary. Are there any parameters that
> control this behavior?

Not that I know of. We don't register standbys at all, so the master
only knows about those which are successfully connected right now.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Eisentraut

peter_e@gmx.net

over 12 years ago

In reply to: Abhishek Rai (#1)

Re: pg_stat_replication when standby is unreachable

On 5/28/13 9:42 PM, Abhishek Rai wrote:

> Detecting primary health is easy. But what is the best way to know if
> the standby is live? Since this is not a hot-standby, I cannot send
> queries to it.

Then how do you define "live" for your use case?

> Currently, I'm sending the following query to the primary:
>
> SELECT * FROM pg_stat_replication;
>
> I've noticed that when I terminate the standby (cleanly or through kill
> -9), the result of the above query goes from 1 row to zero rows. The
> result comes back to 1 row when the standby restarts and reconnects. I
> was wondering if there is any kind of guarantee about the results of
> pg_stat_replication as the standby suffers a network partition, and/or
> restarts and reconnects with the primary. Are there any parameters that
> control this behavior?

No, pg_stat_replication is not an appropriate tool for tracking
standbys, for the reasons you point out. You need to track the list of
actual and potential standbys that you are interested in somewhere else.
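
One way to follow this advice, sketched in Python: keep the set of standbys the cluster expects somewhere outside Postgres (a plain set here; in the original poster's setup it could live in ZooKeeper) and compare it with what the primary currently reports. The function name, and the use of application_name to identify each standby, are assumptions made for illustration.

```python
def missing_standbys(expected_names, replication_rows):
    """Return the expected standbys that the primary does not list.

    expected_names: names the clustering layer tracks on its own.
    replication_rows: rows from pg_stat_replication as dicts, each
    assumed to carry an "application_name" key.
    """
    connected = {row["application_name"] for row in replication_rows}
    return sorted(set(expected_names) - connected)
```

For example, with expected standbys "node2" and "node3" and a single row for "node2", the function returns ["node3"].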

Abhishek Rai

abhishekrai@gmail.com

over 12 years ago

In reply to: Peter Eisentraut (#3)

Re: pg_stat_replication when standby is unreachable

On Wed, May 29, 2013 at 9:16 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

> On 5/28/13 9:42 PM, Abhishek Rai wrote:
>
>> Detecting primary health is easy. But what is the best way to know if
>> the standby is live? Since this is not a hot-standby, I cannot send
>> queries to it.
>
> Then how do you define "live" for your use case?

By a "live standby", I mean that the standby is currently connected to the
primary and being used for synchronous replication. It may be lagging
(e.g. when in catchup mode), but at least it's connected. This way, when
the standby is fully caught up, I can be sure that if I suffer a permanent
loss of the master or the replica, I would not lose any transaction which
was successfully acknowledged to a client.
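
This definition of "live" can be sketched as a small Python check over the rows of pg_stat_replication. The view's sync_state column takes the values 'async', 'potential', and 'sync'. Whether a standby that is still catching up is reported as 'sync' or as 'potential' is not established in this thread, so the sketch keeps the two apart rather than guessing. The function name and return labels are hypothetical.

```python
def sync_standby_status(replication_rows):
    """Summarize what the primary reports about synchronous standbys.

    replication_rows: rows from pg_stat_replication as dicts, each
    assumed to carry a "sync_state" key.
    Returns "sync", "potential", "async_only", or "none".
    """
    states = {row["sync_state"] for row in replication_rows}
    if "sync" in states:
        return "sync"          # a standby is acting as the synchronous one
    if "potential" in states:
        return "potential"     # a candidate is connected but not yet sync
    if states:
        return "async_only"    # connected standbys, none synchronous
    return "none"              # nothing connected at all
```

A clustering layer built along these lines would probably treat only "sync" as safe for the guarantee described above, though that policy is a design choice and not something the view itself promises.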

>> Currently, I'm sending the following query to the primary:
>>
>> SELECT * FROM pg_stat_replication;
>>
>> I've noticed that when I terminate the standby (cleanly or through kill
>> -9), the result of the above query goes from 1 row to zero rows. The
>> result comes back to 1 row when the standby restarts and reconnects. I
>> was wondering if there is any kind of guarantee about the results of
>> pg_stat_replication as the standby suffers a network partition, and/or
>> restarts and reconnects with the primary. Are there any parameters that
>> control this behavior?
>
> No, pg_stat_replication is not an appropriate tool for tracking
> standbys, for the reasons you point out. You need to track the list of
> actual and potential standbys that you are interested in somewhere else.

I see. Thanks for the response.

Actually, pg_stat_replication seemed to be doing almost the exact thing I
needed - providing a list of currently successfully connected standbys.
And as per my observation, it was getting updated as standbys joined and
left, precisely what I wanted. The request for a config parameter was
simply to tune how quickly a down standby is reported via
pg_stat_replication. But I can live without a configuration param to tune
this behavior, as long as there is something in the code based on which I
can infer how long it will take for the result of pg_stat_replication to be
updated when a standby joins or leaves. I can use these timeouts to
configure timeouts in the clustering software.

Thanks,
Abhishek

Abhishek Rai

abhishekrai@gmail.com

over 12 years ago

In reply to: Dimitri Fontaine (#2)

Re: pg_stat_replication when standby is unreachable

On Wed, May 29, 2013 at 9:14 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

> Abhishek Rai <abhishekrai@gmail.com> writes:
>
>> SELECT * FROM pg_stat_replication;
>>
>> I've noticed that when I terminate the standby (cleanly or through kill
>> -9), the result of the above query goes from 1 row to zero rows. The
>> result comes back to 1 row when the standby restarts and reconnects. I
>> was wondering if there is any kind of guarantee about the results of
>> pg_stat_replication as the standby suffers a network partition, and/or
>> restarts and reconnects with the primary. Are there any parameters that
>> control this behavior?
>
> Not that I know of. We don't register standbys at all, so the master
> only knows about those which are successfully connected right now.

Actually, that is precisely what I need. If the master reports via
pg_stat_replication the set of standbys that it is connected to, then the
clustering software can rely on it to know whether the standby is active or
not. However, this information is not very useful until there are some
guarantees on how long it would take for the response of
pg_stat_replication to be updated in response to an event. Without that,
the clustering software would not know how long to wait before declaring
the standby unhealthy. It's not a requirement that the timeout be
configurable, as long as it's deterministic.

Thanks for your help!
Abhishek

Abhishek Rai

abhishekrai@gmail.com

over 12 years ago

In reply to: Abhishek Rai (#5)

Re: pg_stat_replication when standby is unreachable

I looked a bit more into the code and it appears to me that the following
are true:

- A separate WAL sender process is created on the primary for each
connected standby.
- The WAL sender process terminates (walsender.c / WalSndLoop) when a
write to the standby's socket fails.
- If the standby machine is reachable but Postgres is no longer running
there, the WAL sender terminates almost immediately, probably because
the standby machine sends a TCP RST to the WAL sender.
- If the standby machine is unreachable, the WAL sender keeps trying to
send WAL data. However, since the WAL sender uses a non-blocking socket
to talk to the standby, it will time out and exit after
"replication_timeout" (configured in postgresql.conf).

So it seems that the WAL sender should exit within replication_timeout or
sooner, and its exit will be reflected in pg_stat_replication.
Therefore, I could just wait for up to replication_timeout before declaring
the standby dead.
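
A rough Python sketch of how a clustering layer might use this conclusion. It rests on the inference above that the WAL sender exits within replication_timeout (renamed wal_sender_timeout in PostgreSQL 9.3); that is the poster's reading of the code, not a documented guarantee. The names `worst_case_detection_delay` and `StandbyWatcher`, and the polling interval, are hypothetical.

```python
def worst_case_detection_delay(replication_timeout_s, poll_interval_s):
    """Upper bound, in seconds, between the standby becoming unreachable
    and a poll of pg_stat_replication returning zero rows, assuming the
    WAL sender really exits within replication_timeout.
    """
    return replication_timeout_s + poll_interval_s


class StandbyWatcher:
    """Turn successive row counts from pg_stat_replication into events."""

    def __init__(self):
        self.connected = None  # unknown until the first poll

    def observe(self, row_count):
        """Record one poll; return "standby_up", "standby_down", or None."""
        now_connected = row_count > 0
        event = None
        if self.connected is not None and now_connected != self.connected:
            event = "standby_up" if now_connected else "standby_down"
        self.connected = now_connected
        return event
```

With replication_timeout at 60 seconds and a poll every 5 seconds, the bound works out to 65 seconds. The first poll never raises an event because there is no earlier state to compare against.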

Thanks,
Abhishek