Replication server timeout patch

Started by Daniel Farina · about 15 years ago · 47 messages · pgsql-hackers
#1 Daniel Farina
daniel@heroku.com

Hello list,

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

--
fdr

Attachments:

0001-Split-and-rename-out-server-timeouts.patch (text/x-patch; +60 −24)
#2 Daniel Farina
drfarina@acm.org
In reply to: Daniel Farina (#1)

Hello list,

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

--
fdr

Attachments:

0001-Split-and-rename-out-server-timeouts.patch (text/x-patch; +60 −24)
#3 Robert Haas
robertmhaas@gmail.com
In reply to: Daniel Farina (#2)
Re: Replication server timeout patch

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

However, it looks to me like this renders wal_sender_delay aka
WalSndDelay completely unused. If we don't need that GUC any more, we
should rip it out completely.

The comment in WalSndHandshake should have a tab at the beginning of
every line. Right now the first line has a tab and the rest have
spaces.

The first hunk in WalSndLoop is a meaningless whitespace change.

I wonder if we ought to just call this replication_timeout, rather
than replication_timeout_server. Simon's patch (from which this was
extracted) also has replication_timeout_client, but the two aren't
symmetrical. The replication_timeout_client in this patch is the
amount of time after which the master acknowledges the commit even
though the synchronous standby hasn't acked yet. So it only applies
to the synchronous replication case, whereas this is useful for both
synchronous and asynchronous replication. I'm inclined to think that
knob is utterly useless anyway; surely waiting more than zero will
reduce the throughput of the system to some minute fraction of its
normal value, while waiting less than infinity throws out the data
guarantee that made you pick synchronous replication in the first
place. Even if we do decide to keep that knob, I don't think we'll
want the value to be symmetric with this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#3)
Re: Replication server timeout patch

On 11.02.2011 22:11, Robert Haas wrote:

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

Hmm, so this patch implements a watchdog, where the master disconnects
the standby if the heartbeat from the standby stops for more than
'replication_[server]_timeout' seconds. The standby sends the heartbeat
every wal_receiver_status_interval seconds.

It would be nice if the master and standby could negotiate those
settings. As the patch stands, it's easy to have a pathological
configuration where replication_server_timeout <
wal_receiver_status_interval, so that the master repeatedly disconnects
the standby because it doesn't reply in time. Maybe the standby should
report how often it's going to send a heartbeat, and master should wait
for that long + some safety margin. Or maybe the master should tell the
standby how often it should send the heartbeat?
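The watchdog Heikki describes reduces to an elapsed-time check on the master. A minimal sketch in C, with hypothetical names rather than the patch's actual variables:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of the master-side watchdog: hang up on the
 * standby when no heartbeat has arrived within the timeout.  All
 * names here are illustrative, not taken from the patch. */
typedef long TimestampMs;

static bool
standby_timed_out(TimestampMs last_heartbeat, TimestampMs now,
                  TimestampMs replication_timeout_ms)
{
    /* A timeout of zero disables the watchdog entirely. */
    if (replication_timeout_ms == 0)
        return false;
    return (now - last_heartbeat) > replication_timeout_ms;
}
```

The pathological case is visible here: if heartbeats only arrive every wal_receiver_status_interval and the timeout is shorter than that interval, the check fires even though the standby is healthy.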

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: Replication server timeout patch

On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 11.02.2011 22:11, Robert Haas wrote:

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

Hmm, so this patch implements a watchdog, where the master disconnects the
standby if the heartbeat from the standby stops for more than
'replication_[server]_timeout' seconds. The standby sends the heartbeat
every wal_receiver_status_interval seconds.

It would be nice if the master and standby could negotiate those settings.
As the patch stands, it's easy to have a pathological configuration where
replication_server_timeout < wal_receiver_status_interval, so that the
master repeatedly disconnects the standby because it doesn't reply in time.
Maybe the standby should report how often it's going to send a heartbeat,
and master should wait for that long + some safety margin. Or maybe the
master should tell the standby how often it should send the heartbeat?

I guess the biggest use case for that behavior would be in a case
where you have two standbys, one of which doesn't send a heartbeat and
the other of which does. Then you really can't rely on a single
timeout.

Maybe we could change the server parameter to indicate what multiple
of wal_receiver_status_interval causes a hangup, and then change the
client to notify the server what value it's using. But that gets
complicated, because the value could be changed while the standby is
running.
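The multiple-based idea can be sketched as deriving the master's effective timeout from whatever heartbeat interval the standby reports at handshake time. Parameter names here are hypothetical, not from any patch:

```c
#include <assert.h>

/* Sketch: express the timeout as a multiple of the heartbeat
 * interval the standby reports.  Returns 0 to mean "watchdog
 * disabled".  Hypothetical names throughout. */
static long
effective_timeout_ms(long reported_interval_ms, int timeout_multiple)
{
    /* Interval 0 means the standby sends no heartbeats at all,
     * so no watchdog can sensibly apply to that connection. */
    if (reported_interval_ms <= 0 || timeout_multiple <= 0)
        return 0;
    return reported_interval_ms * timeout_multiple;
}
```

This construction sidesteps the pathological configuration, since the timeout can never be shorter than the interval, but as noted above the standby changing its interval at runtime would require renegotiation.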

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6 Daniel Farina
daniel@heroku.com
In reply to: Robert Haas (#3)
Re: Replication server timeout patch

On Fri, Feb 11, 2011 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

However, it looks to me like this renders wal_sender_delay aka
WalSndDelay completely unused.  If we don't need that GUC any more, we
should rip it out completely.

Indeed; I have cleaned this up.

The comment in WalSndHandshake should have a tab at the beginning of
every line.  Right now the first line has a tab and the rest have
spaces.

Also correct. Done.

The first hunk in WalSndLoop is a meaningless whitespace change.

I was trying to get it under 80 columns wide, but yes, it is unnecessary.

I think this closes out the small fry.

I have rebased my splitorific branch to reflect these changes:

https://github.com/fdr/postgres/commits/splitorific

Context diff equivalent attached.

--
fdr

Attachments:

0001-Split-and-rename-out-server-timeout-of-clients-2.patch (application/octet-stream; +57 −32)
#7 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#5)
Re: Replication server timeout patch

On Fri, Feb 11, 2011 at 4:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 11.02.2011 22:11, Robert Haas wrote:

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

Hmm, so this patch implements a watchdog, where the master disconnects the
standby if the heartbeat from the standby stops for more than
'replication_[server]_timeout' seconds. The standby sends the heartbeat
every wal_receiver_status_interval seconds.

It would be nice if the master and standby could negotiate those settings.
As the patch stands, it's easy to have a pathological configuration where
replication_server_timeout < wal_receiver_status_interval, so that the
master repeatedly disconnects the standby because it doesn't reply in time.
Maybe the standby should report how often it's going to send a heartbeat,
and master should wait for that long + some safety margin. Or maybe the
master should tell the standby how often it should send the heartbeat?

I guess the biggest use case for that behavior would be in a case
where you have two standbys, one of which doesn't send a heartbeat and
the other of which does.  Then you really can't rely on a single
timeout.

Maybe we could change the server parameter to indicate what multiple
of wal_receiver_status_interval causes a hangup, and then change the
client to notify the server what value it's using.  But that gets
complicated, because the value could be changed while the standby is
running.

On reflection I'm deeply uncertain this is a good idea. It's pretty
hopeless to suppose that we can keep the user from choosing parameter
settings which will cause them problems, and there are certainly far
stupider things they could do than set replication_timeout <
wal_receiver_status_interval. They could, for example, set fsync=off
or work_mem=4GB or checkpoint_segments=3 (never mind that we ship that
last one out of the box). Any of those settings have the potential to
thoroughly destroy their system in one way or another, and there's not
a darn thing we can do about it. Setting up some kind of handshake
system based on a multiple of the wal_receiver_status_interval is
going to be complex, and it's not necessarily going to deliver the
behavior someone wants anyway. If someone has
wal_receiver_status_interval=10 on one system and =30 on another
system, does it therefore follow that the timeouts should also be
different by 3X? Perhaps, but it's non-obvious.

There are two things that I think are pretty clear. If the receiver
has wal_receiver_status_interval=0, then we should ignore
replication_timeout for that connection. And also we need to make
sure that the replication_timeout can't kill off a connection that is
in the middle of streaming a large base backup. Maybe we should try
to get those two cases right and not worry about the rest. Dan, can
you check whether the base backup thing is a problem with this as
implemented?
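The two rules above are easy to express as a single guard on the master; a sketch with illustrative names (not the patch's actual ones):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the two exemptions proposed here: the timeout should
 * not apply when the receiver reports wal_receiver_status_interval
 * of 0 (it will never send heartbeats), nor while a base backup is
 * streaming (a long transfer during which no acks are expected). */
static bool
timeout_applies(int receiver_status_interval, bool in_base_backup)
{
    if (receiver_status_interval == 0)
        return false;           /* no heartbeats expected */
    if (in_base_backup)
        return false;           /* bulk transfer, no acks expected */
    return true;
}
```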

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8 Daniel Farina
daniel@heroku.com
In reply to: Robert Haas (#7)
Re: Replication server timeout patch

On Feb 11, 2011 8:20 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:

On Fri, Feb 11, 2011 at 4:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 11.02.2011 22:11, Robert Haas wrote:

On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:

I split this out of the synchronous replication patch for independent
review. I'm dashing out the door, so I haven't put it on the CF yet or
anything, but I just wanted to get it out there...I'll be around in
Not Too Long to finish any other details.

This looks like a useful and separately committable change.

Hmm, so this patch implements a watchdog, where the master disconnects the
standby if the heartbeat from the standby stops for more than
'replication_[server]_timeout' seconds. The standby sends the heartbeat
every wal_receiver_status_interval seconds.

It would be nice if the master and standby could negotiate those settings.
As the patch stands, it's easy to have a pathological configuration where
replication_server_timeout < wal_receiver_status_interval, so that the
master repeatedly disconnects the standby because it doesn't reply in time.
Maybe the standby should report how often it's going to send a heartbeat,
and master should wait for that long + some safety margin. Or maybe the
master should tell the standby how often it should send the heartbeat?

I guess the biggest use case for that behavior would be in a case
where you have two standbys, one of which doesn't send a heartbeat and
the other of which does. Then you really can't rely on a single
timeout.

Maybe we could change the server parameter to indicate what multiple
of wal_receiver_status_interval causes a hangup, and then change the
client to notify the server what value it's using. But that gets
complicated, because the value could be changed while the standby is
running.

On reflection I'm deeply uncertain this is a good idea. It's pretty
hopeless to suppose that we can keep the user from choosing parameter
settings which will cause them problems, and there are certainly far
stupider things they could do than set replication_timeout <
wal_receiver_status_interval. They could, for example, set fsync=off
or work_mem=4GB or checkpoint_segments=3 (never mind that we ship that
last one out of the box). Any of those settings have the potential to
thoroughly destroy their system in one way or another, and there's not
a darn thing we can do about it. Setting up some kind of handshake
system based on a multiple of the wal_receiver_status_interval is
going to be complex, and it's not necessarily going to deliver the
behavior someone wants anyway. If someone has
wal_receiver_status_interval=10 on one system and =30 on another
system, does it therefore follow that the timeouts should also be
different by 3X? Perhaps, but it's non-obvious.

There are two things that I think are pretty clear. If the receiver
has wal_receiver_status_interval=0, then we should ignore
replication_timeout for that connection. And also we need to make
sure that the replication_timeout can't kill off a connection that is
in the middle of streaming a large base backup. Maybe we should try
to get those two cases right and not worry about the rest. Dan, can
you check whether the base backup thing is a problem with this as
implemented?

Yes, I will have something to say come Saturday.

--
fdr

#9 Fujii Masao
masao.fujii@gmail.com
In reply to: Daniel Farina (#6)
Re: Replication server timeout patch

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com
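The infrastructure Fujii refers to starts with putting the socket into non-blocking mode, so that a full send buffer surfaces as EAGAIN instead of an indefinite stall. A minimal POSIX sketch, not the actual patch code (and Windows would need a different mechanism):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a socket into non-blocking mode. */
static bool
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags >= 0 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

/* Attempt a send; distinguish "kernel buffer full" from hard errors
 * so the caller can wait with a deadline instead of blocking. */
static ssize_t
try_send(int fd, const void *buf, size_t len, bool *would_block)
{
    ssize_t n = send(fd, buf, len, 0);
    *would_block = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```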

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#10 Daniel Farina
daniel@heroku.com
In reply to: Fujii Masao (#9)
Re: Replication server timeout patch

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

--
fdr

#11 Simon Riggs
simon@2ndQuadrant.com
In reply to: Daniel Farina (#10)
Re: Replication server timeout patch

On Mon, 2011-02-14 at 14:13 -0800, Daniel Farina wrote:

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

I wasn't aware that had been raised before. Thanks for noting it again.

I guess that's why you thought "wait forever" was a good idea ;-)

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

I'd like to see what you come up with. I would rate that as important,
though not essential for sync replication.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#12 Robert Haas
robertmhaas@gmail.com
In reply to: Daniel Farina (#10)
Re: Replication server timeout patch

On Mon, Feb 14, 2011 at 5:13 PM, Daniel Farina <daniel@heroku.com> wrote:

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

Are you working on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13 Fujii Masao
masao.fujii@gmail.com
In reply to: Daniel Farina (#10)
Re: Replication server timeout patch

On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina <daniel@heroku.com> wrote:

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

I'm not sure if there's already enough infrastructure for a non-blocking
write. But the patch which I submitted before might help to implement that.
http://archives.postgresql.org/message-id/AANLkTinSvcdAYryNfZqd0wepyh1Pf7YX6Q0KxhZjas6a%40mail.gmail.com

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#14 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#13)
Re: Replication server timeout patch

On Wed, 2011-02-16 at 11:34 +0900, Fujii Masao wrote:

On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina <daniel@heroku.com> wrote:

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

I'm not sure if there's already enough infrastructure for a non-blocking
write. But the patch which I submitted before might help to implement that.
http://archives.postgresql.org/message-id/AANLkTinSvcdAYryNfZqd0wepyh1Pf7YX6Q0KxhZjas6a%40mail.gmail.com

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

I will be looking to commit this tomorrow morning, unless I hear some
clear No comments, with reasons.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#15 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#14)
Re: Replication server timeout patch

On Thu, Feb 17, 2011 at 4:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Wed, 2011-02-16 at 11:34 +0900, Fujii Masao wrote:

On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina <daniel@heroku.com> wrote:

On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina <daniel@heroku.com> wrote:

Context diff equivalent attached.

Thanks for the patch!

As I said before, the timeout which this patch provides doesn't work well
when the walsender gets blocked in sending WAL. At first, we would
need to implement a non-blocking write function as an infrastructure
of the replication timeout, I think.
http://archives.postgresql.org/message-id/AANLkTi%3DPu2ne%3DVO-%2BCLMXLQh9y85qumLCbBP15CjnyUS%40mail.gmail.com

Interesting point...if that's accepted as required-for-commit, what
are the perceptions of the odds that, presuming I can write the code
quickly enough, there's enough infrastructure/ports already in
postgres to allow for a non-blocking write on all our supported
platforms?

I'm not sure if there's already enough infrastructure for a non-blocking
write. But the patch which I submitted before might help to implement that.
http://archives.postgresql.org/message-id/AANLkTinSvcdAYryNfZqd0wepyh1Pf7YX6Q0KxhZjas6a%40mail.gmail.com

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

I will be looking to commit this tomorrow morning, unless I hear some
clear No comments, with reasons.

I guess the question is whether it works in 10% of cases or 95% of
cases. In the first case there's probably no point in pretending we
have a feature if it doesn't really work. In the second case, it
might make sense. But I don't have a good feeling for which it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16 Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#14)
Re: Replication server timeout patch

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

Can someone summarize the cases where it does and doesn't work?
There's been a longish gap in this thread.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#17 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#15)
Re: Replication server timeout patch

On Thu, 2011-02-17 at 16:42 -0500, Robert Haas wrote:

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

I will be looking to commit this tomorrow morning, unless I hear some
clear No comments, with reasons.

I guess the question is whether it works in 10% of cases or 95% of
cases. In the first case there's probably no point in pretending we
have a feature if it doesn't really work. In the second case, it
might make sense. But I don't have a good feeling for which it is.

Well, I guess the people that wanted to wait forever may get their wish.

For sync rep, I intend to put in place a client timeout, which we do
have code for. The server side timeout still makes sense, but it's not a
requirement for sync rep.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#18 Fujii Masao
masao.fujii@gmail.com
In reply to: Josh Berkus (#16)
Re: Replication server timeout patch

On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

Can someone summarize the cases where it does and doesn't work?
There's been a longish gap in this thread.

The timeout doesn't work when walsender gets blocked during sending the
WAL because the send buffer has been filled up, I'm afraid. IOW, it doesn't
work when the standby becomes unresponsive while WAL is being generated
continuously on the master. Since walsender tries to continue sending the
WAL while the standby is unresponsive, the send buffer gets filled up and
the blocking send function (e.g., pq_flush) blocks the walsender.

OTOH, if the standby becomes unresponsive when there is no workload
which causes WAL, the timeout would work.
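With the socket in non-blocking mode, the stall Fujii describes can be bounded by waiting for writability with a deadline. A rough POSIX sketch, illustrative rather than the patch's implementation:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send with a deadline: if the kernel buffer is full, wait up to
 * timeout_ms for the peer to drain it, then give up instead of
 * blocking indefinitely the way a blocking pq_flush would. */
static ssize_t
send_with_timeout(int fd, const void *buf, size_t len, int timeout_ms)
{
    ssize_t n = send(fd, buf, len, 0);
    if (n >= 0)
        return n;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
        return -1;              /* hard error */

    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;              /* deadline expired, or poll error */
    return send(fd, buf, len, 0);
}
```

The caller can then treat a deadline expiry as grounds to drop the connection, which is exactly the behavior the watchdog needs when the standby stops reading.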

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#19 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#18)
Re: Replication server timeout patch

On Thu, Feb 17, 2011 at 9:10 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, in summary, the position is that we have a timeout, but that timeout
doesn't work in all cases. But it does work in some, so that seems
enough for me to say "let's commit". Not committing gives us nothing at
all, which is as much use as a chocolate teapot.

Can someone summarize the cases where it does and doesn't work?
There's been a longish gap in this thread.

The timeout doesn't work when walsender gets blocked during sending the
WAL because the send buffer has been filled up, I'm afraid. IOW, it doesn't
work when the standby becomes unresponsive while WAL is being generated
continuously on the master. Since walsender tries to continue sending the
WAL while the standby is unresponsive, the send buffer gets filled up and
the blocking send function (e.g., pq_flush) blocks the walsender.

OTOH, if the standby becomes unresponsive when there is no workload
which causes WAL, the timeout would work.

IMHO, that's so broken as to be useless.

I would really like to have a solution to this problem, though.
Relying on TCP keepalives is weak.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#19)
Re: Replication server timeout patch

On Fri, Feb 18, 2011 at 12:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

IMHO, that's so broken as to be useless.

I would really like to have a solution to this problem, though.
Relying on TCP keepalives is weak.

Agreed.

I updated the replication timeout patch which I submitted before.
http://archives.postgresql.org/message-id/AANLkTinSvcdAYryNfZqd0wepyh1Pf7YX6Q0KxhZjas6a%40mail.gmail.com

Since the patch also implements non-blocking send functions,
the timeout can work properly even when the send buffer has
been filled up.

There are two things that I think are pretty clear. If the receiver
has wal_receiver_status_interval=0, then we should ignore
replication_timeout for that connection.

The patch still doesn't check that wal_receiver_status_interval
is set up properly. I'll implement that later.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

replication_timeout_v2.patch (text/x-patch; +461 −136)
#21 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#20)
#22 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#21)
#23 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#22)
#24 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#23)
#25 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#24)
#26 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#25)
#27 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#26)
#28 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#27)
#29 Bruce Momjian
bruce@momjian.us
In reply to: Fujii Masao (#28)
#30 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#28)
#31 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#30)
#32 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#31)
#33 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#32)
#34 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#33)
#35 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#33)
#36 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#34)
#37 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#36)
#38 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#37)
#39 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#38)
#40 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#39)
#41 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#37)
#42 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#41)
#43 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#42)
#44 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#43)
#45 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#44)
#46 Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#45)
#47 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#46)