Synchronization levels in SR

Started by Fujii Masao — 145 messages — pgsql-hackers
#1 Fujii Masao
masao.fujii@gmail.com

Hi,

I'm now designing the "synchronous" replication feature based on
SR for 9.1, while discussing it in another thread:
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php

In the first design phase, I'd like to clarify which synch levels
should be supported in 9.1 and how they should be specified by users.

Log-shipping replication can provide several synch levels, as follows.

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by
the standby
..etc?

Which should we include in 9.1? I'd like to add #2 and #3.
They are enough for high-availability use case (i.e., to
prevent failover from losing any transactions committed).
AFAIR, MySQL semi-synchronous replication supports #2 level.

#4 is useful in some cases, but might often cause the
transaction commit on the master to get stuck, since a read-only
query can easily block recovery through a lock conflict. So
#4 seems not worth working on until that HS problem
has been addressed. Thoughts?

Second, we need to discuss how to specify the synch
level. There are three approaches:

* Per standby
Since the purpose, location and H/W resources often differ
from one standby to another, specifying the level per standby
(i.e., setting it in recovery.conf) is a straightforward
approach, I think. For example, we can choose #3 for a
high-availability standby near the master, and #1 (async)
for a remote disaster-recovery standby.

* Per transaction
Define a PGC_USERSET option specifying the level, and set
it on the master according to the purpose of each
transaction. In this approach, for example, we can choose
#4 for a transaction that should be visible on the
standby as soon as a "success" of the commit has been
returned to the client. We can also choose #1 for
time-critical but not mission-critical transactions.

* Mix
Allow users to specify the level per standby and per
transaction at the same time, and then derive the effective
level from both by some algorithm.
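
To make the first two approaches concrete, here are small sketches. The parameter names (replication_mode, synchronous_replication_level) are invented for illustration only; neither exists in 9.0, and the eventual syntax is exactly what is under discussion here.

Per standby, in each standby's recovery.conf:

```
# high-availability standby near the master: level #3
replication_mode = 'recv_flush'    # hypothetical parameter

# remote disaster-recovery standby: level #1
replication_mode = 'async'         # hypothetical parameter
```

Per transaction, as a PGC_USERSET GUC set on the master:

```
SET synchronous_replication_level = 'replay';  -- hypothetical GUC, level #4
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;  -- would block until the standby has replayed the commit record
```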

Which should we adopt for 9.1? I'd like to implement the
"per-standby" approach first, since it's simple and seems
to cover more use cases. Thoughts?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#1)
Re: Synchronization levels in SR

On 24/05/10 16:20, Fujii Masao wrote:

The log-shipping replication has some synch levels as follows.

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by
the standby
..etc?

Which should we include in 9.1? I'd like to add #2 and #3.
They are enough for high-availability use case (i.e., to
prevent failover from losing any transactions committed).
AFAIR, MySQL semi-synchronous replication supports #2 level.

#4 is useful for some cases, but might often make the
transaction commit on the master get stuck since read-only
query can easily block recovery by the lock conflict. So
#4 seems not to be worth working on until that HS problem
has been addressed. Thoughts?

I see a lot of value in #4; it makes it possible to distribute read-only
load to the standby using something like pgbouncer, completely
transparently to the application. In the lesser modes, the application
can see slightly stale results.

But whatever we can easily implement, really. Pick the one you think is
the easiest and start with that, but keep the other modes in mind in the
design and in the user interface so that you don't paint yourself into a
corner.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3 Josh Berkus
josh@agliodbs.com
In reply to: Fujii Masao (#1)
Re: Synchronization levels in SR

#4 is useful for some cases, but might often make the
transaction commit on the master get stuck since read-only
query can easily block recovery by the lock conflict. So
#4 seems not to be worth working on until that HS problem
has been addressed. Thought?

I agree that #4 should be done last, but it will be needed, not
least by your employer ;-) . I don't see any obvious way to make #4
compatible with any significant query load on the slave, but in general
I'd think that users of #4 are far more concerned with 0% data loss than
they are with getting the slave to run read queries.

Second, we need to discuss about how to specify the synch
level. There are three approaches:

* Per standby

* Per transaction

Ach, I'm torn. I can see strong use cases for both of the above.
Really, I think:

* Mix
Allow users to specify the level per standby and
transaction at the same time, and then calculate the real
level from them by using some algorithm.

What we should do is specify it per-standby, and then have a USERSET GUC
on the master which specifies which transactions will be synched, and
those will be synched only on the slaves which are set up to support
synch. That is, if you have:

Master
Slave #1: synch
Slave #2: not synch
Slave #3: not synch

And you have:
Session #1: synch
Session #2: not synch

Session #1 will be synched on Slave #1 before commit. Nothing will be
synched on Slaves 2 and 3, and session #2 will not wait for synch on any
slave.

I think this model delivers the maximum HA flexibility to users while
still making intuitive logical sense.
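
As a sketch of how the two settings would combine (illustrative Python only, not PostgreSQL code): a committing session waits exactly for the standbys configured as synch, and only when the session itself asked for synch.

```python
def standbys_to_wait_for(standby_is_synch, session_wants_synch):
    """Which standbys must acknowledge before this session's COMMIT returns?

    standby_is_synch: dict mapping standby name -> True if configured synch.
    session_wants_synch: the hypothetical per-session USERSET GUC.
    """
    if not session_wants_synch:
        return []  # async session: wait for nobody
    return [name for name, synch in standby_is_synch.items() if synch]

# The scenario above: Slave #1 synch, Slaves #2 and #3 not.
slaves = {"slave1": True, "slave2": False, "slave3": False}
```

So Session #1 (synch) waits only on slave1, and Session #2 (not synch) waits on nobody, matching the example.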

Which should we adopt for 9.1? I'd like to implement the
"per-standby" approach at first since it's simple and seems
to cover more use cases. Thought?

If people agree that the above is our roadmap, implementing
"per-standby" first makes sense, and then we can implement "per-session"
GUC later.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#4 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: Synchronization levels in SR

On Tue, May 25, 2010 at 1:18 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I see a lot of value in #4; it makes it possible to distribute read-only
load to the standby using something like pgbouncer, completely transparently
to the application.

Agreed.

In the lesser modes, the application can see slightly
stale results.

Yes

BTW, even if we had #4, we would need to be careful that
we might see uncommitted results on the standby. That is,
a transaction commit might become visible on the standby before
the master returns its "success" to the client. I think that we
will never get completely transaction-consistent results
from the standby until we have implemented the "snapshot cloning"
feature.
http://wiki.postgresql.org/wiki/ClusterFeatures#Export_snapshots_to_other_sessions

But whatever we can easily implement, really. Pick one that you think is the
easiest and start with that, but keep the other modes in mind in the design
and in the user interface so that you don't paint yourself into a corner.

Yep, the design and implementation of #2 and #3 should be
easily extensible to #4. I'll keep that in mind.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#5 Fujii Masao
masao.fujii@gmail.com
In reply to: Josh Berkus (#3)
Re: Synchronization levels in SR

On Tue, May 25, 2010 at 10:29 AM, Josh Berkus <josh@agliodbs.com> wrote:

I agree that #4 should be done last, but it will be needed, not in the
least by your employer ;-) .  I don't see any obvious way to make #4
compatible with any significant query load on the slave, but in general
I'd think that users of #4 are far more concerned with 0% data loss than
they are with getting the slave to run read queries.

Since #2 and #3 are enough for 0% data loss, I think that such users
would be more concerned about which results are visible on the standby.
No?

What we should do is specify it per-standby, and then have a USERSET GUC
on the master which specifies which transactions will be synched, and
those will be synched only on the slaves which are set up to support
synch.  That is, if you have:

Master
Slave #1: synch
Slave #2: not synch
Slave #3: not synch

And you have:
Session #1: synch
Session #2: not synch

Session #1 will be synched on Slave #1 before commit.  Nothing will be
synched on Slaves 2 and 3, and session #2 will not wait for synch on any
slave.

I think this model delivers the maximum HA flexibility to users while
still making intuitive logical sense.

This makes sense.

Since such a boolean GUC flag is relatively easy and simple to implement
compared with "per-transaction" levels (which have four valid values: #1,
#2, #3 and #4), I'll do that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#6 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#1)
Re: Synchronization levels in SR

On Mon, May 24, 2010 at 10:20 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

At the first design phase, I'd like to clarify which synch levels
should be supported in 9.1 and how it should be specified by users.

There is another question about synch level:

When should the master wait for replication?

In my current design, the backend waits for replication only at
the end of the transaction commit. Is this enough? Are there
other waiting points?

For example, should a smart or fast shutdown on the master wait
for the shutdown checkpoint record to be replicated to the standby
(btw, in 9.0, shutdown waits for the checkpoint record to be *sent*)?
Should pg_switch_xlog() wait for all of the original WAL file to
be replicated?

I'm not sure whether these two "wait-for-replication" points have use
cases, so I'm inclined to think they are not worth implementing, but..

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#7 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#5)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote:

On Tue, May 25, 2010 at 10:29 AM, Josh Berkus <josh@agliodbs.com> wrote:

I agree that #4 should be done last, but it will be needed, not in the
least by your employer ;-) . I don't see any obvious way to make #4
compatible with any significant query load on the slave, but in general
I'd think that users of #4 are far more concerned with 0% data loss than
they are with getting the slave to run read queries.

Since #2 and #3 are enough for 0% data loss, I think that such users
would be more concerned about what results are visible in the standby.
No?

Please add #4 also. You can do that easily at the same time as #2 and
#3, and it will leave me free to fix the perceived conflict problems.

--
Simon Riggs www.2ndQuadrant.com

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#1)
Re: Synchronization levels in SR

On Mon, 2010-05-24 at 22:20 +0900, Fujii Masao wrote:

Second, we need to discuss about how to specify the synch
level. There are three approaches:

* Per standby
Since the purpose, location and H/W resource often differ
from one standby to another, specifying level per standby
(i.e., we set the level in recovery.conf) is a
straightforward approach, I think. For example, we can
choose #3 for high-availability standby near the master,
and choose #1 (async) for the disaster recovery standby
remote.

* Per transaction
Define the PGC_USERSET option specifying the level and
specify it on the master in response to the purpose of
transaction. In this approach, for example, we can choose
#4 for the transaction which should be visible on the
standby as soon as a "success" of the commit has been
returned to a client. We can also choose #1 for
time-critical but not mission-critical transaction.

* Mix
Allow users to specify the level per standby and
transaction at the same time, and then calculate the real
level from them by using some algorithm.

Which should we adopt for 9.1? I'd like to implement the
"per-standby" approach at first since it's simple and seems
to cover more use cases. Thought?

-1

Synchronous replication implies that a commit should wait. This wait is
experienced by the transaction, not by other parts of the system. If we
define robustness at the standby level then robustness depends upon
unseen administrators, as well as the current up/down state of standbys.
This is action-at-a-distance in its worst form.

Imagine having 2 standbys, 1 synch, 1 async. If the synch server goes
down, performance will improve and robustness will have been lost. What
good would that be?

Imagine a standby connected over a long distance. DBA brings up standby
in synch mode accidentally and the primary server hits massive
performance problems without any way of directly controlling this.

The worst aspect of standby-level controls is that nobody ever knows how
safe a transaction is. There is no definition or test for us to check
exactly how safe any particular transaction is. Also, the lack of safety
occurs at the time when you least want it - when one of your servers is
already down.

So I call "per-standby" settings simple, and broken in multiple ways.

Putting the control in the hands of the transaction owner (i.e. on the
master) is exactly where the control should be. I personally like the
idea of that being a USERSET, though could live with system wide
settings if need be. But the control must be on the *master* not on the
standbys.

The best parameter we can specify is the number of servers that we wish
to wait for confirmation from. That is a definition that easily manages
the complexity of having various servers up/down at any one time. It
also survives misconfiguration more easily, as well as providing a
workaround if replicating across a bursty network where we can't
guarantee response times, even if the typical response time is good.

(We've discussed this many times before over a period of years, and I'm
not really sure why we have to re-discuss it repeatedly just because
people disagree. You don't mention the earlier discussions; I'm not sure
why. If we want to follow the community process, then all previous
discussions need to be taken into account, unless things have changed -
which they haven't: same topic, same people, AFAICS.)

--
Simon Riggs www.2ndQuadrant.com

#9 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#3)
Re: Synchronization levels in SR

On Mon, 2010-05-24 at 18:29 -0700, Josh Berkus wrote:

If people agree that the above is our roadmap, implementing
"per-standby" first makes sense, and then we can implement "per-session"
GUC later.

IMHO "per-standby" sounds simple, but is dangerously simplistic, as
explained in another part of the thread.

We need to think clearly about failure modes and how they will be
handled. Failure modes and edge cases completely govern the design here.
"All running smoothly" isn't a major concern and so it appears that the
user interface can be done various ways.

--
Simon Riggs www.2ndQuadrant.com

#10 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#8)
Re: Synchronization levels in SR

On Tue, May 25, 2010 at 12:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Synchronous replication implies that a commit should wait. This wait is
experienced by the transaction, not by other parts of the system. If we
define robustness at the standby level then robustness depends upon
unseen administrators, as well as the current up/down state of standbys.
This is action-at-a-distance in its worst form.

Maybe, but I can't help thinking people are going to want some form of
this. The case where someone wants to do sync rep to the machine in
the next rack over and async rep to a server at a remote site seems
too important to ignore.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#11 Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Haas (#10)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote:

On Tue, May 25, 2010 at 12:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Synchronous replication implies that a commit should wait. This wait is
experienced by the transaction, not by other parts of the system. If we
define robustness at the standby level then robustness depends upon
unseen administrators, as well as the current up/down state of standbys.
This is action-at-a-distance in its worst form.

Maybe, but I can't help thinking people are going to want some form of
this. The case where someone wants to do sync rep to the machine in
the next rack over and async rep to a server at a remote site seems
too important to ignore.

Uhh yeah, that is pretty much the standard use case. The "next rack" is
only 50% of the equation. The other half is the disaster-recovery rack
over 100Mb (or even 10Mb) halfway across the country. It is
common, very common.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

#12 MMK
bomuvi@yahoo.com
In reply to: Simon Riggs (#8)
Confused about the buffer pool size

Hello All:
In the code (costsize.c), I see that effective_cache_size is set to
DEFAULT_EFFECTIVE_CACHE_SIZE. This is defined as follows in cost.h:

#define DEFAULT_EFFECTIVE_CACHE_SIZE 16384

But when I say "show shared_buffers" in psql I get:

 shared_buffers
----------------
 28MB

In the postgresql.conf file, the following lines appear:

shared_buffers = 28MB    # min 128kB  (change requires restart)
#temp_buffers = 8MB      # min 800kB

So I am assuming that the buffer pool size is 28MB = 28 * 128 = 3584
8K pages. So should effective_cache_size be set to 3584 rather than 16384?
Thanks,
MMK.

#13 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#10)
Re: Synchronization levels in SR

Robert Haas <robertmhaas@gmail.com> wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

If we define robustness at the standby level then robustness
depends upon unseen administrators, as well as the current
up/down state of standbys. This is action-at-a-distance in its
worst form.

Maybe, but I can't help thinking people are going to want some
form of this. The case where someone wants to do sync rep to the
machine in the next rack over and async rep to a server at a
remote site seems too important to ignore.

I think there may be a terminology issue here -- I took "configure
by standby" to mean that *at the master* you would specify rules for
each standby. I think Simon took it to mean that each standby would
define the rules for replication to it. Maybe this issue can
resolve gracefully with a bit of clarification?

-Kevin

#14 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#10)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote:

On Tue, May 25, 2010 at 12:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Synchronous replication implies that a commit should wait. This wait is
experienced by the transaction, not by other parts of the system. If we
define robustness at the standby level then robustness depends upon
unseen administrators, as well as the current up/down state of standbys.
This is action-at-a-distance in its worst form.

Maybe, but I can't help thinking people are going to want some form of
this.
The case where someone wants to do sync rep to the machine in
the next rack over and async rep to a server at a remote site seems
too important to ignore.

The use case of "machine in the next rack over and async rep to a server
at a remote site" *is* important, but you give no explanation as to why
that implies "per-standby" is the solution to it.

If you read the rest of my email, you'll see that I have explained the
problems "per-standby" settings would cause.

Please don't be so quick to claim it is me ignoring anything.

--
Simon Riggs www.2ndQuadrant.com

#15 Alastair Turner
bell@ctrlf5.co.za
In reply to: Simon Riggs (#8)
Re: Synchronization levels in SR

On Tue, May 25, 2010 at 6:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
.......

The best parameter we can specify is the number of servers that we wish
to wait for confirmation from. That is a definition that easily manages
the complexity of having various servers up/down at any one time. It
also survives misconfiguration more easily, as well as providing a
workaround if replicating across a bursty network where we can't
guarantee response times, even if the typical response time is good.

This may be an incredibly naive question, but what happens to the
transaction on the master if the number of confirmations is not
received? Is this intended to create a situation where the master
effectively becomes unavailable for write operations when its
synchronous slaves are unavailable?

Alastair "Bell" Turner

^F5

#16 Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#13)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

If we define robustness at the standby level then robustness
depends upon unseen administrators, as well as the current
up/down state of standbys. This is action-at-a-distance in its
worst form.

Maybe, but I can't help thinking people are going to want some
form of this. The case where someone wants to do sync rep to the
machine in the next rack over and async rep to a server at a
remote site seems too important to ignore.

I think there may be a terminology issue here -- I took "configure
by standby" to mean that *at the master* you would specify rules for
each standby. I think Simon took it to mean that each standby would
define the rules for replication to it. Maybe this issue can
resolve gracefully with a bit of clarification?

The use case of "machine in the next rack over and async rep to a server
at a remote site" would require the settings

server.nextrack = synch
server.remotesite = async

which leaves open the question of what happens when "nextrack" is down.

In many cases, to give adequate performance in that situation, people add
an additional server, so the config becomes

server.nextrack1 = synch
server.nextrack2 = synch
server.remotesite = async

We then want to specify for performance reasons that we can get a reply
from either nextrack1 or nextrack2, so it all still works safely and
quickly if one of them is down. How can we express that rule concisely?
With some difficulty.

My suggestion is simply to have a single parameter (name unimportant)

number_of_synch_servers_we_wait_for = N

which is much easier to understand because it is phrased in terms of the
guarantee given to the transaction, not in terms of what the admin
thinks the situation is.
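
The quorum semantics can be sketched like so (illustrative Python; the parameter name from the message is kept): the commit can be acknowledged once any N standbys have confirmed, so one nearby standby being down costs neither safety nor availability as long as N others still respond.

```python
def commit_may_return(acked_standbys, number_of_synch_servers_we_wait_for):
    """Quorum rule: any N confirmations are enough, regardless of which
    standbys they came from."""
    return len(acked_standbys) >= number_of_synch_servers_we_wait_for

# N = 1 with two nearby standbys: either reply releases the commit.
commit_may_return({"nextrack2"}, 1)   # True even though nextrack1 is down
```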

--
Simon Riggs www.2ndQuadrant.com

#17 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#16)
Re: Synchronization levels in SR

On Tue, May 25, 2010 at 1:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

If we define robustness at the standby level then robustness
depends upon unseen administrators, as well as the current
up/down state of standbys.  This is action-at-a-distance in its
worst form.

Maybe, but I can't help thinking people are going to want some
form of this.  The case where someone wants to do sync rep to the
machine in the next rack over and async rep to a server at a
remote site seems too important to ignore.

I think there may be a terminology issue here -- I took "configure
by standby" to mean that *at the master* you would specify rules for
each standby.  I think Simon took it to mean that each standby would
define the rules for replication to it.  Maybe this issue can
resolve gracefully with a bit of clarification?

The use case of "machine in the next rack over and async rep to a server
at a remote site" would require the settings

server.nextrack = synch
server.remotesite = async

which leaves open the question of what happens when "nextrack" is down.

In many cases, to give adequate performance in that situation people add
an additional server, so the config becomes

server.nextrack1 = synch
server.nextrack2 = synch
server.remotesite = async

We then want to specify for performance reasons that we can get a reply
from either nextrack1 or nextrack2, so it all still works safely and
quickly if one of them is down. How can we express that rule concisely?
With some difficulty.

Perhaps the difficulty here is that those still look like per-server
settings to me. Just maybe with a different set of semantics.

My suggestion is simply to have a single parameter (name unimportant)

number_of_synch_servers_we_wait_for = N

which is much easier to understand because it is phrased in terms of the
guarantee given to the transaction, not in terms of what the admin
thinks is the situation.

So I agree that we need to talk about whether or not we want to do
this. I'll give my opinion. I am not sure how useful this really is.
Consider a master with two standbys. The master commits a
transaction and waits for one of the two standbys, then acknowledges
the commit back to the user. Then the master crashes. Now what?
It's not immediately obvious which standby we should bring online as
the primary, and if we guess wrong we could lose transactions thought
to be committed. This is probably a solvable problem, with enough
work: we can write a script to check the last LSN received by each of
the two standbys and promote whichever one is further along.
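
The check described here — promote whichever standby is further along — comes down to comparing the last WAL location each standby received. A simplified sketch (LSNs in the usual 'X/Y' hex form; function names invented):

```python
def parse_lsn(lsn):
    """Turn an 'X/Y' hexadecimal WAL location into a comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def promotion_candidate(last_received):
    """Pick the standby that has received the most WAL."""
    return max(last_received, key=lambda name: parse_lsn(last_received[name]))

# standby2 is 0xF8 bytes further along, so it is the safer one to promote.
promotion_candidate({"standby1": "0/3000060", "standby2": "0/3000158"})
```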

But... what happens if the master and one standby BOTH crash
simultaneously? There's no way of knowing (until we get at least one
of them back up) whether it's safe to promote the other standby.

I like the idea of a "quorum commit" type feature where we promise the
user that things are committed when "enough" servers have acknowledged
the commit. But I think most people are not going to want that
configuration unless we also provide some really good management tools
that we don't have today.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#18 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: MMK (#12)
Re: Confused about the buffer pool size

On 25/05/10 19:49, MMK wrote:

Hello All:
In the code (costsize.c), I see that effective_cache_size is set to
DEFAULT_EFFECTIVE_CACHE_SIZE. This is defined as follows in cost.h:

#define DEFAULT_EFFECTIVE_CACHE_SIZE 16384

But when I say "show shared_buffers" in psql I get:

 shared_buffers
----------------
 28MB

In the postgresql.conf file, the following lines appear:

shared_buffers = 28MB    # min 128kB  (change requires restart)
#temp_buffers = 8MB      # min 800kB

So I am assuming that the buffer pool size is 28MB = 28 * 128 = 3584
8K pages. So should effective_cache_size be set to 3584 rather than 16384?

No. Please see the manual for what effective_cache_size means:

http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE
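
(For what it's worth, the page arithmetic in the question is correct — it's the premise that's off: effective_cache_size is only a planner estimate of the total cache available to a query, OS cache included; it allocates nothing and need not match shared_buffers.)

```python
# The question's unit conversion, spelled out: 28MB expressed in 8KB pages.
shared_buffers_kb = 28 * 1024   # 28MB in KB
page_kb = 8
pages = shared_buffers_kb // page_kb
assert pages == 3584            # matches the 28 * 128 in the question
```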

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#19 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#17)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 13:31 -0400, Robert Haas wrote:

On Tue, May 25, 2010 at 1:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

If we define robustness at the standby level then robustness
depends upon unseen administrators, as well as the current
up/down state of standbys. This is action-at-a-distance in its
worst form.

Maybe, but I can't help thinking people are going to want some
form of this. The case where someone wants to do sync rep to the
machine in the next rack over and async rep to a server at a
remote site seems too important to ignore.

I think there may be a terminology issue here -- I took "configure
by standby" to mean that *at the master* you would specify rules for
each standby. I think Simon took it to mean that each standby would
define the rules for replication to it. Maybe this issue can
resolve gracefully with a bit of clarification?

The use case of "machine in the next rack over and async rep to a server
at a remote site" would require the settings

server.nextrack = synch
server.remotesite = async

which leaves open the question of what happens when "nextrack" is down.

In many cases, to give adequate performance in that situation people add
an additional server, so the config becomes

server.nextrack1 = synch
server.nextrack2 = synch
server.remotesite = async

We then want to specify for performance reasons that we can get a reply
from either nextrack1 or nextrack2, so it all still works safely and
quickly if one of them is down. How can we express that rule concisely?
With some difficulty.

Perhaps the difficulty here is that those still look like per-server
settings to me. Just maybe with a different set of semantics.

(Those are the per-server settings.)

--
Simon Riggs www.2ndQuadrant.com

#20 Simon Riggs
simon@2ndQuadrant.com
In reply to: Alastair Turner (#15)
Re: Synchronization levels in SR

On Tue, 2010-05-25 at 19:08 +0200, Alastair Turner wrote:

On Tue, May 25, 2010 at 6:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
.......

The best parameter we can specify is the number of servers that we wish
to wait for confirmation from. That is a definition that easily manages
the complexity of having various servers up/down at any one time. It
also survives misconfiguration more easily, as well as providing a
workaround if replicating across a bursty network where we can't
guarantee response times, even if the typical response time is good.

This may be an incredibly naive question, but what happens to the
transaction on the master if the number of confirmations is not
received? Is this intended to create a situation where the master
effectively becomes unavailable for write operations when its
synchronous slaves are unavailable?

How we handle degraded mode is important, yes. Whatever parameters we
choose, the problem will remain the same.

Should we just ignore degraded mode and respond as if nothing bad had
happened? Most people would say not.

If we specify server1 = synch and server2 = async, we then also need to
specify what happens if server1 is down. People might often want to
specify: if (server1 == down) then server2 = synch.
So now we have 3 configuration settings, one of them quite complex.

It's much easier to say you want to wait for N servers to respond, but
don't care which they are. One parameter, simple and flexible.

In both cases, we have to figure out what to do if we can't get either
server to respond. In replication there is no such thing as "server
down", just "server didn't reply within time X". So we need to define
timeouts.

So whatever we do, we need additional parameters to specify timeouts
(including wait-forever as an option) and action-on-timeout: commit or
rollback.
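
Those last two knobs can be sketched together (illustrative Python; names and return values invented): the commit waits for the quorum until a timeout, then either acknowledges anyway in degraded mode or aborts.

```python
import time

def wait_for_quorum(current_acks, quorum, timeout_s, action_on_timeout="commit"):
    """Block until `quorum` standbys have acked, or `timeout_s` elapses.

    current_acks: callable returning the number of acks received so far.
    action_on_timeout: "commit" (acknowledge in degraded mode) or "rollback".
    A timeout_s of float("inf") gives the wait-forever behaviour.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if current_acks() >= quorum:
            return "committed"
        time.sleep(0.01)  # in a real server this would be a latch wait
    return "committed-degraded" if action_on_timeout == "commit" else "aborted"
```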

--
Simon Riggs www.2ndQuadrant.com

#21 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#17)
#22 Yeb Havinga
yebhavinga@gmail.com
In reply to: Simon Riggs (#20)
#23 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Simon Riggs (#20)
#24 MMK
bomuvi@yahoo.com
In reply to: Heikki Linnakangas (#18)
#25 Simon Riggs
simon@2ndQuadrant.com
In reply to: Yeb Havinga (#22)
#26 Josh Berkus
josh@agliodbs.com
In reply to: MMK (#24)
#27 Florian Pflug
fgp@phlo.org
In reply to: Simon Riggs (#25)
#28 Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#16)
#29 Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#28)
#30 Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#7)
#31 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#29)
#32 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#30)
#33 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#28)
#34 Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#33)
#35 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#31)
#36 Alastair Turner
bell@ctrlf5.co.za
In reply to: Robert Haas (#35)
#37 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#34)
#38 Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#34)
#39 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#35)
#40 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#39)
#41 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#40)
#42 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Heikki Linnakangas (#41)
#43 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#42)
#44 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#40)
#45 Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#42)
#46 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Heikki Linnakangas (#43)
#47 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#41)
#48 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#43)
#49 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#46)
#50 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Heikki Linnakangas (#49)
#51 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#47)
#52 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#44)
#53 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#52)
#54 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#53)
#55 Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Haas (#54)
#56 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Simon Riggs (#47)
#57 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Dimitri Fontaine (#56)
#58Jan Wieck
JanWieck@Yahoo.com
In reply to: Heikki Linnakangas (#41)
#59Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Dimitri Fontaine (#56)
#60Simon Riggs
simon@2ndQuadrant.com
In reply to: Jan Wieck (#58)
#61Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#57)
#62Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#61)
#63Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#37)
#64Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#63)
#65Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#62)
#66Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#65)
#67Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#38)
#68Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#64)
#69Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Heikki Linnakangas (#59)
#70Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#66)
#71Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#68)
#72Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#67)
#73Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#67)
#74Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#71)
#75Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#74)
#76Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#73)
#77Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#75)
#78Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#67)
#79Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Simon Riggs (#73)
#80Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#77)
#81Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#76)
#82Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#78)
#83Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#80)
#84Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#81)
#85Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#82)
#86Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#85)
#87Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#86)
#88Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#37)
#89Bruce Momjian
bruce@momjian.us
In reply to: Heikki Linnakangas (#59)
#90Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#89)
#91Greg Smith
gsmith@gregsmith.com
In reply to: Heikki Linnakangas (#59)
#92Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Smith (#91)
#93Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Smith (#91)
#94Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#92)
#95Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#94)
#96Jan Wieck
JanWieck@Yahoo.com
In reply to: Bruce Momjian (#89)
#97Robert Haas
robertmhaas@gmail.com
In reply to: Jan Wieck (#96)
#98David Fetter
david@fetter.org
In reply to: Robert Haas (#97)
#99Jan Wieck
JanWieck@Yahoo.com
In reply to: Robert Haas (#97)
#100Robert Haas
robertmhaas@gmail.com
In reply to: Jan Wieck (#99)
#101Jan Wieck
JanWieck@Yahoo.com
In reply to: Robert Haas (#100)
In reply to: Dimitri Fontaine (#79)
#103Fujii Masao
masao.fujii@gmail.com
In reply to: Boszormenyi Zoltan (#102)
In reply to: Fujii Masao (#103)
#105Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Boszormenyi Zoltan (#102)
In reply to: Dimitri Fontaine (#105)
#107Simon Riggs
simon@2ndQuadrant.com
In reply to: Boszormenyi Zoltan (#106)
#108Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#107)
In reply to: Simon Riggs (#107)
#110Simon Riggs
simon@2ndQuadrant.com
In reply to: Boszormenyi Zoltan (#109)
#111Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#110)
#112Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#111)
#113Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#108)
#114Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#110)
#115Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#78)
#116Robert Haas
robertmhaas@gmail.com
In reply to: Markus Wanner (#115)
#117Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#116)
#118Robert Haas
robertmhaas@gmail.com
In reply to: Markus Wanner (#117)
#119Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#118)
#120Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Markus Wanner (#117)
#121Simon Riggs
simon@2ndQuadrant.com
In reply to: Markus Wanner (#119)
#122Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Wanner (#119)
#123Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#122)
#124Markus Wanner
markus@bluegap.ch
In reply to: Ron Mayer (#120)
#125Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#123)
#126Markus Wanner
markus@bluegap.ch
In reply to: Tom Lane (#122)
#127Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#121)
#128Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#127)
#129Markus Wanner
markus@bluegap.ch
In reply to: Markus Wanner (#126)
#130marcin mank
marcin.mank@gmail.com
In reply to: Tom Lane (#122)
#131Robert Haas
robertmhaas@gmail.com
In reply to: marcin mank (#130)
#132Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#110)
In reply to: Fujii Masao (#132)
#134Fujii Masao
masao.fujii@gmail.com
In reply to: Boszormenyi Zoltan (#133)
In reply to: Fujii Masao (#134)
#136Robert Haas
robertmhaas@gmail.com
In reply to: Boszormenyi Zoltan (#135)
#137Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#136)
#138Markus Wanner
markus@bluegap.ch
In reply to: Boszormenyi Zoltan (#133)
#139Simon Riggs
simon@2ndQuadrant.com
In reply to: Boszormenyi Zoltan (#133)
#140Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#137)
#141Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#140)
#142Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#141)
#143Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#142)
#144David Fetter
david@fetter.org
In reply to: Simon Riggs (#143)
#145Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#143)