Sync Rep for 2011CF1
Here's the latest patch for sync rep.
From here, I will be developing the patch further on public git
repository towards commit. My expectation is that commit is at least 2
weeks away, though there are no major unresolved problems. I expect
essential follow on patches to continue for a further 2-4 weeks after
that first commit.
I will add my own reviewer's notes tomorrow.
In terms of testing, the patch hasn't been tested further than my own
laptop as yet, so it seems likely there's a few trivial howlers in
there. That is simply because of my recent flu.
I've requested Heikki as main reviewer and he's accepted. Other comments
about the user interface and the reply protocol are also welcome. Please
don't bother performance testing yet. I'll let you know when that is
appropriate.
--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
Attachments:
syncrep.v9.patch (text/x-patch; charset=UTF-8; +1765 -79)
On Sat, Jan 15, 2011 at 22:40, Simon Riggs <simon@2ndquadrant.com> wrote:
Here's the latest patch for sync rep.
From here, I will be developing the patch further on public git
repository towards commit. My expectation is that commit is at least 2
That's great. Just one tiny detail - which repository and which branch? ;)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
(grr, I wrote this on Monday already, but just found it in my drafts
folder, unsent)
On 15.01.2011 23:40, Simon Riggs wrote:
Here's the latest patch for sync rep.
From here, I will be developing the patch further on public git
repository towards commit. My expectation is that commit is at least 2
weeks away, though there are no major unresolved problems. I expect
essential follow on patches to continue for a further 2-4 weeks after
that first commit.
Thanks! Some quick observations after first read-through:
* The docs for synchronous_replication still claim that it means two
different things in master and standby. Looking at the code, I believe
that's not true anymore.
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
* Please separate the hot standby feedback loop into a separate patch on
top of the synch rep patch. I know it's not a lot of code, but it's
still easier to handle features separately.
* The UI differs from what was agreed on here:
http://archives.postgresql.org/message-id/4D1DCF5A.7070808@enterprisedb.com.
* Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(),
it's probably better to set synchronous_commit=off locally when the
autovacuum process starts.
* the "queue id" thing is dead code at the moment, as there is only one
queue. I gather this is a leftover from having different queues for
"apply", "sync", "write" modes, but I think it would be better to just
remove it for now.
PS, I'm surprised how small this patch is. Thinking about it some more,
I don't know why I expected this to be a big patch.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
(grr, I wrote this on Monday already, but just found it in my drafts
folder, unsent)
No worries, thanks for commenting.
Thanks! Some quick observations after first read-through:
* The docs for synchronous_replication still claim that it means two
different things in master and standby. Looking at the code, I believe
that's not true anymore.
Probably. The docs changed so many times I had gone "code-blind".
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.
Perhaps discuss that on a separate thread? See what everyone thinks?
* Please separate the hot standby feedback loop into a separate patch on
top of the synch rep patch. I know it's not a lot of code, but it's
still easier to handle features separately.
I tried to do that initially, but there is interaction between those
features. The way I have it is that the replies from the standby act as
keepalives to the master. So the hot standby feedback is just an extra
parameter and an extra field. Removing that doesn't really make the
patch any easier to understand.
* The UI differs from what was agreed on here:
http://archives.postgresql.org/message-id/4D1DCF5A.7070808@enterprisedb.com.
You mean synchronous_standbys is not there yet? Yes, I know. It can be
added after we commit this, it's only a small bit of code and no
dependencies. I figured we had bigger things to agree first.
* Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(),
it's probably better to set synchronous_commit=off locally when the
autovacuum process starts.
Even better plan, thanks.
* the "queue id" thing is dead code at the moment, as there is only one
queue. I gather this is a leftover from having different queues for
"apply", "sync", "write" modes, but I think it would be better to just
remove it for now.
It's a trivial patch to add options to either fsync or apply, so I was
expecting to add that back in this release also.
PS, I'm surprised how small this patch is. Thinking about it some more,
I don't know why I expected this to be a big patch.
Yes, it's the decisions which seem fairly big this time.
--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
On Fri, Jan 21, 2011 at 7:45 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available. What
if you just want to run a read-only query?
For what it's worth, +1.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jan 21, 2011 at 14:24, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.
Perhaps discuss that on a separate thread? See what everyone thinks?
I'll respond here once, and we'll see if more people want to comment
then we can move it :-)
Doesn't this make a pretty strange assumption - namely that you have a
single application? We support multiple databases, and multiple users,
and multiple pretty much anything - in most cases, people deploy
multiple apps. (They may well be part of the same "solution" or
whatever you want to call it, but parts may well be readonly - like a
reporting app, or even just a monitoring client)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 21.01.2011 15:24, Simon Riggs wrote:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.
It's also possible that most of your transactions in fact do "set
synchronous_replication=off", and only a few actually do synchronous
replication. It would be pretty bad to not allow connections in that
case. And what if you want to connect to the server to diagnose the
issue? Oh, you can't... Besides, we're not kicking out existing
connections, are we? Seems inconsistent to let the old connections live.
IMHO the only reasonable option is to allow connections as usual, and
only fail (or block forever) at COMMIT.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Jan 21, 2011 at 10:33 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
It's also possible that most of your transactions in fact do "set
synchronous_replication=off", and only a few actually do synchronous
replication. It would be pretty bad to not allow connections in that case.
And what if you want to connect to the server to diagnose the issue? Oh, you
can't... Besides, we're not kicking out existing connections, are we? Seems
inconsistent to let the old connections live.
IMHO the only reasonable option is to allow connections as usual, and only
fail (or block forever) at COMMIT.
Another point is that the synchronous standby could come back at any
time. There's no reason not to let the client do all the work they
want up until the commit - maybe the standby will pop back up before
the COMMIT is actually issued. Or even if it doesn't, as soon as it pops
back up, all those COMMITs get released.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, 2011-01-21 at 17:33 +0200, Heikki Linnakangas wrote:
On 21.01.2011 15:24, Simon Riggs wrote:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.
It's also possible that most of your transactions in fact do "set
synchronous_replication=off", and only a few actually do synchronous
replication. It would be pretty bad to not allow connections in that
case. And what if you want to connect to the server to diagnose the
issue? Oh, you can't... Besides, we're not kicking out existing
connections, are we? Seems inconsistent to let the old connections live.
IMHO the only reasonable option is to allow connections as usual, and
only fail (or block forever) at COMMIT.
We all think our own proposed options are the only reasonable thing, but
that helps us not at all in moving forwards. I've put much time into
delivering options many other people want, so there is a range of
function. I think we should hear from Aidan first before we decide to
remove that aspect.
--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
On Fri, 2011-01-21 at 14:34 +0100, Magnus Hagander wrote:
On Fri, Jan 21, 2011 at 14:24, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
* it seems like overkill to not let clients to even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?
That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.
Perhaps discuss that on a separate thread? See what everyone thinks?
I'll respond here once, and we'll see if more people want to comment
then we can move it :-)
Doesn't this make a pretty strange assumption - namely that you have a
single application? We support multiple databases, and multiple users,
and multiple pretty much anything - in most cases, people deploy
multiple apps. (They may well be part of the same "solution" or
whatever you want to call it, but parts may well be readonly - like a
reporting app, or even just a monitoring client)
There are various problems whatever we do. If we don't like one way, we
must balance that by judging what happens if we do things the other way.
--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
We all think our own proposed options are the only reasonable thing, but
that helps us not at all in moving forwards. I've put much time into
delivering options many other people want, so there is a range of
function. I think we should hear from Aidan first before we decide to
remove that aspect.
Since invited, I'll describe what I *want* to do. I understand I
may not get it ;-)
When no sync slave is connected, yes, I want to stop things hard. I
don't mind read-only queries working, but what I want to avoid (if
possible) is having the master do lots of inserts/updates/deletes for
clients, fsyncing them all to disk (so on some strange event causing
recovery they'll be considered committed) and just delay the commit
return until it has a valid sync slave connected and caught up again.
And *I*'ld prefer if client transactions get errors right away rather
than begin to hang if a sync slave is not connected.
Even with single server, there's the window where stuff could be
"committed" but the client not notified yet. And that leads to
transactions which need to be verified. And with sync rep, that
window gets a little larger. But I'ld prefer not to make it a hangar
door, *especially* when it gets flung open at the point where the shit
has hit the fan and we're in the midst of switching over to manual
processing...
So, in my case, I'ld like it if PG couldn't do anything to generate
any user-initiated WAL unless there is a sync slave connected. Yes, I
understand that leads to hard-fail, and yes, I understand I'm in the
minority, maybe almost singular in that desire.
a.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
We all think our own proposed options are the only reasonable thing, but
that helps us not at all in moving forwards. I've put much time into
delivering options many other people want, so there is a range of
function. I think we should hear from Aidan first before we decide to
remove that aspect.
Since invited, I'll describe what I *want* to do. I understand I
may not get it ;-)
When no sync slave is connected, yes, I want to stop things hard. I
don't mind read-only queries working, but what I want to avoid (if
possible) is having the master do lots of inserts/updates/deletes for
clients, fsyncing them all to disk (so on some strange event causing
recovery they'll be considered committed) and just delay the commit
return until it has a valid sync slave connected and caught up again.
And *I*'ld prefer if client transactions get errors right away rather
than begin to hang if a sync slave is not connected.
Even with single server, there's the window where stuff could be
"committed" but the client not notified yet. And that leads to
transactions which need to be verified. And with sync rep, that
window gets a little larger. But I'ld prefer not to make it a hangar
door, *especially* when it gets flung open at the point where the shit
has hit the fan and we're in the midst of switching over to manual
processing...
So, in my case, I'ld like it if PG couldn't do anything to generate
any user-initiated WAL unless there is a sync slave connected. Yes, I
understand that leads to hard-fail, and yes, I understand I'm in the
minority, maybe almost singular in that desire.
What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
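For readers following the thread, the three failure points being compared at this stage can be sketched roughly as below. This is only an illustration of the positions as summarised above, not code from the patch: the enum and function names are hypothetical, "fail" and "block" are treated alike as "may not proceed", STAGE_COMMIT means the commit of a transaction that has written WAL, and the login policy is modelled for new clients only (existing connections are not covered).

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these appear in the patch. */
typedef enum
{
    FAIL_AT_LOGIN,      /* refuse new connections outright */
    FAIL_AT_WAL_WRITE,  /* refuse any user-initiated WAL */
    FAIL_AT_COMMIT      /* allow all work, stop only at COMMIT */
} StandalonePolicy;

typedef enum
{
    STAGE_LOGIN,
    STAGE_READ_QUERY,
    STAGE_WAL_WRITE,
    STAGE_COMMIT        /* commit of a transaction that wrote WAL */
} ClientStage;

/*
 * Returns true if a newly arriving client may get past the given stage
 * when allow_standalone_primary is off.
 */
bool
stage_allowed(StandalonePolicy policy, ClientStage stage,
              bool sync_standby_connected)
{
    /* With a sync standby connected, every policy lets work proceed. */
    if (sync_standby_connected)
        return true;

    switch (policy)
    {
        case FAIL_AT_LOGIN:
            /* The client never connects, so no later stage is reachable. */
            return false;
        case FAIL_AT_WAL_WRITE:
            /* Logins and read-only queries work; writes are stopped. */
            return stage == STAGE_LOGIN || stage == STAGE_READ_QUERY;
        case FAIL_AT_COMMIT:
            /* Everything works until the commit itself. */
            return stage != STAGE_COMMIT;
    }
    return false;
}
```

Under this model a monitoring or reporting client, which only ever reaches STAGE_READ_QUERY, is locked out by the first policy alone.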
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
When no sync slave is connected, yes, I want to stop things hard.
What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).
I can't see a reason to disallow login, because read-only transactions
can still run in such a situation --- and, indeed, might be fairly
essential if you need to inspect the database state on the way to fixing
the replication problem. (Of course, we've already had the discussion
about it being a terrible idea to configure replication from inside the
database, but that doesn't mean there might not be views or status you
would wish to look at.)
regards, tom lane
On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
When no sync slave is connected, yes, I want to stop things hard.
What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).
I can't see a reason to disallow login, because read-only transactions
can still run in such a situation --- and, indeed, might be fairly
essential if you need to inspect the database state on the way to fixing
the replication problem. (Of course, we've already had the discussion
about it being a terrible idea to configure replication from inside the
database, but that doesn't mean there might not be views or status you
would wish to look at.)
And just disallowing new logins is probably not even enough, because
it allows current logged in clients "forward progress", leading
towards an eventual hang (with now committed data on the master).
Again, I'm trying to stop "forward progress" as soon as possible when
a sync slave isn't replicating. And I'ld like clients to fail with
errors sooner (hopefully they get to the commit point) rather than
accumulate the WAL synced to the master and just wait at the commit.
So I think that's a more complete picture of my quick "not do anything
with no synchronous slave replicating" that I think was what led to
the no-login approach.
a.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
On Fri, Jan 21, 2011 at 1:09 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
When no sync slave is connected, yes, I want to stop things hard.
What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).I can't see a reason to disallow login, because read-only transactions
can still run in such a situation --- and, indeed, might be fairly
essential if you need to inspect the database state on the way to fixing
the replication problem. (Of course, we've already had the discussion
about it being a terrible idea to configure replication from inside the
database, but that doesn't mean there might not be views or status you
would wish to look at.)
And just disallowing new logins is probably not even enough, because
it allows current logged in clients "forward progress", leading
towards an eventual hang (with now committed data on the master).
Again, I'm trying to stop "forward progress" as soon as possible when
a sync slave isn't replicating. And I'ld like clients to fail with
errors sooner (hopefully they get to the commit point) rather than
accumulate the WAL synced to the master and just wait at the commit.
So I think that's a more complete picture of my quick "not do anything
with no synchronous slave replicating" that I think was what led to
the no-login approach.
Well, stopping all WAL activity with an error sounds *more* reasonable
than refusing all logins, but I'm not personally sold on it. For
example, a brief network disruption on the connection between master
and standby would cause the master to grind to a halt... and then
almost immediately resume operations. More generally, if you have
short-running transactions, there's not much difference between
wait-at-commit and wait-at-WAL, and if you have long-running
transactions, then wait-at-WAL might be gumming up the works more than
necessary.
One idea might be to wait both before and after commit. If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect. Then, we write and sync the commit WAL record. Next, we
wait for the WAL to be ack'd. Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jan 21, 2011 at 1:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Again, I'm trying to stop "forward progress" as soon as possible when
a sync slave isn't replicating. And I'ld like clients to fail with
errors sooner (hopefully they get to the commit point) rather than
accumulate the WAL synced to the master and just wait at the commit.
Well, stopping all WAL activity with an error sounds *more* reasonable
than refusing all logins, but I'm not personally sold on it. For
example, a brief network disruption on the connection between master
and standby would cause the master to grind to a halt... and then
almost immediately resume operations.
Yup. And I'm OK with that. In my case, it would be much better to
have a few quick failures, which can complete automatically a few
seconds later than to have a big buildup of transactions to re-verify
by hand upon starting manual processing.
But again, I'll stress that I'm talking about when the master has no
sync slave connected. A "brief network disruption" between the
master/slave isn't likely going to disconnect the slave. TCP is
pretty good at handling those. If the master thinks it has a sync
slave connected, I'm fine with it continuing to queue WAL for it even
if it's lagging noticeably.
More generally, if you have
short-running transactions, there's not much difference between
wait-at-commit and wait-at-WAL, and if you have long-running
transactions, then wait-at-WAL might be gumming up the works more than
necessary.
Again, when there is no sync slave *connected*, I don't want to wait
*at all*. I want to fail ASAP. If there is a sync slave, and it's
just slow, I don't really care where it waits.
From my experience, if the slave is not connected (i.e. the TCP connection
has been disconnected), then we're in something like:
1) Proper slave shutdown: pilot error here stopping it if the master requires it
2) Master start, slave not connected yet: I'm fine with getting
errors here... We *hope* a slave will be here soon, but...
3) network has separated master/slave: TCP means it's been like this
for a long time already...
4) Slave hardware/os low-level hang/crash: TCP means it's been like
this for a while already before master's os tears down the connection
5) Slave has crashed (or rebooted) and slave OS has closed/rejected
our TCP connection
In all of these, I'ld love for my master not to be generating WAL and
letting clients think they are making progress. And I'm hoping that
for #3 & 4 above, PG will have keepalive type traffic that will
prevent me from queuing WAL for normal TCP connection time values.
One idea might be to wait both before and after commit. If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect. Then, we write and sync the commit WAL record. Next, we
wait for the WAL to be ack'd. Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.
Ya, but that becomes much more expensive. Instead of it just being a
"write WAL, fsync WAL, send WAL, wait for slave", it becomes "write
WAL, fsync WAL, send WAL, wait for slave fsync, write WAL, fsync WAL,
send WAL, wait for slave fsync". And its expense is all the time,
rather than just when the "no slave no go" situations arise.
And it doesn't reduce the transactions I need to verify by hand
either, because that waiting/error still only happens at the COMMIT
statement from the client.
--
Aidan Van Dyk Create like a god,
aidan@highrise.ca command like a king,
http://www.highrise.ca/ work like a slave.
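The "keepalive type traffic" hoped for above could take the form of a reply timeout on the master: if the standby's replies double as keepalives, the master can stop treating the standby as connected once no reply has arrived for some interval, well before the OS tears down the TCP connection. A minimal sketch of that check follows. The function name, the millisecond timestamps and the "zero disables the timeout" convention are all assumptions made for illustration and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the master should still treat the
 * standby as connected, based on when its last reply arrived.
 * All times are in milliseconds on the same clock.
 */
bool
standby_considered_connected(int64_t last_reply_ms, int64_t now_ms,
                             int64_t reply_timeout_ms)
{
    /* Assumed convention: a non-positive timeout disables the check,
     * leaving detection to TCP alone. */
    if (reply_timeout_ms <= 0)
        return true;

    /* Still connected if the last reply is recent enough. */
    return (now_ms - last_reply_ms) <= reply_timeout_ms;
}
```

With a check like this, cases 3 and 4 in the list above would be noticed after the reply timeout rather than after the much longer default TCP timeouts.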
On Fri, Jan 21, 2011 at 1:59 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
Yup. And I'm OK with that. In my case, it would be much better to
have a few quick failures, which can complete automatically a few
seconds later than to have a big buildup of transactions to re-verify
by hand upon starting manual processing.
Why would you need to do that?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, 2011-01-21 at 13:32 -0500, Robert Haas wrote:
One idea might be to wait both before and after commit. If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect. Then, we write and sync the commit WAL record. Next, we
wait for the WAL to be ack'd. Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.
I like this idea.
I think it would be too invasive to make a check before we insert each
WAL record, as Aidan suggests. Even if we did that, you aren't protected
when a standby goes down because you'll still have written half a
transaction and still be waiting.
So I propose that
    if (!allow_standalone_primary)
        ConfirmSyncRepAvailable();
before PreCommit_Notify(). That puts transaction into a wait state that
lasts until a sync rep standby is available. Note that it is before the
actual commit, so if we decide we need to we can cancel those
transactions and have them properly abort.
I won't add that code yet, in case better ideas emerge.
There is no support for preventing connections at startup, so I will
remove that completely, now.
--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
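To make the ordering in this proposal concrete, here is a small self-contained model of the commit path with a check before the commit record is written and a wait after it. It is a sketch only: the step names and plan_commit() are hypothetical, nothing here blocks or touches real WAL, and the behaviour with allow_standalone_primary on and no standby (commit without waiting) is an assumption inferred from the parameter's name rather than something stated in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step names; they are not functions from the patch. */
typedef enum
{
    STEP_WAIT_FOR_STANDBY,  /* pre-commit check found no sync standby */
    STEP_WRITE_COMMIT_WAL,  /* write and fsync the commit record */
    STEP_WAIT_FOR_ACK       /* wait for the standby to acknowledge */
} CommitStep;

/*
 * Fill steps[] with the actions a commit would take, in order, and
 * return how many there are. steps must have room for 3 entries.
 */
size_t
plan_commit(bool allow_standalone_primary, bool standby_connected,
            CommitStep *steps)
{
    size_t n = 0;

    /* Pre-commit check: the transaction can still abort cleanly here. */
    if (!allow_standalone_primary && !standby_connected)
        steps[n++] = STEP_WAIT_FOR_STANDBY;

    steps[n++] = STEP_WRITE_COMMIT_WAL;

    /* Assumed: a standalone-capable primary with no standby does not wait. */
    if (standby_connected || !allow_standalone_primary)
        steps[n++] = STEP_WAIT_FOR_ACK;

    return n;
}
```

The model leaves out the race mentioned earlier in the thread: the standby can still disappear between the pre-commit check and the final wait.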
On Sat, Jan 22, 2011 at 8:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2011-01-21 at 13:32 -0500, Robert Haas wrote:
One idea might be to wait both before and after commit. If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect. Then, we write and sync the commit WAL record. Next, we
wait for the WAL to be ack'd. Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.
I like this idea.
I think it would be too invasive to make a check before we insert each
WAL record, as Aidan suggests. Even if we did that, you aren't protected
when a standby goes down because you'll still have written half a
transaction and still be waiting.
So I propose that
    if (!allow_standalone_primary)
        ConfirmSyncRepAvailable();
before PreCommit_Notify(). That puts transaction into a wait state that
lasts until a sync rep standby is available. Note that it is before the
actual commit, so if we decide we need to we can cancel those
transactions and have them properly abort.
I won't add that code yet, in case better ideas emerge.
There is no support for preventing connections at startup, so I will
remove that completely, now.
Time's running short - do you have an updated patch?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Time's running short - do you have an updated patch?
This patch hasn't been updated in more than three weeks. I assume
this should now be marked Returned with Feedback, and we'll revisit
synchronous replication for 9.2?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company