Parameter name standby_mode
We want to teach people that Hot Standby and Streaming Replication are
two different features. However, Streaming Replication calls its main
parameter "standby_mode", which is more reminiscent of Hot Standby than
of Streaming Replication.
People could also run a warm standby without streaming replication,
which would result in a standby that has standby_mode = 'off'.
I found the parameter name confusing and I'd vote for changing its name.
Joachim
Joachim Wieland wrote:
We want to teach people that Hot Standby and Streaming Replication are
two different features.
I'm not sure about that, actually. Now that they're both in the tree,
they work nicely together and many users will think of them as one.
However, Streaming Replication calls its main
parameter "standby_mode", which is more reminiscent of Hot Standby than
of Streaming Replication.
People could also run a warm standby without streaming replication,
which would result in a standby that has standby_mode = 'off'.
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
If you want to use pg_standby or similar tools, then you would indeed
set standby_mode='off', but I think that makes sense because you're
implementing the standby functionality outside the server in that case.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Wed, Feb 10, 2010 at 12:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
If you want to use pg_standby or similar tools, then you would indeed
set standby_mode='off', but I think that makes sense because you're
implementing the standby functionality outside the server in that case.
Okay, got it now with your explanations.
For some reason it didn't work before with standby_mode = 'on' (it
does now) and the warning "FATAL: sorry, too many standbys already"
gave me a first suspicion that SR is the only use case for this. Then
I checked the docs and there it said "If this parameter is on, the
streaming replication is enabled". I understand now what it does and
that it is a prerequisite but that there is also a non-SR use case...
So the name is okay for me :-)
Thanks again,
Joachim
On Wed, 2010-02-10 at 13:16 +0200, Heikki Linnakangas wrote:
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
The docs say "If this parameter is on, the streaming replication is
enabled". So who is wrong?
ISTM that Joachim's viewpoint is right and that most people will be
confused about this.
I think we need something named more intuitively. Something that better
describes what action (i.e. a verb) will occur when this is set.
Suggestions: streaming_replication = on
We may need to split out various complexities into multiple parameters,
or have valued parameters, e.g. standby_mode = REPLICA.
--
Simon Riggs www.2ndQuadrant.com
Simon Riggs wrote:
The docs say "If this parameter is on, the streaming replication is
enabled". So who is wrong?
The docs.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Wed, Feb 10, 2010 at 8:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
On Wed, Feb 10, 2010 at 8:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Hmm, somehow I thought it doesn't if you don't set primary_conninfo. I
think that's the way it should work, ie. if primary_conninfo is not set,
don't launch walreceiver but just keep trying to restore from the archive.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Feb 12, 2010 at 3:19 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
On Wed, Feb 10, 2010 at 8:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
If they want to implement the warm standby using the (new) built-in
logic to keep retrying restore_command, they would set
standby_mode='on'. standby_mode='on' doesn't imply streaming replication.
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Hmm, somehow I thought it doesn't if you don't set primary_conninfo. I
think that's the way it should work, ie. if primary_conninfo is not set,
don't launch walreceiver but just keep trying to restore from the archive.
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, Feb 12, 2010 at 7:28 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
On the other hand, if it has to use defaults for the target host/port,
chances are high that either it connects to the wrong host/port or
that SR is just not wanted :-)
Whoever sets up SR will also take the effort to configure
primary_conninfo and will have a different primary than the default -
which I think is just the standby itself, no?
Joachim
Fujii Masao wrote:
On Fri, Feb 12, 2010 at 3:19 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Hmm, somehow I thought it doesn't if you don't set primary_conninfo. I
think that's the way it should work, ie. if primary_conninfo is not set,
don't launch walreceiver but just keep trying to restore from the archive.
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
You could do primary_conninfo='' for that.
Maybe we should have two options, "streaming_mode='on'" and
"primary_conninfo='...'".
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Feb 12, 2010 at 4:04 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
On Fri, Feb 12, 2010 at 3:19 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Hmm, somehow I thought it doesn't if you don't set primary_conninfo. I
think that's the way it should work, ie. if primary_conninfo is not set,
don't launch walreceiver but just keep trying to restore from the archive.
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
You could do primary_conninfo='' for that.
Maybe we should have two options, "streaming_mode='on'" and
"primary_conninfo='...'".
It looks better to me to extend "standby_mode":
For example, standby_mode = 'streaming' or 'archive'.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
On Fri, Feb 12, 2010 at 4:04 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
On Fri, Feb 12, 2010 at 3:19 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
But if we fail in restoring the archived WAL file, "standby_mode = on"
*always* tries to start streaming replication.
Hmm, somehow I thought it doesn't if you don't set primary_conninfo. I
think that's the way it should work, ie. if primary_conninfo is not set,
don't launch walreceiver but just keep trying to restore from the archive.
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
You could do primary_conninfo='' for that.
Maybe we should have two options, "streaming_mode='on'" and
"primary_conninfo='...'".
It looks better to me to extend "standby_mode":
For example, standby_mode = 'streaming' or 'archive'.
There's yet another mode that would be useful with hot standby: start up
the standby, but don't poll the archive and don't try to connect to the
master. Kind of 'paused' mode. Simon had functions to do that and more
in the original hot standby patch.
I've been thinking that this would work with just the three options we
have now:
standby_mode (true/false) controls whether the server keeps retrying
until trigger file is found (if trigger_file is set), rather than finish
recovery.
primary_conninfo (string) specifies a connection string to use to
connect to the master. If not given, don't try to connect.
restore_command (string) specifies a command to use to restore a file
from archive. If not given, don't try to restore files from archive.
I think this is pretty coherent and easy to explain, and makes all the
combinations restoring files from archive/streaming possible. But if
someone comes up with an even better scheme, I'm all ears.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
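As a rough illustration of the three-option scheme proposed above, the following Python sketch models which WAL sources a standby would consult for a given combination of settings. It is not PostgreSQL code: the names wal_sources and describe, the use of None for "not given", and the sample command and connection string are all assumptions made for this sketch.

```python
from typing import List, Optional


def wal_sources(restore_command: Optional[str],
                primary_conninfo: Optional[str]) -> List[str]:
    """Return the WAL sources implied by the two string options.

    None models "parameter not given" (an assumption of this sketch).
    """
    sources: List[str] = []
    if restore_command is not None:
        sources.append("archive")     # restore files from the archive
    if primary_conninfo is not None:
        sources.append("streaming")   # connect to the master
    return sources


def describe(standby_mode: bool,
             restore_command: Optional[str],
             primary_conninfo: Optional[str]) -> str:
    """Summarize the behavior for one combination of the three options."""
    sources = wal_sources(restore_command, primary_conninfo)
    # standby_mode only decides whether the server keeps retrying or
    # finishes recovery once the available WAL is exhausted.
    ending = "keep retrying" if standby_mode else "finish recovery"
    return f"{' + '.join(sources) or 'no WAL source'}; {ending}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(describe(True, "cp /mnt/archive/%f %p", None))
    print(describe(True, "cp /mnt/archive/%f %p", "host=master.example.com"))
    print(describe(False, "cp /mnt/archive/%f %p", None))
```

In this model the three options stay independent: standby_mode never selects a WAL source by itself, which is the property the proposal relies on.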
Joachim Wieland wrote:
On Fri, Feb 12, 2010 at 7:28 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
On the other hand, if it has to use defaults for the target host/port,
chances are high that either it connects to the wrong host/port or
that SR is just not wanted :-)
Agreed. I've changed it now so that if primary_conninfo is not set, it
doesn't try to establish a streaming connection. If you want to get the
connection information from environment variables, you can use
primary_conninfo=''.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Feb 12, 2010 at 4:59 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Agreed. I've changed it now so that if primary_conninfo is not set, it
doesn't try to establish a streaming connection. If you want to get the
connection information from environment variables, you can use
primary_conninfo=''.
OK, you win. I would live with primary_conninfo=''.
And you need to change the document, recovery.conf.sample and
so on.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, Feb 12, 2010 at 8:59 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Agreed. I've changed it now so that if primary_conninfo is not set, it
doesn't try to establish a streaming connection. If you want to get the
connection information from environment variables, you can use
primary_conninfo=''.
Why not just remove the default:
If no primary_conninfo variable is set explicitly in the configuration
file, check the environment variables. If the environment variable is
not set, don't try to establish a connection.
?
Joachim
Joachim Wieland wrote:
On Fri, Feb 12, 2010 at 8:59 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Agreed. I've changed it now so that if primary_conninfo is not set, it
doesn't try to establish a streaming connection. If you want to get the
connection information from environment variables, you can use
primary_conninfo=''.
Why not just remove the default:
If no primary_conninfo variable is set explicitly in the configuration
file, check the environment variables. If the environment variable is
not set, don't try to establish a connection.
The environment variables in question are the libpq environment
variables like PGHOST, PGPORT. The server shouldn't need to know about
them. Besides, there'd still be the corner case that you really want to
use the built-in defaults, ie. connect to a server running in the same
host at the default port, so you'd not set any environment variables either.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
There's yet another mode that would be useful with hot standby: start up
the standby, but don't poll the archive and don't try to connect to the
master. Kind of 'paused' mode. Simon had functions to do that and more
in the original hot standby patch.
And having the pause/resume functions would lower the need for perfect
conflict resolution too. When you want to run this huge reporting query
set and not get interrupted, pause the standby. Afterward, resume it.
Of course, while paused, it's not a good HA standby anymore, but you
just did pause it, so you're not surprised, right?
I've been thinking that this would work with just the three options we
have now:
I like that, because it exposes exactly the code logic, and it is not
complex enough to merit being hidden from the users. Also, you depend on
understanding how the server really works to set up a trustworthy HA
solution, so exposing the concepts that are actually used is a win.
primary_conninfo (string) specifies a connection string to use to
connect to the master. If not given, don't try to connect.
Would it be possible to expose that at the SQL level, so that you can
easily check in scripts what master you're a slave of? Think nagios
cascading alerts or topology graphs, etc.
Regards,
--
dim
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Joachim Wieland wrote:
If no primary_conninfo variable is set explicitly in the configuration
file, check the environment variables. If the environment variable is
not set, don't try to establish a connection.
The environment variables in question are the libpq environment
variables like PGHOST, PGPORT. The server shouldn't need to know about
them.
Even more to the point is that some of them, like PGPORT, are highly
likely to be set in a server's environment to point to the server
itself. It would be extremely dangerous to automatically try to start
replication just because we find those set. In fact, I would argue that
we should fix things so that any such variables inherited from the
server environment are intentionally *NOT* used for making SR
connections.
regards, tom lane
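The hazard described above can be sketched in Python. The sketch assumes the usual libpq precedence for the port (explicit connection parameter, then PGPORT, then the built-in default) and deliberately ignores the host, so it only models the case where no host is configured and the connection stays local. The function names are hypothetical.

```python
from typing import Mapping

DEFAULT_PORT = "5432"  # PostgreSQL's usual built-in default port


def effective_port(conninfo: Mapping[str, str],
                   env: Mapping[str, str]) -> str:
    """Pick the port: explicit setting, then PGPORT, then the default."""
    return conninfo.get("port") or env.get("PGPORT") or DEFAULT_PORT


def would_target_own_port(conninfo: Mapping[str, str],
                          env: Mapping[str, str],
                          own_port: str) -> bool:
    """True if a local connection would land on the standby's own port."""
    return effective_port(conninfo, env) == own_port


if __name__ == "__main__":
    # The standby listens on 5433 and PGPORT in its environment points at
    # itself, as is common for a server's environment.
    server_env = {"PGPORT": "5433"}
    print(would_target_own_port({}, server_env, "5433"))
    # An explicit port in the connection settings avoids the problem.
    print(would_target_own_port({"port": "5432"}, server_env, "5433"))
```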
On Fri, Feb 12, 2010 at 11:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Even more to the point is that some of them, like PGPORT, are highly
likely to be set in a server's environment to point to the server
itself. It would be extremely dangerous to automatically try to start
replication just because we find those set. In fact, I would argue that
we should fix things so that any such variables inherited from the
server environment are intentionally *NOT* used for making SR
connections.
There are many environment variables which libpq automatically uses.
Which variables should not be used for SR connection? All?
If neither primary_conninfo nor the environment variables are given,
the default value (e.g., port = 5432) is automatically used for the SR
connection. Is this OK, or is it as unacceptable as using the
environment variables?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, Feb 12, 2010 at 4:59 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Joachim Wieland wrote:
On Fri, Feb 12, 2010 at 7:28 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Yeah, even if primary_conninfo is not given, the standby tries to invoke
walreceiver by using other connection settings (environment variables
or defaults). This is intentional behavior, and would make the setup of SR
easier. So I'd like to leave it be.
On the other hand, if it has to use defaults for the target host/port,
chances are high that either it connects to the wrong host/port or
that SR is just not wanted :-)
Agreed. I've changed it now so that if primary_conninfo is not set, it
doesn't try to establish a streaming connection. If you want to get the
connection information from environment variables, you can use
primary_conninfo=''.
If standby_mode is enabled and neither primary_conninfo nor restore_command
is set, the standby would get stuck. How about forbidding this invalid
setting (i.e., raising a FATAL error)?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
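A minimal Python sketch of the check suggested above, assuming None models "not set" as elsewhere in this thread. The exception class, the function name, and the message wording are hypothetical and do not come from the PostgreSQL source.

```python
from typing import Optional


class RecoveryConfigError(Exception):
    """Stands in for a FATAL error raised at startup (hypothetical)."""


def check_recovery_settings(standby_mode: bool,
                            primary_conninfo: Optional[str],
                            restore_command: Optional[str]) -> None:
    """Reject a standby that has no way to obtain any WAL."""
    if standby_mode and primary_conninfo is None and restore_command is None:
        raise RecoveryConfigError(
            "standby_mode is on, but neither primary_conninfo nor "
            "restore_command is set"
        )


if __name__ == "__main__":
    # Accepted: the archive is available as a WAL source.
    check_recovery_settings(True, None, "cp /mnt/archive/%f %p")
    # Rejected: the standby would wait forever with no WAL source.
    try:
        check_recovery_settings(True, None, None)
    except RecoveryConfigError as exc:
        print(f"FATAL: {exc}")
```

Note that primary_conninfo='' passes this check, since an empty string still counts as "set" under the behavior agreed earlier in the thread.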