recovery_connections cannot start (was Re: master in standby mode croaks)

Started by Robert Haas, almost 16 years ago, 68 messages, hackers
#1Robert Haas
robertmhaas@gmail.com

On Sat, Apr 17, 2010 at 6:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 17, 2010 at 6:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Sat, 2010-04-17 at 17:44 -0400, Robert Haas wrote:

I will change the error message.

I gave a good deal of thought to trying to figure out a cleaner
solution to this problem than just changing the error message and
failed.  So let's change the error message.  Of course I'm not quite
sure what we should change it TO, given that the situation is the
result of an interaction between three different GUCs and we have no
way to distinguish which one(s) are the problem.

"You need all three" covers it.

Actually you need recovery_connections and either archive_mode=on or
max_wal_senders>0, I think.

One way we could fix this is use 2 bits rather than 1 for
XLogStandbyInfoMode. One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on. If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters. The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.

I think the first way is better because it gives the user more
specific information about what they need to fix. Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off. The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or SR. Both of these will probably happen relatively
rarely, but since we're burning a whole byte for XLogStandbyInfoMode
(plus 3 more bytes of padding?), it seems like we might as well snag
one more bit for clarity.

Thoughts?

...Robert
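
The two-bit idea in the message above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL source: the names StandbyInfoBits, encode and complaint are hypothetical, and the order of the checks (recovery_connections first) is an assumption taken from the order in which the message lists the two cases. The two message strings are copied from the proposal.

```python
from enum import IntFlag
from typing import Optional


class StandbyInfoBits(IntFlag):
    """Hypothetical two-bit encoding of the WAL source server's settings."""
    NONE = 0
    WAL_SHIPPABLE = 1         # archive_mode=on or max_wal_senders>0
    RECOVERY_CONNECTIONS = 2  # recovery_connections=on


def encode(archive_mode: bool, max_wal_senders: int,
           recovery_connections: bool) -> StandbyInfoBits:
    """Pack the three GUCs on the WAL source server into the two bits."""
    bits = StandbyInfoBits.NONE
    if archive_mode or max_wal_senders > 0:
        bits |= StandbyInfoBits.WAL_SHIPPABLE
    if recovery_connections:
        bits |= StandbyInfoBits.RECOVERY_CONNECTIONS
    return bits


def complaint(bits: StandbyInfoBits) -> Optional[str]:
    """Return the message the standby would log, or None if both bits are set."""
    if not (bits & StandbyInfoBits.RECOVERY_CONNECTIONS):
        return ("recovery connections cannot start because the "
                "recovery_connections parameter is disabled on the "
                "WAL source server")
    if not (bits & StandbyInfoBits.WAL_SHIPPABLE):
        return ("recovery connections cannot start because archive_mode=off "
                "and max_wal_senders=0 on the WAL source server")
    return None
```

With a single bit, both failure cases collapse into one value, which is why the one-bit variant can only offer the generic message plus a hint.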

#2Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#1)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

One way we could fix this is use 2 bits rather than 1 for
XLogStandbyInfoMode.  One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on.  If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters.  The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.

I think the first way is better because it gives the user more
specific information about what they need to fix.  Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off.  The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or SR.  Both of these will probably happen relatively
rarely, but since we're burning a whole byte for XLogStandbyInfoMode
(plus 3 more bytes of padding?), it seems like we might as well snag
one more bit for clarity.

Thoughts?

I like the second choice since it's simpler and enough for me.
But I have no objection to the first.

When we encounter the error, we would need to not only change
those parameter values but also take a fresh base backup and
restart the standby using it. The description of this required
procedure needs to be in the document or error message, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#2)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Fujii Masao wrote:

On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

One way we could fix this is use 2 bits rather than 1 for
XLogStandbyInfoMode. One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on. If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters. The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.

I think the first way is better because it gives the user more
specific information about what they need to fix. Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off. The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or SR. Both of these will probably happen relatively
rarely, but since we're burning a whole byte for XLogStandbyInfoMode
(plus 3 more bytes of padding?), it seems like we might as well snag
one more bit for clarity.

Thoughts?

I like the second choice since it's simpler and enough for me.
But I have no objection to the first.

When we encounter the error, we would need to not only change
those parameter values but also take a fresh base backup and
restart the standby using it. The description of this required
procedure needs to be in the document or error message, I think.

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#3)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Fujii Masao wrote:

On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

One way we could fix this is use 2 bits rather than 1 for
XLogStandbyInfoMode.  One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on.  If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters.  The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.

I think the first way is better because it gives the user more
specific information about what they need to fix.  Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off.  The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or SR.  Both of these will probably happen relatively
rarely, but since we're burning a whole byte for XLogStandbyInfoMode
(plus 3 more bytes of padding?), it seems like we might as well snag
one more bit for clarity.

Thoughts?

I like the second choice since it's  simpler and enough for me.
But I have no objection to the first.

When we encounter the error, we would need to not only change
those parameter values but also take a fresh base backup and
restart the standby using it. The description of this required
procedure needs to be in the document or error message, I think.

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

I liked it, too, but I sort of decided it didn't buy much. There are
three separate sets of things that need to be controlled:

1. What WAL to emit - (a) just enough for crash recovery, (b) enough
for log shipping, (c) enough for log shipping with recovery
connections.

2. Whether to run the archiver.

3. Whether to allow streaming replication connections (and if so, how many).

If the answer to (1) is "just enough for crash recovery", then (2) and
(3) must be "no". But if (1) is either of the other two options, then
any combination of answers for (2) and (3) is seemingly sensible,
though having both (2) and (3) as no is probably of limited utility.
But at a minimum, you could certainly have:

crash recovery/no archiver/no SR
log shipping/archiver/no SR
log shipping/no archiver/SR
log shipping/archiver/SR
recovery connections/archiver/no SR
recovery connections/no archiver/SR
recovery connections/archiver/SR

I don't see any reasonable way to package all of that up in a single
GUC. Thoughts?

...Robert
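
The list of combinations above can be reproduced mechanically from the one stated rule. The sketch below is illustrative Python, not server code; the function names and the include_idle flag are hypothetical, and the only constraint encoded is the one from the message: crash-recovery-only WAL rules out both the archiver and SR.

```python
from itertools import product

WAL_LEVELS = ("crash recovery", "log shipping", "recovery connections")


def is_valid(wal_level: str, archiver: bool, sr: bool) -> bool:
    """Crash-recovery-only WAL means no archiver and no streaming replication."""
    if wal_level == "crash recovery":
        return not archiver and not sr
    return True


def combinations(include_idle: bool = False) -> list:
    """List the valid (wal_level, archiver, sr) triples.

    With include_idle=False, cases that emit extra WAL but run neither the
    archiver nor SR are dropped, matching the "limited utility" remark.
    """
    result = []
    for level, archiver, sr in product(WAL_LEVELS, (True, False), (True, False)):
        if not is_valid(level, archiver, sr):
            continue
        idle = level != "crash recovery" and not archiver and not sr
        if idle and not include_idle:
            continue
        result.append((level, archiver, sr))
    return result
```

Called with the default argument this yields the seven cases listed in the message; with include_idle=True it yields nine, which is more than a single three-valued GUC can name.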

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#4)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Robert Haas wrote:

On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

I liked it, too, but I sort of decided it didn't buy much. There are
three separate sets of things that need to be controlled:

1. What WAL to emit - (a) just enough for crash recovery, (b) enough
for log shipping, (c) enough for log shipping with recovery
connections.

2. Whether to run the archiver.

3. Whether to allow streaming replication connections (and if so, how many).

Streaming replication needs the same information in the WAL as archiving
does, there's no difference between 2 and 3. (the "how many" aspect of 3
is controlled by max_wal_senders).

Let's have these three settings:

wal_mode = crash/archive/standby (replaces archive_mode)
archive_command
max_wal_senders

If wal_mode is set to 'crash', you can't set archive_command or
max_wal_senders>0. If it's set to 'archive', you can set archive_command
and/or max_wal_senders for archiving and streaming replication, but the
standby server won't allow queries. If you set it to 'standby', it will
(assuming you've set recovery_connections=on in the standby).

Note that "wal_mode=standby" replaces "recovery_connections=on" in the
primary.

I think this would be much easier to understand than the current
situation. I'm not wedded to the GUC name or values, though, maybe it
should be archive_mode=off/on/standby, or wal_mode=minimal/archive/full.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
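
The rules proposed in the message above could be checked roughly as follows. This is a minimal Python sketch, assuming the value names crash/archive/standby from the proposal (which the author says are not final); check_primary and standby_allows_queries are hypothetical helpers and the problem strings are invented for illustration.

```python
def check_primary(wal_mode: str, archive_command: str,
                  max_wal_senders: int) -> list:
    """Return a list of problems with the primary's settings (empty if none)."""
    if wal_mode not in ("crash", "archive", "standby"):
        return [f"unknown wal_mode {wal_mode!r}"]
    problems = []
    if wal_mode == "crash":
        # 'crash' emits only what crash recovery needs, so neither
        # archiving nor streaming replication can be enabled.
        if archive_command:
            problems.append("archive_command requires wal_mode=archive or standby")
        if max_wal_senders > 0:
            problems.append("max_wal_senders>0 requires wal_mode=archive or standby")
    return problems


def standby_allows_queries(primary_wal_mode: str,
                           standby_recovery_connections: bool) -> bool:
    """Queries on the standby need wal_mode=standby on the primary and
    recovery_connections=on in the standby itself."""
    return primary_wal_mode == "standby" and standby_recovery_connections
```

For example, check_primary("crash", "", 2) reports one problem, while check_primary("archive", "", 2) reports none because streaming replication without an archive_command is allowed under the proposal.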

#6Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Robert Haas wrote:

On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

I liked it, too, but I sort of decided it didn't buy much.  There are
three separate sets of things that need to be controlled:

1. What WAL to emit - (a) just enough for crash recovery, (b) enough
for log shipping, (c) enough for log shipping with recovery
connections.

2. Whether to run the archiver.

3. Whether to allow streaming replication connections (and if so, how many).

Streaming replication needs the same information in the WAL as archiving
does,

True.

there's no difference between 2 and 3. (the "how many" aspect of 3
is controlled by max_wal_senders).

False.

I thought what you think too, but discovered otherwise when I read the
code. Some uses of archive_mode are used to control what WAL is
generated, but others control a *process* called the archiver.

...Robert

#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#6)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Robert Haas wrote:

On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Robert Haas wrote:

On Fri, Apr 23, 2010 at 5:24 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I quite liked Robert's proposal to add an explicit GUC to control what
extra information is logged
(http://archives.postgresql.org/pgsql-hackers/2010-04/msg00509.php). It
is quite difficult to explain the current behavior, a simple explicit
wal_mode GUC would be a lot simpler. It wouldn't add any extra steps to
setting the system up, you currently need to set archive_mode='on'
anyway to enable archiving. You would just set wal_mode='archive' or
wal_mode='standby' instead, depending on what you want to do with the WAL.

I liked it, too, but I sort of decided it didn't buy much. There are
three separate sets of things that need to be controlled:

1. What WAL to emit - (a) just enough for crash recovery, (b) enough
for log shipping, (c) enough for log shipping with recovery
connections.

2. Whether to run the archiver.

3. Whether to allow streaming replication connections (and if so, how many).

Streaming replication needs the same information in the WAL as archiving
does,

True.

there's no difference between 2 and 3. (the "how many" aspect of 3
is controlled by max_wal_senders).

False.

I thought what you think too, but discovered otherwise when I read the
code. Some uses of archive_mode are used to control what WAL is
generated, but others control a *process* called the archiver.

Hmm, never mind the archiver process, we could just launch it always and
it would just sit idle if archive_command was not set. But a more
serious concern is that if you set "archive_mode=on", and
"archive_command=''", we retain all WAL indefinitely, because it's not
being archived, until you set archive_command to something that succeeds
again. You're right, with the wal_mode='crash/archive/standby' there
would be no way to distinguish "archiving is temporarily disabled, keep
all accumulated WAL around" and "we're not archiving, but
wal_mode='archive' to enable streaming replication".

Ok, that brings us back to square one. We could still add the wal_mode
GUC to explicitly control how much WAL is written (replacing
recovery_connections in the primary), I think it would still make the
system easier to explain. But it would add an extra hurdle to enabling
archiving, you'd have to set wal_mode='archive', archive_mode='on', and
archive_command. I'm not sure if that would be better or worse than the
current situation.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
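
The ambiguity described in the message above can be made concrete with a small sketch. It is illustrative Python only: the two intent labels and all function names are hypothetical, and max_wal_senders is left out because it does not help tell the two cases apart.

```python
INTENTS = ("archiving paused", "streaming only")


def current_gucs(intent: str) -> dict:
    """Settings under the existing archive_mode / archive_command pair."""
    if intent == "archiving paused":
        # archive_mode stays on, so unarchived WAL is retained.
        return {"archive_mode": "on", "archive_command": ""}
    if intent == "streaming only":
        return {"archive_mode": "off", "archive_command": ""}
    raise ValueError(f"unknown intent: {intent}")


def proposed_gucs(intent: str) -> dict:
    """Settings if wal_mode replaced archive_mode, as proposed upthread."""
    if intent in INTENTS:
        # Both intents end up with the same visible configuration.
        return {"wal_mode": "archive", "archive_command": ""}
    raise ValueError(f"unknown intent: {intent}")


def distinguishable(mapping) -> bool:
    """True if every intent maps to a different set of settings."""
    settings = [tuple(sorted(mapping(intent).items())) for intent in INTENTS]
    return len(set(settings)) == len(settings)
```

Here distinguishable(current_gucs) is true and distinguishable(proposed_gucs) is false, so under the single-GUC scheme the server could not tell whether to keep accumulating WAL.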

#8Florian Pflug
fgp@phlo.org
In reply to: Heikki Linnakangas (#5)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Apr 23, 2010, at 13:12 , Heikki Linnakangas wrote:

Let's have these three settings:

wal_mode = crash/archive/standby (replaces archive_mode)
archive_command
max_wal_senders

If wal_mode is set to 'crash', you can't set archive_command or
max_wal_senders>0. If it's set to 'archive', you can set archive_command
and/or max_wal_senders for archiving and streaming replication, but the
standby server won't allow queries. If you set it to 'standby', it will
(assuming you've set recovery_connections=on in the standby).

Note that "wal_mode=standby" replaces "recovery_connections=on" in the
primary.

I think this would be much easier to understand than the current
situation. I'm not wedded to the GUC name or values, though, maybe it
should be archive_mode=off/on/standby, or wal_mode=minimal/archive/full.

Hm, but that would preclude the possibility of running master and (log-shipping) slave off the same configuration, since one would need wal_mode=standby and the other recovery_connections=on.

Whereas with the current GUCs, "archive_mode=on, recovery_connections=on, archive_command=..." should be a valid configuration for both master and slave, no?

best regards,
Florian Pflug

#9Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#7)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 7:40 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Ok, that brings us back to square one. We could still add the wal_mode
GUC to explicitly control how much WAL is written (replacing
recovery_connections in the primary), I think it would still make the
system easier to explain. But it would add an extra hurdle to enabling
archiving, you'd have to set wal_mode='archive', archive_mode='on', and
archive_command. I'm not sure if that would be better or worse than the
current situation.

I wasn't either, that's why I gave up. It didn't seem worth doing a
major GUC reorganization on the eve of beta unless there was a clear
win. I think there may be a way to improve this but I don't think
we should take the time now to figure out what it is. Let's
revisit it for 9.1, and just improve the error reporting for now.

...Robert

#10Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#9)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 8:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 23, 2010 at 7:40 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Ok, that brings us back to square one. We could still add the wal_mode
GUC to explicitly control how much WAL is written (replacing
recovery_connections in the primary), I think it would still make the
system easier to explain. But it would add an extra hurdle to enabling
archiving, you'd have to set wal_mode='archive', archive_mode='on', and
archive_command. I'm not sure if that would be better or worse than the
current situation.

I wasn't either, that's why I gave up.  It didn't seem worth doing a
major GUC reorganization on the eve of beta unless there was a clear
win.  I think there may be a way to improve this but I don't think
we should take the time now to figure out what it is.  Let's
revisit it for 9.1, and just improve the error reporting for now.

+1

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#6)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Streaming replication needs the same information in the WAL as archiving
does,

True.

FWIW, I still don't believe that claim, and I think it's complete folly
to set the assumption in stone by choosing a user-visible GUC API that
depends on it being true.

regards, tom lane

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#9)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, 2010-04-23 at 07:54 -0400, Robert Haas wrote:

Let's
revisit it for 9.1, and just improve the error reporting for now.

+1

--
Simon Riggs www.2ndQuadrant.com

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Apr 23, 2010 at 7:12 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Streaming replication needs the same information in the WAL as archiving
does,

True.

FWIW, I still don't believe that claim, and I think it's complete folly
to set the assumption in stone by choosing a user-visible GUC API that
depends on it being true.

Huh? We're clearly talking about two different things here, because
that doesn't make any sense. Archiving and streaming replication are
just two means of transporting WAL records from point A to point B.
By definition, any two manners of moving a byte stream around are
isomorphic and can't possibly affect what that byte stream does or
does not need to contain. What affects the WAL that must be emitted
is the purpose for which it is to be used. As to that, I believe
everyone (including the code) is in agreement that a minimum amount of
WAL is always needed for crash recovery, plus if we want to do archive
recovery on another server there are some additional bits that must be
emitted (XLogIsNeeded), plus if we further want to process queries on
the standby then there are a few more bits beyond that
(XLogStandbyInfoActive).

...Robert

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Apr 23, 2010 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

FWIW, I still don't believe that claim, and I think it's complete folly
to set the assumption in stone by choosing a user-visible GUC API that
depends on it being true.

Huh? We're clearly talking about two different things here, because
that doesn't make any sense. Archiving and streaming replication are
just two means of transporting WAL records from point A to point B.

Sorry, not enough caffeine. What I should have said was that Hot
Standby could put stronger requirements on what gets put into WAL than
archiving for recovery does. Heikki's proposal upthread was
wal_mode='standby' versus wal_mode='archive' (versus 'off'), which
seemed sensible to me.

We realized some time ago that it was a good idea to separate
archive_mode (what to put in WAL) from archive_command (whether we are
actually archiving right now). If we fail to apply that same principle
to Hot Standby, I think we'll come to regret it.

regards, tom lane

#15Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#14)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Tom Lane wrote:

We realized some time ago that it was a good idea to separate
archive_mode (what to put in WAL) from archive_command (whether we are
actually archiving right now). If we fail to apply that same principle
to Hot Standby, I think we'll come to regret it.

The recovery_connections GUC does that. If you enable it, the extra
information required for hot standby is written to the WAL, otherwise
it's not.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#15)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Tom Lane wrote:

We realized some time ago that it was a good idea to separate
archive_mode (what to put in WAL) from archive_command (whether we are
actually archiving right now). If we fail to apply that same principle
to Hot Standby, I think we'll come to regret it.

The recovery_connections GUC does that. If you enable it, the extra
information required for hot standby is written to the WAL, otherwise
it's not.

No, driving it off recovery_connections is exactly NOT that. It's
confusing the transport mechanism with the desired WAL contents.
I maintain that this design is exactly isomorphic to our original PITR
GUC design wherein what got written to WAL was determined by the current
state of archive_command. We eventually realized that was a bad idea.
So is this.

As a concrete example, there is nothing logically wrong with driving
a hot standby slave from WAL records shipped via old-style pg_standby.
Or how about wanting to turn off recovery_connections temporarily, but
not wanting the archived WAL to be unable to support HS?

regards, tom lane
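
The distinction argued for above, WAL contents versus transport, can be sketched by comparing two hypothetical configuration models in Python. DerivedConfig follows the rule stated earlier in the thread (recovery_connections plus either archive_mode=on or max_wal_senders>0); ExplicitConfig is an assumed alternative that borrows the wal_mode values from the proposal upthread. Neither class is PostgreSQL code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DerivedConfig:
    """WAL contents inferred from other settings."""
    archive_mode: bool
    max_wal_senders: int
    recovery_connections: bool

    def wal_supports_hot_standby(self) -> bool:
        return self.recovery_connections and (
            self.archive_mode or self.max_wal_senders > 0)


@dataclass(frozen=True)
class ExplicitConfig:
    """Hypothetical alternative: WAL contents are set directly."""
    wal_mode: str  # 'crash', 'archive' or 'standby'
    archive_mode: bool
    max_wal_senders: int
    recovery_connections: bool

    def wal_supports_hot_standby(self) -> bool:
        # Only wal_mode decides what is written to WAL.
        return self.wal_mode == "standby"


def toggle_recovery_connections_off(config):
    """Return a copy of the configuration with recovery_connections disabled."""
    return replace(config, recovery_connections=False)
```

In the derived model, turning recovery_connections off also stops the WAL from supporting hot standby; in the explicit model the same toggle leaves the WAL contents unchanged, which is the second scenario raised in the message.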

#17Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#16)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 2:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Tom Lane wrote:

We realized some time ago that it was a good idea to separate
archive_mode (what to put in WAL) from archive_command (whether we are
actually archiving right now).  If we fail to apply that same principle
to Hot Standby, I think we'll come to regret it.

The recovery_connections GUC does that. If you enable it, the extra
information required for hot standby is written to the WAL, otherwise
it's not.

No, driving it off recovery_connections is exactly NOT that.  It's
confusing the transport mechanism with the desired WAL contents.
I maintain that this design is exactly isomorphic to our original PITR
GUC design wherein what got written to WAL was determined by the current
state of archive_command.  We eventually realized that was a bad idea.
So is this.

As a concrete example, there is nothing logically wrong with driving
a hot standby slave from WAL records shipped via old-style pg_standby.
Or how about wanting to turn off recovery_connections temporarily, but
not wanting the archived WAL to be unable to support HS?

You're all confused about what the different GUCs actually do. Which
is probably not a good sign for their usability. But yeah, that's one
of the things that concerned me, too. If you turn off
max_wal_senders, it doesn't just make it so that no WAL senders can
connect: it actually changes what gets WAL-logged.

...Robert

#18Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#16)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

Tom Lane <tgl@sss.pgh.pa.us> wrote:

As a concrete example, there is nothing logically wrong with
driving a hot standby slave from WAL records shipped via old-style
pg_standby. Or how about wanting to turn off recovery_connections
temporarily, but not wanting the archived WAL to be unable to
support HS?

As one more concrete example, we are likely to find SR beneficial if
it can feed into a warm standby, but only if we can also do
traditional WAL file archiving from the same source at the same
time. The extra logging for HS would be useless for us in any
event.

+1 for *not* tying WAL contents to the transport mechanism.

-Kevin

#19Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#13)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, 2010-04-23 at 13:45 -0400, Robert Haas wrote:

Archiving and streaming replication are
just two means of transporting WAL records from point A to point B.

By definition, any two manners of moving a byte stream around are
isomorphic and can't possibly affect what that byte stream does or
does not need to contain.

It is currently true, but there is no benefit in us constraining future
implementation routes without good reason.

--
Simon Riggs www.2ndQuadrant.com

#20Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#18)
Re: recovery_connections cannot start (was Re: master in standby mode croaks)

On Fri, Apr 23, 2010 at 2:43 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

As a concrete example, there is nothing logically wrong with
driving a hot standby slave from WAL records shipped via old-style
pg_standby.  Or how about wanting to turn off recovery_connections
temporarily, but not wanting the archived WAL to be unable to
support HS?

As one more concrete example, we are likely to find SR beneficial if
it can feed into a warm standby, but only if we can also do
traditional WAL file archiving from the same source at the same
time.  The extra logging for HS would be useless for us in any
event.

+1 for *not* tying WAL contents to the transport mechanism.

OK. Well, it's a shame we didn't get this settled last week when I
first brought it up, but it's not too late to try to straighten it out
if we have a consensus behind changing it, which it's starting to
sound like we do.

...Robert

#21Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#20)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#22)
#25Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#22)
#26Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#25)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#24)
#29Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#26)
#30Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#24)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
#32Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#30)
#33Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#32)
#35Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#34)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#34)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#30)
#39Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#38)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#38)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#35)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#39)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#40)
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#43)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#44)
#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#46)
#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#47)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#48)
#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#49)
#51Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#50)
#52Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#49)
#53Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#42)
#54Simon Riggs
simon@2ndQuadrant.com
In reply to: Dimitri Fontaine (#53)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#31)
#56Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#54)
#57Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#34)
#58Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Robert Haas (#55)
#59Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#57)
#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#59)
#61Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#60)
#62Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#61)
#63Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#62)
#64Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Robert Haas (#56)
#65Robert Haas
robertmhaas@gmail.com
In reply to: Dimitri Fontaine (#64)
#66Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Robert Haas (#65)
#67Simon Riggs
simon@2ndQuadrant.com
In reply to: Dimitri Fontaine (#66)
#68Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#67)