Streaming replication status

Started by Heikki Linnakangas over 16 years ago · 76 messages · pgsql-hackers
#1 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

I've gone through the patch in detail now. Here's my list of remaining
issues:

* If there's no WAL to send, walsender doesn't notice if the client has
closed the connection already. This is the issue Fujii reported earlier.
We'll need to add a select() call to the walsender main loop to check if
the socket has been closed.

* I removed the feature that archiver was started during recovery. The
idea of that was to enable archiving from a standby server, to relieve
the master server of that duty, but I found it annoying because it
causes trouble if the standby and master are configured to archive to
the same location; they will fight over which one copies the file to the
archive first. Frankly the feature doesn't seem very useful as the patch
stands, because you still have to configure archiving in the master in
practice; you can't take an online base backup otherwise, and you have
the risk of standby falling too much behind and having to restore from
base backup whenever the standby is disconnected for any reason. Let's
revisit this later when it's truly useful.

* We still have a related issue, though: if standby is configured to
archive to the same location as master (as it always is on my laptop,
where I use the postgresql.conf of the master unmodified in the server),
right after failover the standby server will try to archive all the old
WAL files that were streamed from the master; but they exist already in
the archive, as the master archived them already. I'm not sure if this
is a pilot error, or if we should do something in the server to tell
apart WAL segments streamed from master and those generated in the
standby server after failover. Maybe we should immediately create a
.done file for every file received from master?

* I don't think we should require superuser rights for replication.
Although you see all WAL and potentially all data in the system through
that, a standby doesn't need any write access to the master, so it would
be good practice to create a dedicated account with limited privileges
for replication.

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

* Do we really need REPLICATION_DEBUG_ENABLED? The output doesn't seem
very useful to me.

* Need to add comments somewhere to note that ReadRecord depends on the
fact that a WAL record is always sent as a whole, never split across two
messages.

* Do we really need to split the sleep in walsender to NAPTIME_PER_CYCLE
increments?

* Walreceiver should flush less aggressively than after each received
piece of WAL as noted by XXX comment.

* Consider renaming PREPARE_REPLICATION to IDENTIFY_SYSTEM or something.

* What's the change in bgwriter.c for?

* ReadRecord/FetchRecord is a bit of a mess. I earlier tried to refactor
it into something simpler a couple of times, but failed. So I'm going to
leave it as it is, but if someone else wants to give it a shot, that
would be good.

* Documentation. The patch used to move around some sections, but I
think that has been partially reverted so that it now just duplicates
them. It probably needs other work too, I haven't looked at the docs in
any detail.

These are all the issues I know of right now. Assuming no new issues
crop up (which often does happen), the patch is ready for committing
after those have been addressed.

Attached is my latest version as a patch, also available in my git
repository.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

replication-20100108.patch.gz (application/x-gzip)
#2 Josh Berkus
josh@agliodbs.com
In reply to: Heikki Linnakangas (#1)
Re: Streaming replication status

On 1/8/10 1:16 PM, Heikki Linnakangas wrote:

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

WARNING: I haven't thought about how this would work together with HS yet.

I think this needs to be administrator-configurable.

I'd suggest a GUC approach:

archiving_lag_action = { ignore, shutdown, stop }

"Ignore" would be the default. Some users would rather have the master
shut down if the slave has stopped taking segments; that's "shutdown".
Otherwise, it's "stop" which simply stops archiving and starts recycling
when we reach that number of segments.

Better name for the GUC very welcome ...

--Josh Berkus

#3 Greg Stark
gsstark@mit.edu
In reply to: Heikki Linnakangas (#1)
Re: Streaming replication status

On Fri, Jan 8, 2010 at 9:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

* We still have a related issue, though: if standby is configured to
archive to the same location as master (as it always is on my laptop,
where I use the postgresql.conf of the master unmodified in the server),
right after failover the standby server will try to archive all the old
WAL files that were streamed from the master; but they exist already in
the archive, as the master archived them already. I'm not sure if this
is a pilot error, or if we should do something in the server to tell
apart WAL segments streamed from master and those generated in the
standby server after failover. Maybe we should immediately create a
.done file for every file received from master?

How do we know the master has finished archiving them? If the master
crashes suddenly and you fail over couldn't it have failed to archive
segments that have been received by the standby via streaming
replication?

* Need to add comments somewhere to note that ReadRecord depends on the
fact that a WAL record is always sent as a whole, never split across two
messages.

What happens in the case of the very large records Tom was describing
recently? If the entire record doesn't fit in a WAL segment, is it the
whole record or the partial record with the continuation bit that
needs to fit in a message?

--
greg

#4 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#1)
Re: Streaming replication status

On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I've gone through the patch in detail now. Here's my list of remaining
issues:

Great! Thanks a lot!

* If there's no WAL to send, walsender doesn't notice if the client has
closed the connection already. This is the issue Fujii reported earlier.
We'll need to add a select() call to the walsender main loop to check if
the socket has been closed.

Should we reactivate pq_wait() and secure_poll()?

* I removed the feature that archiver was started during recovery. The
idea of that was to enable archiving from a standby server, to relieve
the master server of that duty, but I found it annoying because it
causes trouble if the standby and master are configured to archive to
the same location; they will fight over which one copies the file to the
archive first. Frankly the feature doesn't seem very useful as the patch
stands, because you still have to configure archiving in the master in
practice; you can't take an online base backup otherwise, and you have
the risk of standby falling too much behind and having to restore from
base backup whenever the standby is disconnected for any reason. Let's
revisit this later when it's truly useful.

Okay.

* We still have a related issue, though: if standby is configured to
archive to the same location as master (as it always is on my laptop,
where I use the postgresql.conf of the master unmodified in the server),
right after failover the standby server will try to archive all the old
WAL files that were streamed from the master; but they exist already in
the archive, as the master archived them already. I'm not sure if this
is a pilot error, or if we should do something in the server to tell
apart WAL segments streamed from master and those generated in the
standby server after failover. Maybe we should immediately create a
.done file for every file received from master?

There is no guarantee that such a file has already been archived by the master.
This is just an idea, but a new WAL record indicating the completion of the
archiving would be useful for the standby to create the .done file. But, this idea
might kill the "archiving during recovery" idea discussed above.

Personally, I'm OK with that issue because we can avoid it by tweaking
archive_command. Could we revisit this together with the "archiving
during recovery" discussion later?

* I don't think we should require superuser rights for replication.
Although you see all WAL and potentially all data in the system through
that, a standby doesn't need any write access to the master, so it would
be good practice to create a dedicated account with limited privileges
for replication.

Okay to just drop the superuser() check from walsender.c.

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

I think that we need a GUC parameter to specify the maximum number
of log file segments held in the pg_xlog directory to send to the standby server.
Replication to a standby which falls more than that GUC value behind
is simply terminated.
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01901.php
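A rough sketch of what such a cap check might look like (Python for illustration only; all names are hypothetical and not from the patch, though 16 MB is PostgreSQL's default WAL segment size): compare the master's and standby's write positions in units of whole segments and cut the standby loose once it lags past the configured maximum.

```python
WAL_SEG_SIZE = 16 * 1024 * 1024  # 16 MB per WAL segment (PostgreSQL default)

def segments_behind(master_lsn, standby_lsn):
    """How many whole WAL segments the standby lags behind the master."""
    return (master_lsn // WAL_SEG_SIZE) - (standby_lsn // WAL_SEG_SIZE)

def should_terminate(master_lsn, standby_lsn, max_keep_segments):
    """Cut the standby loose once it lags past the configured cap,
    so the master can recycle old segments instead of filling its disk."""
    return segments_behind(master_lsn, standby_lsn) > max_keep_segments

# Standby 3 segments behind with a cap of 5: keep streaming.
assert not should_terminate(10 * WAL_SEG_SIZE, 7 * WAL_SEG_SIZE, 5)
# Standby 8 segments behind with a cap of 5: terminate replication.
assert should_terminate(10 * WAL_SEG_SIZE, 2 * WAL_SEG_SIZE, 5)
```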

* Do we really need REPLICATION_DEBUG_ENABLED? The output doesn't seem
very useful to me.

This was useful for me when debugging the code, but okay to drop it now.

* Need to add comments somewhere to note that ReadRecord depends on the
fact that a WAL record is always sent as a whole, never split across two
messages.

Okay.

* Do we really need to split the sleep in walsender to NAPTIME_PER_CYCLE
 increments?

Yes. It's required on some platforms (probably HP-UX) where signals
cannot interrupt the sleep.
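The idea behind the chunked sleep can be illustrated like this (a Python sketch with made-up names and an illustrative 100 ms cycle; the walsender does the equivalent in C): sleep in small slices and re-check an interrupt flag between naps, so a signal that fails to break the sleep itself is still noticed within one cycle.

```python
import signal
import time

NAPTIME_PER_CYCLE = 0.1  # sleep in 100 ms slices (value is illustrative)

got_signal = False

def handler(signum, frame):
    global got_signal
    got_signal = True

def interruptible_sleep(total_seconds):
    """Sleep in small increments, re-checking a flag between naps.

    Even on a platform where a signal does not cut the underlying
    sleep short, the flag is noticed within one NAPTIME_PER_CYCLE.
    Returns how long we actually napped.
    """
    slept = 0.0
    while slept < total_seconds and not got_signal:
        nap = min(NAPTIME_PER_CYCLE, total_seconds - slept)
        time.sleep(nap)
        slept += nap
    return slept

signal.signal(signal.SIGALRM, handler)
signal.setitimer(signal.ITIMER_REAL, 0.25)  # signal fires mid-sleep
elapsed = interruptible_sleep(5.0)
assert got_signal and elapsed < 5.0  # woke up long before the full 5 s
```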

* Walreceiver should flush less aggresively than after each received
piece of WAL as noted by XXX comment.

* XXX: Flushing after each received message is overly aggressive. Should
* implement some sort of lazy flushing. Perhaps check in the main loop
* if there's any more messages before blocking and waiting for one, and
* flush the WAL if there isn't, just blocking.

In this approach, if messages continuously arrive from the master, the fsync
would be delayed until the WAL segment is switched. Likewise, recovery would
also be delayed, which seems to be a problem.

How about the straightforward approach: let the process which wants to
flush the buffer send an fsync-request to walreceiver and wait until WAL
is flushed up to the buffer's LSN?

* Consider renaming PREPARE_REPLICATION to IDENTIFY_SYSTEM or something.

Okay.

* What's the change in bgwriter.c for?

It's for the bgwriter to know the current timeline when recycling the WAL files.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#5 Fujii Masao
masao.fujii@gmail.com
In reply to: Greg Stark (#3)
Re: Streaming replication status

On Sat, Jan 9, 2010 at 10:38 AM, Greg Stark <gsstark@mit.edu> wrote:

* Need to add comments somewhere to note that ReadRecord depends on the
fact that a WAL record is always sent as a whole, never split across two
messages.

What happens in the case of the very large records Tom was describing
recently. If the entire record doesn't fit in a WAL segment is it the
whole record or the partial record with the continuation bit that
needs to fit in a message?

It's the partial record with the continuation bit.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#6 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#4)
Re: Streaming replication status

Fujii Masao wrote:

On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

* If there's no WAL to send, walsender doesn't notice if the client has
closed the connection already. This is the issue Fujii reported earlier.
We'll need to add a select() call to the walsender main loop to check if
the socket has been closed.

We should reactivate pq_wait() and secure_poll()?

I don't think we need all that, a simple select() should be enough.
Though I must admit I'm not very familiar with select/poll().
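For illustration, a zero-timeout select() is enough to spot a peer that has gone away (a minimal Python sketch with hypothetical names; the real walsender would do the equivalent in C on its client socket): a closed peer makes the socket readable, and a peek-read returning zero bytes confirms EOF.

```python
import select
import socket

def client_has_disconnected(sock):
    """Return True if the peer has closed its end of the connection.

    A zero-timeout select() tells us whether the socket is readable
    without blocking; a readable socket whose peek-read returns zero
    bytes has hit EOF, i.e. the peer is gone.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending; peer still connected (or just quiet)
    # MSG_PEEK avoids consuming real data if the peer merely sent something.
    return sock.recv(1, socket.MSG_PEEK) == b""

# Demonstration: a local socket pair stands in for the
# walsender <-> standby connection.
walsender_end, standby_end = socket.socketpair()
assert client_has_disconnected(walsender_end) is False
standby_end.close()
assert client_has_disconnected(walsender_end) is True
walsender_end.close()
```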

* We still have a related issue, though: if standby is configured to
archive to the same location as master (as it always is on my laptop,
where I use the postgresql.conf of the master unmodified in the server),
right after failover the standby server will try to archive all the old
WAL files that were streamed from the master; but they exist already in
the archive, as the master archived them already. I'm not sure if this
is a pilot error, or if we should do something in the server to tell
apart WAL segments streamed from master and those generated in the
standby server after failover. Maybe we should immediately create a
.done file for every file received from master?

There is no guarantee that such file has already been archived by master.
This is just an idea, but new WAL record indicating the completion of the
archiving would be useful for the standby to create .done file. But, this idea
might kill the "archiving during recovery" idea discussed above.

Personally, I'm OK with that issue because we can avoid it by tweaking
archive_command. Could we revisit this discussion with the "archiving
during recovery" discussion later?

Ok. The workaround is to configure the standby to archive to a different
location. If you need to restore from that, you'll need to stitch
together the logs from the old master and the new one.

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

I think that we need the GUC parameter to specify the maximum number
of log file segments held in pg_xlog directory to send to the standby server.
The replication to the standby which falls more than that GUC value behind
is just terminated.
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01901.php

Oh yes, sounds good.

* Do we really need to split the sleep in walsender to NAPTIME_PER_CYCLE
increments?

Yes. It's required for some platforms (probably HP-UX) in which signals
cannot interrupt the sleep.

I'm thinking that the wal_sender_delay is so small that maybe it's not
worth worrying about.

* Walreceiver should flush less aggressively than after each received
piece of WAL as noted by XXX comment.

* XXX: Flushing after each received message is overly aggressive. Should
* implement some sort of lazy flushing. Perhaps check in the main loop
* if there's any more messages before blocking and waiting for one, and
* flush the WAL if there isn't, just blocking.

In this approach, if messages continuously arrive from the master, the fsync
would be delayed until the WAL segment is switched. Likewise, recovery would
also be delayed, which seems to be a problem.

That seems OK to me. If messages are really coming in that fast,
fsyncing the whole WAL segment at a time is probably most efficient.

But if that really is too much, you could still do extra flushes within
XLogRecv() every few megabytes for example.
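A toy simulation of that lazy-flush scheme (Python, with a hypothetical 4 MB threshold and made-up names; the real walreceiver would fsync WAL files in C): accumulate received bytes and flush only when the threshold is crossed, or when the stream goes idle.

```python
FLUSH_THRESHOLD = 4 * 1024 * 1024  # flush roughly every 4 MB (illustrative)

class LazyFlusher:
    """Accumulate received WAL and 'fsync' only when enough has piled up,
    or when explicitly told the stream has gone idle."""

    def __init__(self):
        self.unflushed = 0
        self.flush_count = 0

    def received(self, nbytes):
        self.unflushed += nbytes
        if self.unflushed >= FLUSH_THRESHOLD:
            self.flush()

    def flush(self):
        if self.unflushed:
            self.flush_count += 1  # stand-in for the real WAL fsync call
            self.unflushed = 0

flusher = LazyFlusher()
for _ in range(10):              # ten 1 MB messages arrive back to back
    flusher.received(1024 * 1024)
flusher.flush()                  # stream went idle: flush the remainder
assert flusher.flush_count == 3  # 4 MB + 4 MB + final 2 MB
```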

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#1)
Re: Streaming replication status

On Fri, 2010-01-08 at 23:16 +0200, Heikki Linnakangas wrote:

* I removed the feature that archiver was started during recovery. The
idea of that was to enable archiving from a standby server, to relieve
the master server of that duty, but I found it annoying because it
causes trouble if the standby and master are configured to archive to
the same location; they will fight over which copies the file to the
archive first. Frankly the feature doesn't seem very useful as the patch
stands, because you still have to configure archiving in the master in
practice; you can't take an online base backup otherwise, and you have
the risk of standby falling too much behind and having to restore from
base backup whenever the standby is disconnected for any reason. Let's
revisit this later when it's truly useful.

Agreed

* We still have a related issue, though: if standby is configured to
archive to the same location as master (as it always is on my laptop,
where I use the postgresql.conf of the master unmodified in the server),
right after failover the standby server will try to archive all the old
WAL files that were streamed from the master; but they exist already in
the archive, as the master archived them already. I'm not sure if this
is a pilot error, or if we should do something in the server to tell
apart WAL segments streamed from master and those generated in the
standby server after failover. Maybe we should immediately create a
.done file for every file received from master?

That sounds like the right thing to do.

* I don't think we should require superuser rights for replication.
Although you see all WAL and potentially all data in the system through
that, a standby doesn't need any write access to the master, so it would
be good practice to create a dedicated account with limited privileges
for replication.

Agreed. I think we should have a predefined user, called "replication"
that has only the correct rights.

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

Can you explain how this could occur? My understanding was that the
walreceiver and startup processes were capable of independent action
specifically to avoid this kind of effect.

* Documentation. The patch used to move around some sections, but I
think that has been partially reverted so that it now just duplicates
them. It probably needs other work too, I haven't looked at the docs in
any detail.

I believe the docs need urgent attention. We need more people to read
the docs and understand the implications so that people can then
comment. It is extremely non-obvious from the patch how things work at a
behaviour level.

I am very concerned that there is no thought given to monitoring
replication. This will make the feature difficult to use in practice.

--
Simon Riggs www.2ndQuadrant.com

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#2)
Re: Streaming replication status

On Fri, 2010-01-08 at 14:20 -0800, Josh Berkus wrote:

On 1/8/10 1:16 PM, Heikki Linnakangas wrote:

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

WARNING: I haven't thought about how this would work together with HS yet.

I've been reviewing things as we go along, so I'm not that tense
overall. Having said that I don't understand why the problem above would
occur and the sentence seems to be missing a verb between "without" and
"stalls". More explanation please.

What could happen is that the standby could slowly lag behind master. We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.

--
Simon Riggs www.2ndQuadrant.com

#9 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#8)
Re: Streaming replication status

Simon Riggs wrote:

On Fri, 2010-01-08 at 14:20 -0800, Josh Berkus wrote:

On 1/8/10 1:16 PM, Heikki Linnakangas wrote:

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

WARNING: I haven't thought about how this would work together with HS yet.

I've been reviewing things as we go along, so I'm not that tense
overall. Having said that I don't understand why the problem above would
occur and the sentence seems to be missing a verb between "without" and
"stalls". More explanation please.

Yeah, that sentence was broken.

What could happen is that the standby could slowly lag behind master.

Right, that's what I'm worried about. In the worst case, the
walreceiver process in the standby might stall completely for some
reason, e.g. a hardware problem or SIGSTOP by an administrator.

We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.

Yeah, monitoring would be nice too. But what I was wondering is whether
we need some way of stopping that from filling the disk in master.
(Fujii-san's suggestion of a GUC to set the max. amount of WAL to keep
in the master for standbys feels good to me).

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#9)
Re: Streaming replication status

On Sun, 2010-01-10 at 18:40 +0200, Heikki Linnakangas wrote:

We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.

Yeah, monitoring would be nice too. But what I was wondering is whether
we need some way of stopping that from filling the disk in master.
(Fujii-san's suggestion of a GUC to set the max. amount of WAL to keep
in the master for standbys feels good to me).

OK, now I got you. I thought that was already agreed; guess it is now.

We need monitoring anywhere we have a max_* parameter. Otherwise we
won't know how close we are to disaster until we hit the limit and
things break down. Otherwise we will have to set parameters by trial and
error, or set them so high they are meaningless.

--
Simon Riggs www.2ndQuadrant.com

#11 Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#10)
Re: Streaming replication status

We need monitoring anywhere we have a max_* parameter. Otherwise we
won't know how close we are to disaster until we hit the limit and
things break down. Otherwise we will have to set parameters by trial and
error, or set them so high they are meaningless.

I agree.

Thing is, though, we have a de-facto max already ... when pg_xlog runs
out of disk space. And no monitoring *in postgresql* for that, although
obviously you can use OS monitoring for it.

I'm saying, even for plain PITR, it would be an improvement in
manageability if the DBA could set a maximum number of checkpoint
segments before replication is abandoned or the master shuts down.
It's something we've been missing.

--Josh Berkus

#12 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#11)
Re: Streaming replication status

On Sun, 2010-01-10 at 12:10 -0800, Josh Berkus wrote:

We need monitoring anywhere we have a max_* parameter. Otherwise we
won't know how close we are to disaster until we hit the limit and
things break down. Otherwise we will have to set parameters by trial and
error, or set them so high they are meaningless.

I agree.

Thing is, though, we have a de-facto max already ... when pg_xlog runs
out of disk space.

What I mean is this: The purpose of monitoring is to avoid bad things
happening by being able to predict that a bad thing will happen before
it actually does happen. Cars have windows to allow us to see we are
about to hit something.

And no monitoring *in postgresql* for that, although
obviously you can use OS monitoring for it.

PostgreSQL doesn't need to monitor that. If the user wants to avoid
out-of-space they can write a script to monitor files/space. The info is
accessible, if you wish to monitor it.

Currently there is no way of knowing what the average/current transit
time is on replication, no way of knowing what is happening if we go
idle, etc. Those things need to be included because they are not
otherwise accessible. Cars need windows, not just a finely tuned engine.

--
Simon Riggs www.2ndQuadrant.com

#13 Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#12)
Re: Streaming replication status

Currently there is no way of knowing what the average/current transit
time is on replication, no way of knowing what is happening if we go
idle, etc. Those things need to be included because they are not
otherwise accessible. Cars need windows, not just a finely tuned engine.

Like I said, I agree. I'm just pointing out that the monitoring
deficiency already exists whether or not we add a max_* parameter.

--Josh Berkus

#14 Craig Ringer
craig@2ndquadrant.com
In reply to: Josh Berkus (#2)
Re: Streaming replication status

On 9/01/2010 6:20 AM, Josh Berkus wrote:

On 1/8/10 1:16 PM, Heikki Linnakangas wrote:

* A standby that connects to master, initiates streaming, and then sits
idle without stalls recycling of old WAL files in the master. That will
eventually lead to a full disk in master. Do we need some kind of a
emergency valve on that?

WARNING: I haven't thought about how this would work together with HS yet.

I think this needs to be administrator-configurable.

I'd suggest a GUC approach:

archiving_lag_action = { ignore, shutdown, stop }

"Ignore" would be the default. Some users would rather have the master
shut down if the slave has stopped taking segments; that's "shutdown".
Otherwise, it's "stop" which simply stops archiving and starts recycling
when we reach that number of segments.

IMO "stop" would be *really* bad without some sort of administrator
alert support (scream for help) and/or the ability to refresh the
slave's base backup when it started responding again. We'd start seeing
mailing list posts along the lines of "my master failed over to the
slave, and it's missing the last 3 months of data! Help!".

Personally, I'd be uncomfortable enabling something like that without
_both_ an admin alert _and_ the ability to refresh the slave's base
backup without admin intervention.

It'd also be necessary to define what exactly "lag" means here,
preferably in a way that doesn't generally need admin tuning for most
users. Ideally there'd be separate thresholds for "scream to the admin
for help, something's wrong" and "forced to act, slave is holding up the
master".

--
Craig Ringer

#15 Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#7)
Re: Streaming replication status

Simon Riggs wrote:

* I don't think we should require superuser rights for replication.
Although you see all WAL and potentially all data in the system through
that, a standby doesn't need any write access to the master, so it would
be good practice to create a dedicated account with limited privileges
for replication.

Agreed. I think we should have a predefined user, called "replication"
that has only the correct rights.

I am concerned that knowledge of this new read-only replication user
would have to be spread all over the backend code, which is really not
something we should be doing at this stage in 8.5 development. I am
also thinking such a special user might fall out of the work on mandatory
access control, so maybe we should just require superuser for 8.5 and
revisit this for 8.6.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#16 Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#6)
Re: Streaming replication status

On Sat, Jan 9, 2010 at 4:25 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I don't think we need all that, a simple select() should be enough.
Though I must admit I'm not very familiar with select/poll().

I'm not sure whether poll(2) should be called for this purpose. But
poll(2) and select(2) seem to often come together in the existing code.
Should we follow that custom?

* Do we really need to split the sleep in walsender to NAPTIME_PER_CYCLE
 increments?

Yes. It's required for some platforms (probably HP-UX) in which signals
cannot interrupt the sleep.

I'm thinking that the wal_sender_delay is so small that maybe it's not
worth worrying about.

The same problem exists in walwriter.c, too. Though we can expect that
wal_writer_delay is small, its sleep has been broken down into smaller bits.
Should we follow the existing code, or just remove that feature from
walwriter?

* Walreceiver should flush less aggressively than after each received
piece of WAL as noted by XXX comment.

      * XXX: Flushing after each received message is overly aggressive. Should
      * implement some sort of lazy flushing. Perhaps check in the main loop
      * if there's any more messages before blocking and waiting for one, and
      * flush the WAL if there isn't, just blocking.

In this approach, if messages continuously arrive from the master, the fsync
would be delayed until the WAL segment is switched. Likewise, recovery would
also be delayed, which seems to be a problem.

That seems OK to me. If messages are really coming in that fast,
fsyncing the whole WAL segment at a time is probably most efficient.

OK, I'll implement your idea. But that seems to be inefficient for
synchronous replication (especially "wait WAL-replay" mode), so let's
revisit this discussion later.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#17 Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#8)
Re: Streaming replication status

On Sun, Jan 10, 2010 at 8:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

What could happen is that the standby could slowly lag behind master. We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.

I agree that the statistical information about replication activity is
very useful. But I think that it's not an urgent issue. Shall we think
about it later?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#18 Fujii Masao
masao.fujii@gmail.com
In reply to: Craig Ringer (#14)
Re: Streaming replication status

On Mon, Jan 11, 2010 at 5:36 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:

Personally, I'd be uncomfortable enabling something like that without _both_
an admin alert _and_ the ability to refresh the slave's base backup without
admin intervention.

What feature do you specifically need as an alert? Is just writing
the warning into the logfile enough? Or do we need to notify by
using an SNMP trap message? Though I'm not sure if this is a role
for Postgres.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#19 Greg Smith
gsmith@gregsmith.com
In reply to: Fujii Masao (#17)
Re: Streaming replication status

Fujii Masao wrote:

On Sun, Jan 10, 2010 at 8:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

What could happen is that the standby could slowly lag behind master. We
don't have any way of monitoring that, as yet. Setting ps display is not
enough here.

I agree that the statistical information about replication activity is
very useful. But I think that it's not an urgent issue. Shall we think
about it later?

I don't think anybody can deploy this feature without at least some very
basic monitoring here. I like the basic proposal you made back in
September for adding a pg_standbys_xlog_location to replace what you
have to get from ps right now:
http://archives.postgresql.org/pgsql-hackers/2009-09/msg00889.php

That's basic, but enough that people could get by for a V1.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#20 Greg Smith
gsmith@gregsmith.com
In reply to: Fujii Masao (#18)
Re: Streaming replication status

Fujii Masao wrote:

On Mon, Jan 11, 2010 at 5:36 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:

Personally, I'd be uncomfortable enabling something like that without _both_
an admin alert _and_ the ability to refresh the slave's base backup without
admin intervention.

What kind of alert do you specifically need? Is writing a warning
into the logfile enough? Or do you need notification via an SNMP
trap message? Though I'm not sure whether that is a role for
Postgres.

It's impossible for the database to have any idea whatsoever how people
are going to want to be alerted. Provide functions to monitor things
like replication lag and the number of segments queued up to feed to
archive_command, and let people build their own alerting mechanisms for
now. They're going to do that anyway, so why waste precious time here
building something that's unlikely to fit any but a very narrow use case?

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#21Fujii Masao
masao.fujii@gmail.com
In reply to: Greg Smith (#19)
#22Fujii Masao
masao.fujii@gmail.com
In reply to: Greg Smith (#20)
#23Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Smith (#19)
#24Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Fujii Masao (#21)
#25Simon Riggs
simon@2ndQuadrant.com
In reply to: Stefan Kaltenbrunner (#24)
#26Greg Smith
gsmith@gregsmith.com
In reply to: Heikki Linnakangas (#23)
#27Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#23)
#28Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#23)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#16)
#30Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#29)
#31Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#28)
#33Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Simon Riggs (#25)
#34Marko Kreen
markokr@gmail.com
In reply to: Tom Lane (#29)
#35Bruce Momjian
bruce@momjian.us
In reply to: Stefan Kaltenbrunner (#33)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#35)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marko Kreen (#34)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#35)
#39Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#35)
#40Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Simon Riggs (#39)
#41Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#38)
#42Bruce Momjian
bruce@momjian.us
In reply to: Stefan Kaltenbrunner (#40)
#43Joshua D. Drake
jd@commandprompt.com
In reply to: Bruce Momjian (#41)
#44Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#42)
#45Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#38)
#46Greg Smith
gsmith@gregsmith.com
In reply to: Bruce Momjian (#42)
#47Joshua D. Drake
jd@commandprompt.com
In reply to: Greg Smith (#46)
#48Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Greg Smith (#46)
#49Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Stefan Kaltenbrunner (#48)
#50Robert Treat
xzilla@users.sourceforge.net
In reply to: Greg Smith (#20)
#51Robert Treat
xzilla@users.sourceforge.net
In reply to: Simon Riggs (#45)
#52Josh Berkus
josh@agliodbs.com
In reply to: Greg Smith (#46)
#53Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#52)
#54Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#45)
#55Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#29)
#56Fujii Masao
masao.fujii@gmail.com
In reply to: Bruce Momjian (#15)
#57Greg Smith
gsmith@gregsmith.com
In reply to: Stefan Kaltenbrunner (#48)
#58Fujii Masao
masao.fujii@gmail.com
In reply to: Greg Smith (#57)
#59Greg Smith
gsmith@gregsmith.com
In reply to: Fujii Masao (#58)
#60Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Smith (#59)
#61Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Greg Smith (#59)
#62Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Stefan Kaltenbrunner (#61)
#63Greg Smith
gsmith@gregsmith.com
In reply to: Stefan Kaltenbrunner (#61)
#64Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Greg Smith (#63)
#65Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Greg Smith (#63)
#66Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Kevin Grittner (#64)
#67Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Stefan Kaltenbrunner (#66)
#68Greg Smith
gsmith@gregsmith.com
In reply to: Stefan Kaltenbrunner (#65)
#69Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Greg Smith (#68)
#70Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Kevin Grittner (#67)
#71Josh Berkus
josh@agliodbs.com
In reply to: Greg Smith (#59)
#72Fujii Masao
masao.fujii@gmail.com
In reply to: Josh Berkus (#71)
#73Simon Riggs
simon@2ndQuadrant.com
In reply to: Fujii Masao (#58)
#74Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#73)
#75Fujii Masao
masao.fujii@gmail.com
In reply to: Bruce Momjian (#74)
#76Bruce Momjian
bruce@momjian.us
In reply to: Fujii Masao (#75)