warning message in standby
Hi,
When an error is found in the WAL streamed from the master, a warning
message is repeated forever, without any interval, in the standby. This
consumes a great deal of CPU and can interfere with read-only queries.
To fix this problem, should we add a sleep into emode_for_corrupt_record()
or somewhere similar? Or should we stop walreceiver and retry reading WAL from
pg_xlog or the archive?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?
I ran into this problem at one point, too, but was in the middle of
trying to investigate a different bug and didn't have time to track
down what was causing it.
I think the basic question here is - if there's an error in the WAL,
how do we expect to EVER recover? Even if we can read from the
archive or pg_xlog, presumably it's the same WAL - why should we be
any more successful the second time?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?
I ran into this problem at one point, too, but was in the middle of
trying to investigate a different bug and didn't have time to track
down what was causing it.
I think the basic question here is - if there's an error in the WAL,
how do we expect to EVER recover? Even if we can read from the
archive or pg_xlog, presumably it's the same WAL - why should we be
any more successful the second time?
What "warning message" are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
regards, tom lane
On 10/06/10 17:38, Tom Lane wrote:
Robert Haas<robertmhaas@gmail.com> writes:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao<masao.fujii@gmail.com> wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?
I ran into this problem at one point, too, but was in the middle of
trying to investigate a different bug and didn't have time to track
down what was causing it.
I think the basic question here is - if there's an error in the WAL,
how do we expect to EVER recover? Even if we can read from the
archive or pg_xlog, presumably it's the same WAL - why should we be
any more successful the second time?
What "warning message" are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a
corrupt record means you've reached end of WAL. In standby mode, when
streaming WAL from master, that shouldn't happen, and it's not clear
what to do if it does. PANIC is not a good idea, at least if the server
uses hot standby, because that only makes the situation worse from an
availability point of view. So we log the error as a WARNING, and keep
retrying. It's unlikely that the problem will just go away, but we keep
retrying anyway in the hope that it does. However, it seems that we're
too aggressive with the retries.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, Jun 10, 2010 at 12:01 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means you've reached end of WAL. In standby mode, when streaming WAL
from master, that shouldn't happen, and it's not clear what to do if it
does. PANIC is not a good idea, at least if the server uses hot standby,
because that only makes the situation worse from availability point of view.
So we log the error as a WARNING, and keep retrying. It's unlikely that the
problem will just go away, but we keep retrying anyway in the hope that it
does. However, it seems that we're too aggressive with the retries.
You can reproduce this problem by doing the following.
1. initdb
2. edit postgresql.conf, set wal_level=hot_standby, max_wal_senders=1;
edit pg_hba.conf, trust local replication connections
3. pg_ctl start
4. make some changes to the database
5. take a hot backup to another directory (call it pgslave)
6. create pgslave/recovery.conf with standby_mode='on',
primary_conninfo=whatever, edit pgslave/postgresql.conf change the
port number, set hot_standby=on
7. pg_ctl start -D pgslave
At this point you should have a working HS/SR setup. Now:
8. shut the slave down
9. move recovery.conf out of the way
10. restart the slave - it will do recovery and enter normal running
11. make some database changes
12. stop the slave
13. put recovery.conf back
14. restart the slave
15. make a bunch of changes on the master
When the slave then tries to replay, you get something like:
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
...ad infinitum.
Obviously there are other ways this could occur - the WAL could really
be corrupted, for example - but the current handling is not too
graceful. I'm actually thinking it might be better to trigger a
shutdown if this happens. Probably something has gone haywire and
manual intervention is required. Retrying when there's no hope of
success isn't really that helpful.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Thu, Jun 10, 2010 at 5:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
At this point you should have a working HS/SR setup. Now:
8. shut the slave down
9. move recovery.conf out of the way
10. restart the slave - it will do recovery and enter normal running
11. make some database changes
12. stop the slave
13. put recovery.conf back
14. restart the slave
15. make a bunch of changes on the master
When the slave then tries to replay, you then get something like:
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
Woah, why does this procedure lead to this situation? I would hope
there's nothing a user could do which would cause it short of invoking
dd to corrupt the WAL files.
At precisely which step of the procedure did the user do something
wrong? Is there any reason we can't detect that they've done it and
throw a specific error message saying the configuration is invalid?
--
greg
On Thu, Jun 10, 2010 at 12:49 PM, Greg Stark <gsstark@mit.edu> wrote:
On Thu, Jun 10, 2010 at 5:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
At this point you should have a working HS/SR setup. Now:
8. shut the slave down
9. move recovery.conf out of the way
10. restart the slave - it will do recovery and enter normal running
11. make some database changes
12. stop the slave
13. put recovery.conf back
14. restart the slave
15. make a bunch of changes on the master
When the slave then tries to replay, you then get something like:
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
WARNING: invalid record length at 0/4005330
Woah, why does this procedure lead to this situation? I would hope
there's nothing a user could do which would cause it short of invoking
dd to corrupt the WAL files.
At precisely which step of the procedure did the user do something
wrong?
13.
Is there any reason we can't detect that they've done it and
throw a specific error message saying the configuration is invalid?
I'm not sure how we'd go about doing that, but I agree it would be nice.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Fri, Jun 11, 2010 at 1:01 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means you've reached end of WAL. In standby mode, when streaming WAL
from master, that shouldn't happen, and it's not clear what to do if it
does. PANIC is not a good idea, at least if the server uses hot standby,
because that only makes the situation worse from availability point of view.
So we log the error as a WARNING, and keep retrying. It's unlikely that the
problem will just go away, but we keep retrying anyway in the hope that it
does. However, it seems that we're too aggressive with the retries.
Right. The attached patch calms down the retries: if we find an invalid
record while streaming WAL from the master, we sleep for 5 seconds (should
that be reduced?) before retrying to replay the record at the same
location where the invalid one was found. Comments?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
calm_down_retries_v1.patch (application/octet-stream; +14 -0)
On Thu, 2010-06-10 at 09:57 -0400, Robert Haas wrote:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?
I ran into this problem at one point, too, but was in the middle of
trying to investigate a different bug and didn't have time to track
down what was causing it.
So you saw a bug, but didn't report it? Why was that?
I find it amazing that such an obvious problem wasn't covered in
testing, nor do we have a clear plan in the design for handling it.
How many other things are known about but not reported?
If anybody has bugs, report them now.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On 11/06/10 07:18, Fujii Masao wrote:
On Fri, Jun 11, 2010 at 1:01 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a corrupt
record means you've reached end of WAL. In standby mode, when streaming WAL
from master, that shouldn't happen, and it's not clear what to do if it
does. PANIC is not a good idea, at least if the server uses hot standby,
because that only makes the situation worse from availability point of view.
So we log the error as a WARNING, and keep retrying. It's unlikely that the
problem will just go away, but we keep retrying anyway in the hope that it
does. However, it seems that we're too aggressive with the retries.
Right. The attached patch calms down the retries: if we found an invalid
record while streaming WAL from master, we sleep for 5 seconds (needs to
be reduced?) before retrying to replay the record which is in the same
location where the invalid one was found. Comments?
Hmm, right now it doesn't even reconnect when it sees a corrupt record
streamed from the master. It's really pointless to retry in that case,
reapplying the exact same piece of WAL surely won't work. I think it
should disconnect, and then retry reading from archive and pg_xlog, and
then retry streaming again. That's pretty hopeless too, but it's at
least theoretically possible that something went wrong in the
transmission and the file in the archive is fine.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachments:
calm_down_retries_v2.patch (text/x-diff; +16 -0)
On Fri, Jun 11, 2010 at 8:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, 2010-06-10 at 09:57 -0400, Robert Haas wrote:
On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?
I ran into this problem at one point, too, but was in the middle of
trying to investigate a different bug and didn't have time to track
down what was causing it.
So you saw a bug, but didn't report it? Why was that?
Because it happened while I was in the middle of trying to investigate
a different bug, and I didn't know precisely what I had done to cause
it. I could have reported it anyway, but I thought it would not have
been that helpful to report a bug that I didn't know for sure how to
reproduce. Like everyone else here, I am working very hard and do not
always have time to follow up on every issue. I had something else
weird happen yesterday but I'm not sure I know exactly what I did and
I'm not at all sure that it wasn't just pilot error. If I start
reporting every pilot error as a bug, I'm going to get laughed out of
town, because I make a pilot error *every time* I try to set up SR
w/HS. :-(
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Fri, Jun 11, 2010 at 9:32 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Hmm, right now it doesn't even reconnect when it sees a corrupt record
streamed from the master. It's really pointless to retry in that case,
reapplying the exact same piece of WAL surely won't work. I think it should
disconnect, and then retry reading from archive and pg_xlog, and then retry
streaming again. That's pretty hopeless too, but it's at least theoretically
possible that something went wrong in the transmission and the file in the
archive is fine.
Yep, this makes more sense.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
What "warning message" are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a
corrupt record means you've reached end of WAL. In standby mode, when
streaming WAL from master, that shouldn't happen, and it's not clear
what to do if it does. PANIC is not a good idea, at least if the server
uses hot standby, because that only makes the situation worse from
availability point of view. So we log the error as a WARNING, and keep
retrying. It's unlikely that the problem will just go away, but we keep
retrying anyway in the hope that it does. However, it seems that we're
too aggressive with the retries.
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On Fri, Jun 11, 2010 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
What "warning message" are we talking about? All the error cases I can
think of in WAL-application are ERROR, or likely even PANIC.
We're talking about a corrupt record (incorrect CRC, incorrect backlink
etc.), not errors within redo functions. During crash recovery, a
corrupt record means you've reached end of WAL. In standby mode, when
streaming WAL from master, that shouldn't happen, and it's not clear
what to do if it does. PANIC is not a good idea, at least if the server
uses hot standby, because that only makes the situation worse from
availability point of view. So we log the error as a WARNING, and keep
retrying. It's unlikely that the problem will just go away, but we keep
retrying anyway in the hope that it does. However, it seems that we're
too aggressive with the retries.
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
+1. I was making this same argument (less eloquently) upthread.
I particularly like the errhint().
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
+1. I was making this same argument (less eloquently) upthread.
I particularly like the errhint().
I am wondering what action would be most likely to get the
administrator's attention.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
+1. I was making this same argument (less eloquently) upthread.
I particularly like the errhint().
I am wondering what action would be most likely to get the
administrator's attention.
I've committed the patch to disconnect the SR connection in that case.
If the message needs improvement, let's do that separately once we
figure out what to do.
Seems like we need something like WARNING that doesn't cause the process
to die, but is more alarming, like ERROR/FATAL/PANIC. Or maybe just adding
a hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was streamed from master. That can be a sign of
corruption in the master, or inconsistency between master and standby
state. The record will be re-fetched, but that is unlikely to fix the
problem. You may have to restore standby from base backup.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
+1. I was making this same argument (less eloquently) upthread.
I particularly like the errhint().
I am wondering what action would be most likely to get the
administrator's attention.
I've committed the patch to disconnect the SR connection in that case.
If the message needs improvement, let's do that separately once we
figure out what to do.
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was streamed from master. That can be a sign of
corruption in the master, or inconsistency between master and standby
state. The record will be re-fetched, but that is unlikely to fix the
problem. You may have to restore standby from base backup.
I am thinking about log monitoring tools like Nagios. I am afraid
they are never going to pick up something tagged WARNING, no matter
what the wording is. Crazy idea, but can we force a fatal error line
into the logs with something like "WARNING ...\nFATAL: ...".
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +
On Mon, Jun 14, 2010 at 12:16, Bruce Momjian <bruce@momjian.us> wrote:
Heikki Linnakangas wrote:
On 12/06/10 04:19, Bruce Momjian wrote:
Robert Haas wrote:
If my streaming replication stops working, I want to know about it as
soon as possible. WARNING just doesn't cut it.
This needs some better thought.
If we PANIC, then surely it will PANIC again when we restart unless we
do something. So we can't do that. But we need to do something better
than
WARNING there is a bug that will likely cause major data loss
HINT you'll be sacked if you miss this message
+1. I was making this same argument (less eloquently) upthread.
I particularly like the errhint().
I am wondering what action would be most likely to get the
administrator's attention.
I've committed the patch to disconnect the SR connection in that case.
If the message needs improvement, let's do that separately once we
figure out what to do.
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was streamed from master. That can be a sign of
corruption in the master, or inconsistency between master and standby
state. The record will be re-fetched, but that is unlikely to fix the
problem. You may have to restore standby from base backup.
I am thinking about log monitoring tools like Nagios. I am afraid
they are never going to pick up something tagged WARNING, no matter
If they are properly configured, I imagine they would. And if they're
not, well, there's not much for us to do.
(What would be more useful would be to separate "user warnings",
like warnings about casts, from actual system warnings like this, but
that's a whole different story.)
what the wording is. Crazy idea, but can we force a fatal error line
into the logs with something like "WARNING ...\nFATAL: ...".
That's way too crazy :P And btw, randomly sticking newlines into that
will mess up *most* log displayers and I bet a lot of the log
monitoring tools as well...
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was streamed from master. That can be a sign of
corruption in the master, or inconsistency between master and standby
state. The record will be re-fetched, but that is unlikely to fix the
problem. You may have to restore standby from base backup.
I am thinking about log monitoring tools like Nagios. I am afraid
they are never going to pick up something tagged WARNING, no matter
If they are properly configured, I imagine they would. And if they're
not, well, there's not much for us to do.
What does that mean?
(What would be more usful then would be to separate "user-warnings"
like warnings about cast from actual system-warnings like this, but
that's a whole different story)
what the wording is. Crazy idea, but can we force a fatal error line
into the logs with something like "WARNING ...\nFATAL: ...".
That's way too crazy :P And btw, randomly sticking newlines into that
will mess up *most* log displayers and I bet a lot of the log
monitoring tools as well...
Yeah, it would mess up CSV log output, for example?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +
On Mon, Jun 14, 2010 at 13:11, Bruce Momjian <bruce@momjian.us> wrote:
Magnus Hagander wrote:
Seems like we need something like WARNING that doesn't cause the process
to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
hint to the warning will do. How about
WARNING: invalid record length at 0/4005330
HINT: An invalid record was streamed from master. That can be a sign of
corruption in the master, or inconsistency between master and standby
state. The record will be re-fetched, but that is unlikely to fix the
problem. You may have to restore standby from base backup.
I am thinking about log monitoring tools like Nagios. I am afraid
they are never going to pick up something tagged WARNING, no matter
If they are properly configured, I imagine they would. And if they're
not, well, there's not much for us to do.
What does that mean?
It means that we can't prevent people from configuring their tools to
ignore important warnings. We can't prevent them from ignoring ERROR or
FATAL either...
(What would be more usful then would be to separate "user-warnings"
like warnings about cast from actual system-warnings like this, but
that's a whole different story)
what the wording is. Crazy idea, but can we force a fatal error line
into the logs with something like "WARNING ...\nFATAL: ...".
That's way too crazy :P And btw, randomly sticking newlines into that
will mess up *most* log displayers and I bet a lot of the log
monitoring tools as well...
Yeah, it would mess up CSV log output, for example?
Possibly. And your general syslog output would become pretty hard to
read once this kind of stuff shows up.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/