pgsql: Check compulsory parameters in recovery.conf in standby_mode, per
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.
Modified Files:
--------------
pgsql/src/backend/access/transam:
xlog.c (r1.387 -> r1.388)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.387&r2=1.388)
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.
On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?
(*1) http://archives.postgresql.org/pgsql-hackers/2010-03/msg01306.php
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Mon, 2010-04-05 at 17:08 +0900, Fujii Masao wrote:
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?(*1) http://archives.postgresql.org/pgsql-hackers/2010-03/msg01306.php
This change was made because on a separate thread Robert Haas complained
about the server hanging when only standby_mode was set. I verified the
hang and "fixed" the issue. I have added a comment to the previous
thread to restart the discussion.
If the capability that the server sit quietly doing nothing for ever was
somehow important then we should document that both in code and in docs.
I am very disconcerted that there are still no docs whatsoever to
describe how the server works in these new modes.
--
Simon Riggs www.2ndQuadrant.com
Simon Riggs wrote:
I am very disconcerted that there are still no docs whatsoever to
describe how the server works in these new modes.
I did add a few paragraphs last week, see
http://developer.postgresql.org/pgdocs/postgres/warm-standby.html#STANDBY-SERVER-OPERATION.
It doesn't explicitly talk about that configuration where both
primary_conninfo and restore_command are missing (or misconfigured), but
it does say that it polls pg_xlog. How does that read to you?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Mon, 2010-04-05 at 19:01 +0300, Heikki Linnakangas wrote:
Simon Riggs wrote:
I am very disconcerted that there are still no docs whatsoever to
describe how the server works in these new modes.I did add a few paragraphs last week, see
http://developer.postgresql.org/pgdocs/postgres/warm-standby.html#STANDBY-SERVER-OPERATION.
It doesn't explicitly talk about that configuration where both
primary_conninfo and restore_command are missing (or misconfigured), but
it does say that it polls pg_xlog. How does that read to you?
It is better, though it might mislead others into thinking I mean
sufficient, which it is not. I do not mean to be rude, but we must be
clear that a major feature requires matching documentation.
Looking through the code some more I note that their are two timing
related parameters that are hardcoded into XLogPageRead(). At the very
least such things should be #defines, though one of them was previously
configurable using pg_standby, so I would like to see them both
accessible to user tuning.
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.
--
Simon Riggs www.2ndQuadrant.com
On Mon, 2010-04-05 at 17:08 +0900, Fujii Masao wrote:
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?
The attached patch changes the messages and downgrades FATAL to WARNING.
Comments?
--
Simon Riggs www.2ndQuadrant.com
Attachments:
msg_change.patchtext/x-patch; charset=UTF-8; name=msg_change.patchDownload+4-3
On Mon, 2010-04-05 at 19:29 +0100, Simon Riggs wrote:
Looking through the code some more I note that their are two timing
related parameters that are hardcoded into XLogPageRead(). At the very
least such things should be #defines, though one of them was previously
configurable using pg_standby, so I would like to see them both
accessible to user tuning.
...the code says "we want to react quickly when the next WAL record
arrives" and then sleeps for 100ms.
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.
--
Simon Riggs www.2ndQuadrant.com
Simon Riggs escribi�:
On Mon, 2010-04-05 at 17:08 +0900, Fujii Masao wrote:
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?The attached patch changes the messages and downgrades FATAL to WARNING.
Comments?
Please note that errdetail must be a complete sentence.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Mon, 2010-04-05 at 15:02 -0400, Alvaro Herrera wrote:
Simon Riggs escribió:
On Mon, 2010-04-05 at 17:08 +0900, Fujii Masao wrote:
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?The attached patch changes the messages and downgrades FATAL to WARNING.
Comments?
Please note that errdetail must be a complete sentence.
Better?
--
Simon Riggs www.2ndQuadrant.com
Attachments:
msg_change.patchtext/x-patch; charset=UTF-8; name=msg_change.patchDownload+4-3
On Mon, Apr 5, 2010 at 3:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Mon, 2010-04-05 at 15:02 -0400, Alvaro Herrera wrote:
Simon Riggs escribió:
On Mon, 2010-04-05 at 17:08 +0900, Fujii Masao wrote:
On Sat, Apr 3, 2010 at 6:50 AM, Simon Riggs <sriggs@postgresql.org> wrote:
Log Message:
-----------
Check compulsory parameters in recovery.conf in standby_mode, per docs.On the recent discussion (*1), some people argued that specifying neither
primary_conninfo nor restore_command in the standby mode is not unreasonable,
and we reached the consensus that the setting should be allowed. So this
commit doesn't reflect the discussion. How about reverting the commit,
and restarting the discussion if you have complaint against the consensus?The attached patch changes the messages and downgrades FATAL to WARNING.
Comments?
Please note that errdetail must be a complete sentence.
Better?
I haven't tested this (bad words to start out with, I know) but
assuming this message is going to print once on startup rather than
repeatedly, +1 from me.
...Robert
Simon Riggs <simon@2ndQuadrant.com> writes:
(errmsg("recovery command file \"%s\" specified neither primary_conninfo nor restore_command", - RECOVERY_COMMAND_FILE))); + RECOVERY_COMMAND_FILE), + errdetail("The database server will regularly poll the pg_xlog subdirectory to check for files placed there.")));
That's not a "detail", as it has nothing to do with details of the error
condition. It might pass muster as a "hint", though I'm not entirely
sure what the point is.
regards, tom lane
On Mon, 2010-04-05 at 15:58 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
(errmsg("recovery command file \"%s\" specified neither primary_conninfo nor restore_command", - RECOVERY_COMMAND_FILE))); + RECOVERY_COMMAND_FILE), + errdetail("The database server will regularly poll the pg_xlog subdirectory to check for files placed there.")));That's not a "detail", as it has nothing to do with details of the error
condition. It might pass muster as a "hint", though I'm not entirely
sure what the point is.
The server sits around doing nothing and the point of the message is to
explain why that is and give a hint as to what might change that
situation, if anything.
--
Simon Riggs www.2ndQuadrant.com
On Mon, Apr 5, 2010 at 4:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Mon, 2010-04-05 at 15:58 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
(errmsg("recovery command file \"%s\" specified neither primary_conninfo nor restore_command", - RECOVERY_COMMAND_FILE))); + RECOVERY_COMMAND_FILE), + errdetail("The database server will regularly poll the pg_xlog subdirectory to check for files placed there.")));That's not a "detail", as it has nothing to do with details of the error
condition. It might pass muster as a "hint", though I'm not entirely
sure what the point is.The server sits around doing nothing and the point of the message is to
explain why that is and give a hint as to what might change that
situation, if anything.
Yeah, I think that's good information to provide. But I have to admit
that I too considered suggesting changing it from errdetail to
errhint, so the fact that Tom had the same thought suggests to me that
that's probably better.
...Robert
On Tue, Apr 6, 2010 at 3:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.
I don't think this is so useful in 9.0 since synchronous replication
(i.e., transaction commit wait for WAL to be replayed by the standby)
is not supported.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Simon Riggs wrote:
On Mon, 2010-04-05 at 19:29 +0100, Simon Riggs wrote:
Looking through the code some more I note that their are two timing
related parameters that are hardcoded into XLogPageRead(). At the very
least such things should be #defines, though one of them was previously
configurable using pg_standby, so I would like to see them both
accessible to user tuning....the code says "we want to react quickly when the next WAL record
arrives" and then sleeps for 100ms.
The comment continues ", so sleep only a bit". That's in comparison to
the 5 s wait when polling the archive.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Fujii Masao wrote:
On Tue, Apr 6, 2010 at 3:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.I don't think this is so useful in 9.0 since synchronous replication
(i.e., transaction commit wait for WAL to be replayed by the standby)
is not supported.
Well, it would still be useful, as it would shorten the delay. But yeah,
there's a delay in asynchronous replication anyway, so we decided to
keep it simple and just poll. It's not ideal and definitely needs to be
revisited for synchronous mode in the future. Same for walsenders.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, 2010-04-06 at 10:19 +0300, Heikki Linnakangas wrote:
Fujii Masao wrote:
On Tue, Apr 6, 2010 at 3:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.I don't think this is so useful in 9.0 since synchronous replication
(i.e., transaction commit wait for WAL to be replayed by the standby)
is not supported.Well, it would still be useful, as it would shorten the delay. But yeah,
there's a delay in asynchronous replication anyway, so we decided to
keep it simple and just poll. It's not ideal and definitely needs to be
revisited for synchronous mode in the future. Same for walsenders.
A signal seems fairly straightforward to me, the archiver did this in
8.0 and it was not considered complex then. Quite why it would be
complex here is not clear.
I'm not happy that it waits, nor that the wait is non-tunable. I would
like to see a new parameter added for this.
--
Simon Riggs www.2ndQuadrant.com
Simon Riggs wrote:
On Tue, 2010-04-06 at 10:19 +0300, Heikki Linnakangas wrote:
Fujii Masao wrote:
On Tue, Apr 6, 2010 at 3:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.I don't think this is so useful in 9.0 since synchronous replication
(i.e., transaction commit wait for WAL to be replayed by the standby)
is not supported.Well, it would still be useful, as it would shorten the delay. But yeah,
there's a delay in asynchronous replication anyway, so we decided to
keep it simple and just poll. It's not ideal and definitely needs to be
revisited for synchronous mode in the future. Same for walsenders.A signal seems fairly straightforward to me, the archiver did this in
8.0 and it was not considered complex then. Quite why it would be
complex here is not clear.
The other side of the problem is that walsender polls too. Eliminating
the delay from walreceiver doesn't buy you much unless you eliminate the
delay from the walsender too. And things get complicated there. Do you
signal the walsenders at every commit? That would be a lot of volume,
and adds more work for every normal transaction in the primary. Maybe
not much, but it would be one more thing to worry about and test.
I'm not happy that it waits, nor that the wait is non-tunable. I would
like to see a new parameter added for this.
I wanted to keep it simple for users, but feel free to add a parameter
if you feel it must be configurable.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, 2010-04-06 at 12:38 +0300, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Tue, 2010-04-06 at 10:19 +0300, Heikki Linnakangas wrote:
Fujii Masao wrote:
On Tue, Apr 6, 2010 at 3:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
I was also surprised to note that the Startup process is not signaled by
WALReceiver when new WAL is received, so it will continue sleeping even
though it has work to do.I don't think this is so useful in 9.0 since synchronous replication
(i.e., transaction commit wait for WAL to be replayed by the standby)
is not supported.Well, it would still be useful, as it would shorten the delay. But yeah,
there's a delay in asynchronous replication anyway, so we decided to
keep it simple and just poll. It's not ideal and definitely needs to be
revisited for synchronous mode in the future. Same for walsenders.A signal seems fairly straightforward to me, the archiver did this in
8.0 and it was not considered complex then. Quite why it would be
complex here is not clear.The other side of the problem is that walsender polls too. Eliminating
the delay from walreceiver doesn't buy you much unless you eliminate the
delay from the walsender too. And things get complicated there. Do you
signal the walsenders at every commit? That would be a lot of volume,
and adds more work for every normal transaction in the primary. Maybe
not much, but it would be one more thing to worry about and test.
You are trying to connect two unrelated things.
We can argue that the WALSender's delay allows it to build up a good
sized batch of work to transfer.
Having the Startup process wait does not buy us anything at all.
Currently if the Startup process finishes more quickly than the
WALreceiver it will wait for 100ms.
I am surprised at your arguments for simplicity. With Hot Standby you
have insisted that everything should be in place. With SR you seem to
have just stopped at a barely working, poorly documented implementation.
We both know you can fix these things easily and quickly. Please do so.
Not because I say so, but because everybody else will soon notice that
you could have and did not.
--
Simon Riggs www.2ndQuadrant.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Well, it would still be useful, as it would shorten the delay. But yeah,
there's a delay in asynchronous replication anyway, so we decided to
keep it simple and just poll. It's not ideal and definitely needs to be
revisited for synchronous mode in the future. Same for walsenders.
Stop me if I misunderstood the case at hand, but while waiting some more
for having a sizeable batch to send makes a lot of sense to me, waiting
on the receiver side when there's some work to do will only forbids a
slow slave to keep up with the load, increasing lag artificially.
I'm used to asynchronous replication where you're never allowed to rest
if some batch is ready for you to process.
Regards,
--
dim