repmgr problem with registering standby
Hi,
I have repmgr working to some degree on a couple of servers, but am
having trouble with the "register" part on the slave.
On the master, I run:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
--verbose --force master register
Opening configuration file: /etc/repmgr/validator/repmgr.conf
repmgr connecting to master database
repmgr connected to master, checking its state
finding node list for cluster 'validator'
Master node correctly registered for cluster validator with id 0
(conninfo: host=10.133.54.2 port=5432 user=repmgr dbname=repmgr)
So that looks good, but then I try this on the slave:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
--verbose standby register
Opening configuration file: /etc/repmgr/validator/repmgr.conf
repmgr connecting to standby database
repmgr connected to standby, checking its state
repmgr connecting to master database
finding node list for cluster 'validator'
A master must be defined before configuring a slave
I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
id | cluster | conninfo
----+-----------+------------------------------------------------------
0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)
Does anyone have an idea of what might be going wrong here?
Thanks,
Toby
On Wed, Jul 27, 2011 at 10:36 AM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
[...]
Hi, thanks for using repmgr.
What version of repmgr are you using? What version of PostgreSQL?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Simon,
We're using version 1.1.0 of repmgr, against PostgreSQL 9.0.4-1~bpo60+1 (i.e. the version from backports) on Debian squeeze.
To complicate matters, we have several postgresql instances per machine, using Debian's pg cluster stuff. This seems to work elsewhere with repmgr though (as long as we make sure the ports are specified in the repmgr configs).
However, I'm not having any success even with just the default/single instance of pg either at the moment.
Cheers,
Toby
On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
So that looks good, but then I try this on the slave:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
--verbose standby register
can you show the content of /etc/repmgr/validator/repmgr.conf?
[...]
I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
id | cluster | conninfo
----+-----------+------------------------------------------------------
0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)
this is on the master or the slave?
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
On 28/07/11 03:47, Jaime Casanova wrote:
On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
So that looks good, but then I try this on the slave:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
--verbose standby register
can you show the content of /etc/repmgr/validator/repmgr.conf?
cluster=validator
node=mel-db06
conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
id | cluster | conninfo
----+-----------+------------------------------------------------------
0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)
this is on the master or the slave?
I ran that on the slave; however I've just checked now, and the same
results are given on both nodes.
Just so you know, db06=10.133.54.1 and db07=10.133.54.2.
They also have a second address each on the 192.168.10.x network as well
though.
Toby
On Wed, Jul 27, 2011 at 7:24 PM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
On 28/07/11 03:47, Jaime Casanova wrote:
On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
So that looks good, but then I try this on the slave:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
--verbose standby register
can you show the content of /etc/repmgr/validator/repmgr.conf?
cluster=validator
node=mel-db06
conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
sorry for the delay on this... do you still have this problem?
the node parameter should be an integer value, i don't think that
string should work for you
I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
id | cluster | conninfo
----+-----------+------------------------------------------------------
0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)
if, on the standby, the string you're using as the node value ends up
being read as 0, then repmgr never checks node 0 when looking for the
master (its own node couldn't be the master, because you're registering
it as a standby)
so i bet that's the problem: use numbers in the node parameter and
everything will be ok
i will have to add a check against this case in repmgr, though
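To illustrate the collision described above, here is a minimal sketch in Python. It assumes the node value is read with something that behaves like C's atoi(), which the thread does not confirm; it only says the string may "end up as a 0". The function names atoi_like and parse_node_id are hypothetical and are not part of repmgr.

```python
import re


def atoi_like(text: str) -> int:
    """Mimic C's atoi(): optional sign plus leading digits, otherwise 0.

    Assumption: repmgr reads the node value in a similarly lenient way.
    """
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def parse_node_id(text: str) -> int:
    """Strict alternative: accept only a plain non-negative integer."""
    value = text.strip()
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"node must be an integer, got {value!r}")
    return int(value)


# The hostname-style value from this thread has no leading digits, so a
# lenient parse yields 0, the same id the master was registered with.
standby_node = atoi_like("mel-db06")
master_node = 0
collides_with_master = standby_node == master_node
```

With a strict parse, the same value is rejected up front instead of silently becoming 0, which is the kind of check mentioned above.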
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
On 02/08/11 01:05, Jaime Casanova wrote:
On Wed, Jul 27, 2011 at 7:24 PM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
[...]
sorry for the delay on this... do you still have this problem?
We did, yes..
the node parameter should be an integer value, i don't think that
string should work for you
Ah! Right, yes, changing that to integer values on all the nodes
concerned has indeed solved the problem - once I manually deleted the
repmgr schema from the database. (It wouldn't replace the master, even
with --force)
I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
id | cluster | conninfo
----+-----------+------------------------------------------------------
0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)
if in the standby that string you're using as node value ends up as a
0 then it never asks for the node 0 (it couldn't be the master because
you're just registering as a standby)
so i bet that's the problem, use numbers in the node parameter and
everything will be ok
i will have to add a check against this case in repmgr, though
Is there some documentation detailing the format of the repmgr.conf
file? Both I and another guy here have looked at it, and neither of us
spotted that node was only supposed to contain integers.
For that matter - is there a reason it has to be an integer? Allowing
hostnames there would be more friendly. Using integers means someone has
to maintain a mapping on node IDs to hostnames in a separate place, and
then that leads to mistakes, like someone thinking the standby node (2)
is the master hostname :/
Thanks for your help tracking this down!
Cheers,
Toby
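Since the integer requirement was easy to miss, a small pre-flight check on repmgr.conf could catch it before running "standby register". The sketch below assumes the simple key=value layout shown in this thread and treats cluster, node and conninfo as the expected keys. Skipping "#" comment lines is also an assumption. The helpers read_conf and check_conf are hypothetical, not repmgr functions.

```python
import re


def read_conf(text: str) -> dict:
    """Parse key=value lines; blank lines and '#' comments are skipped (assumed)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so conninfo keeps its own '=' signs.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def check_conf(text: str) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    settings = read_conf(text)
    problems = []
    for key in ("cluster", "node", "conninfo"):
        if key not in settings:
            problems.append(f"missing setting: {key}")
    node = settings.get("node")
    if node is not None and not re.fullmatch(r"[0-9]+", node):
        problems.append(f"node must be an integer, got {node!r}")
    return problems


# The configuration posted earlier in the thread.
original = """cluster=validator
node=mel-db06
conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
"""
print(check_conf(original))
```

Run against the configuration posted earlier, this flags the node line. Changing it to an integer clears the report.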
For that matter - is there a reason it has to be an integer? Allowing hostnames there would be more friendly. Using integers means someone has to maintain a mapping on node IDs to hostnames in a separate place, and then that leads to mistakes, like someone thinking the standby node (2) is the master hostname :/
As a quick observation, the host name, by itself, doesn't seem to be a candidate key. It would probably have made sense to use a varchar instead of an integer but it seems people treat such a key type as forbidden.
David J.