bdr manual cleanup required

Started by Selim Tuvi, over 10 years ago, 7 messages, general

stuvi@ilm.com

over 10 years ago

I am trying to repair a broken bdr cluster setup, and so far everything I have tried has failed. On the original node that ran bdr.bdr_group_create I am getting the following error:

2015-12-04 19:34:29.063 UTC,,,22991,,5661eac4.59cf,1,,2015-12-04 19:34:28 UTC,3/0,0,ERROR,55000,"previous init failed, manual cleanup is required","Found bdr.bdr_nodes entry for bdr (6224504646761731677,1,16389,) with state=i in remote bdr.bdr_nodes","Remove all replication identifiers and slots corresponding to this node from the init target node then drop and recreate this database and try again",,,,,,,"bdr (6224504646761731677,1,16389,): perdb"
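
The detail and hint fields of this log entry point at two things to look at: the bdr.bdr_nodes row left in state 'i', and the replication slots that belong to that node. Below is a minimal sketch of how one might inspect both before deciding what to remove. It assumes the BDR 0.9 catalog layout (the node_sysid, node_timeline, node_dboid and node_status columns) and that BDR slots can be recognised by the output plugin name 'bdr'. The slot name in the commented-out drop is a placeholder, not a value from this thread.

```sql
-- Run on the init target node named in the error.
-- Node rows as seen from this node; the error complains about a row in state 'i'.
SELECT node_sysid, node_timeline, node_dboid, node_status
FROM bdr.bdr_nodes
ORDER BY node_status;

-- Replication slots on this node that use the BDR output plugin (assumed to be 'bdr').
SELECT slot_name, plugin, database, active
FROM pg_replication_slots
WHERE plugin = 'bdr';

-- Dropping a stale slot only works while the slot is inactive.
-- '<stale_slot_name>' is a placeholder; substitute a slot_name from the query above.
-- SELECT pg_drop_replication_slot('<stale_slot_name>');
```

Replication identifiers, which the hint also mentions, are not covered by this sketch.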

Is there a way to get the cluster in a correct state without having to drop the db?

Thanks
-Selim

Sylvain MARECHAL

marechal.sylvain2@gmail.com

over 10 years ago

In reply to: Selim Tuvi (#1)

Re: bdr manual cleanup required

Did you try this (from https://github.com/2ndQuadrant/bdr/issues/127)?

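BEGIN;
-- For this transaction only: skip BDR's global DDL lock, its unsafe-command
-- checks, and replication of the DDL below to other nodes.
SET LOCAL bdr.skip_ddl_locking = on;
SET LOCAL bdr.permit_unsafe_ddl_commands = on;
SET LOCAL bdr.skip_ddl_replication = on;
-- Remove the BDR security label from the database (replace mydb with your database name).
SECURITY LABEL FOR bdr ON DATABASE mydb IS NULL;
-- Clear the local BDR connection and node metadata.
DELETE FROM bdr.bdr_connections;
DELETE FROM bdr.bdr_nodes;
SELECT bdr.bdr_connections_changed();
COMMIT;

-- Terminate the per-database BDR worker for the current database.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = current_database() AND application_name LIKE '%): perdb';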
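
After running the cleanup, it may help to confirm that nothing was left behind on the node. This is only a sketch, reusing the tables and the application_name pattern from the commands above. The expectation that all three counts come back as zero is an assumption and is not stated in the linked issue.

```sql
-- Count leftover BDR metadata rows and per-database workers on this node.
SELECT
  (SELECT count(*) FROM bdr.bdr_nodes)       AS remaining_nodes,
  (SELECT count(*) FROM bdr.bdr_connections) AS remaining_connections,
  (SELECT count(*)
     FROM pg_stat_activity
    WHERE datname = current_database()
      AND application_name LIKE '%): perdb') AS perdb_workers;
```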

So far, I have never been in a situation where I had to destroy all the
databases on all nodes.

Sylvain


Selim Tuvi

stuvi@ilm.com

over 10 years ago

In reply to: Sylvain MARECHAL (#2)

Re: bdr manual cleanup required

Thanks Sylvain, I ran those commands on all nodes, dropped the db on all but the first node, and rejoined them to the cluster.

Unfortunately the node_status still says "i" for the second and third nodes when I look at bdr.bdr_nodes on the first node.

On the second node, node_status is "r" for all nodes; on the third node, it is "i" only for the second node.

There are no warning or error entries in the log files on any node, but replication works only from the first node to the second and third nodes, and from the second node to the third node.

-Selim


Sylvain MARECHAL

marechal.sylvain2@gmail.com

over 10 years ago

In reply to: Selim Tuvi (#3)

Re: bdr manual cleanup required

I noticed this 'i' state with bdr 0.9.1
(https://github.com/2ndQuadrant/bdr/issues/145),
but as far as I understand it is not the same problem.
In my case, the problem appeared while the database was being constantly updated.
(I was not able to reproduce it with 0.9.3.)

Note that I sometimes saw this 'i' state with only two nodes on version
0.9.3, but it didn't seem to affect replication, even though I am not
comfortable with it.

Sylvain


Craig Ringer

craig@2ndquadrant.com

over 10 years ago

In reply to: Sylvain MARECHAL (#4)

Re: bdr manual cleanup required

Are you adding more than one node at once?

BDR isn't currently smart enough to handle that. Make sure to wait until
one node is fully synced up before adding another.
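
One way to follow this advice is to check node state between joins. The sketch below assumes BDR 0.9, where bdr.bdr_node_join_wait_for_ready() is available on the joining node and bdr.bdr_nodes has node_name and node_status columns. It is an illustration, not a procedure given in this thread.

```sql
-- On the node that has just called bdr.bdr_group_join():
-- block until the local join has finished.
SELECT bdr.bdr_node_join_wait_for_ready();

-- On each existing node, before starting the next join:
-- list any node this node does not yet see as ready ('r').
SELECT node_name, node_status
FROM bdr.bdr_nodes
WHERE node_status <> 'r';
```

If the second query returns rows on any node, the previous join has not fully settled from that node's point of view.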

Sylvain MARECHAL

marechal.sylvain2@gmail.com

over 10 years ago

In reply to: Craig Ringer (#5)

Re: bdr manual cleanup required

On 09/12/2015 05:18, Craig Ringer wrote:

> Are you adding more than one node at once?
>
> BDR isn't currently smart enough to handle that. Make sure to wait
> until one node is fully synced up before adding another.

In other words, when there are more than two nodes, one should not attempt
to add a new node unless the other nodes are all in the 'r' (ready) state?

But what if one gets this 'i' state with only two nodes? In my case, with
only two nodes, both nodes had the state 'r' on one side, while the states
were 'r' and 'i' on the other side.

Thank you,

Sylvain


Craig Ringer

craig@2ndquadrant.com

over 10 years ago

In reply to: Sylvain MARECHAL (#6)

Re: bdr manual cleanup required

I really couldn't say with the available information.

Can you provide the step-by-step process by which you set up these nodes?