Postgresql Split Brain: Which one is latest

Started by Vikas Sharma almost 8 years ago, 13 messages, general
#1 Vikas Sharma
shavikas@gmail.com

Hi,

We have PostgreSQL 9.5 with streaming replication (master-slave) and
automatic failover. Due to a network glitch, we have been in a master-master
situation for quite some time. Could you please advise on the best way to
confirm which node has the latest updates to the databases?

Regards
Vikas Sharma

#2 Achilleas Mantzios
achill@matrix.gatewaynet.com
In reply to: Vikas Sharma (#1)
Re: Postgresql Split Brain: Which one is latest

The one with the latest timeline.

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
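
One way to read the timeline on each node, sketched under assumptions: on PostgreSQL 9.5, `SELECT pg_xlogfile_name(pg_current_xlog_location());` returns the current WAL segment file name, and its first 8 hexadecimal digits are the timeline ID (in version 10 these functions were renamed to `pg_walfile_name` and `pg_current_wal_lsn`). The helper below only parses such a file name; the function names and sample file names are made up for illustration. A higher timeline shows which node was promoted, which is not necessarily the node holding the data you want to keep.

```python
# Sketch: extract the timeline ID from a WAL segment file name.
# The file name would come from running, on each node (9.5 naming):
#   SELECT pg_xlogfile_name(pg_current_xlog_location());
# Function and variable names here are illustrative, not part of PostgreSQL.

def timeline_of(wal_file_name: str) -> int:
    """Return the timeline ID encoded in the first 8 hex digits."""
    name = wal_file_name.strip()
    if len(name) != 24:
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return int(name[:8], 16)


def node_with_latest_timeline(wal_names: dict) -> str:
    """Given {node_name: wal_file_name}, return the node on the highest timeline."""
    return max(wal_names, key=lambda node: timeline_of(wal_names[node]))


# Made-up sample values for two nodes after a failover.
sample = {
    "old_master": "000000010000000A000000C3",
    "promoted_slave": "000000020000000A000000C4",
}
print(node_with_latest_timeline(sample))
```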

#3 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Vikas Sharma (#1)
Re: Postgresql Split Brain: Which one is latest

It might help to know how the two masters received data while they were
operating independently.

--
Adrian Klaver
adrian.klaver@aklaver.com

#4 Vikas Sharma
shavikas@gmail.com
In reply to: Adrian Klaver (#3)
Re: Postgresql Split Brain: Which one is latest

Hi Adrian,

Here is a good example: an application server, e.g. Tomcat, has two
database connection entries, one for the master and a second for the slave
(ideally used only once the slave has become master). If the application
cannot connect to the first, it tries to connect to the second.

Regards
Vikas

#5 Melvin Davidson
melvin6925@gmail.com
In reply to: Vikas Sharma (#4)
Re: Postgresql Split Brain: Which one is latest

Vikas,

Presuming the real "master" will have additional records/rows inserted
in the tables, if you run ANALYZE on the database(s) in both "masters" and
then execute the following query on both, whichever returns the highest
count would be the real "master".

 SELECT sum(c.reltuples::bigint)
    FROM pg_stat_all_tables s
      JOIN pg_class c ON c.oid = s.relid
 WHERE s.relname NOT LIKE 'pg_%'
   AND s.relname NOT LIKE 'sql_%';

--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!

#6 Edson Carlos Ericksson Richter
richter@simkorp.com.br
In reply to: Melvin Davidson (#5)
Re: Postgresql Split Brain: Which one is latest

I'm just trying to understand the scenario...

Correct me if I'm wrong: if you had two servers acting as master for a
while, then records were inserted/updated on both servers, and you
will need some kind of "merge" of records into one of the databases,
which will become the new, updated master...

If you have "sequences" (or "serial" fields), then you will have a bit
of trouble on your hands.

Regards,

Edson
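
The sequence problem mentioned above can be illustrated with a small simulation. It assumes, as a simplification, that both nodes held the same sequence value at the moment of the split and then each handed out IDs independently. All values and names are made up.

```python
# Sketch: two nodes that diverged from the same sequence value hand out
# the same IDs to different rows. Values and names are illustrative only.
from itertools import count


def ids_issued(last_value_at_split: int, inserts: int) -> list:
    """IDs a node would assign after the split, one per insert."""
    seq = count(last_value_at_split + 1)
    return [next(seq) for _ in range(inserts)]


old_master_ids = ids_issued(last_value_at_split=1000, inserts=5)
new_master_ids = ids_issued(last_value_at_split=1000, inserts=3)

# IDs present on both nodes but referring to different rows.
colliding = sorted(set(old_master_ids) & set(new_master_ids))
print(colliding)
```

Under these assumptions, a merge would likely have to re-key the colliding rows on one side, together with any rows in other tables that reference them.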

#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Vikas Sharma (#4)
Re: Postgresql Split Brain: Which one is latest

So the application server had a way of seeing the new master (old slave),
in spite of the network glitch, that the original master database did not?

If so, and it was distributing data between the two masters on an unknown
schedule, then, as Edson pointed out in another post, you really have a
split-brain issue. Each master would have its own view of the data, and
the latest update would really only be relevant for that master.

--
Adrian Klaver
adrian.klaver@aklaver.com

#8 Vikas Sharma
shavikas@gmail.com
In reply to: Adrian Klaver (#7)
Re: Postgresql Split Brain: Which one is latest

Thanks Adrian and Edson, I think so too. At the moment I have two masters.
As soon as the slave was promoted to master it started its own timeline, and
the application might have added data to either of them or to both. The only
way I see to pick the correct master now is to take the instance with the
maximum row count in its tables, which could incur data loss as well. Please
correct me if I'm wrong.

Thanks and Regards
Vikas

#9 Ron
ronljohnsonjr@gmail.com
In reply to: Vikas Sharma (#8)
Re: Postgresql Split Brain: Which one is latest

You need to find out when the split happened, and whether each new master
has records since then.

--
Angular momentum makes the world go 'round.
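
A minimal sketch of that check, assuming the tables carry a modification timestamp (the `updated_at` column, the `orders` table in the comment, and the split time are all hypothetical) and that the time of the split is known, for example from the server log of the promoted node:

```python
# Sketch: list the rows each node changed after the split.
# Hypothetical SQL this stands in for (table/column names are assumptions):
#   SELECT id FROM orders WHERE updated_at > '2018-04-10 06:00:00';
from datetime import datetime


def changed_since(rows, split_time):
    """Return the rows modified after the split."""
    return [row for row in rows if row["updated_at"] > split_time]


split_time = datetime(2018, 4, 10, 6, 0)  # made-up split time

old_master_rows = [
    {"id": 1, "updated_at": datetime(2018, 4, 9, 23, 0)},
    {"id": 2, "updated_at": datetime(2018, 4, 10, 7, 30)},
]
new_master_rows = [
    {"id": 1, "updated_at": datetime(2018, 4, 9, 23, 0)},
    {"id": 3, "updated_at": datetime(2018, 4, 10, 8, 15)},
    {"id": 4, "updated_at": datetime(2018, 4, 10, 9, 0)},
]

for node, rows in (("old_master", old_master_rows), ("new_master", new_master_rows)):
    print(node, [row["id"] for row in changed_since(rows, split_time)])
```

If both lists are non-empty, as in this made-up data, neither node can simply be discarded without losing changes.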

#10 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Vikas Sharma (#8)
Re: Postgresql Split Brain: Which one is latest

Not sure max count is necessarily a valid indicator:

1) What if there was a legitimate large delete process?

2) The application/end users were looking at two different views of the
data at different points in time. Just because the count is higher does
not mean the data is actually valid.

--
Adrian Klaver
adrian.klaver@aklaver.com
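
The two objections above can be made concrete with made-up data. The sketch assumes the primary keys of one table have been fetched from each node (for instance with a hypothetical `SELECT id FROM some_table;`) and compares the key sets instead of the totals.

```python
# Sketch: why a higher total row count does not identify the "right" node.
# Keys are made up; in practice they would come from something like
#   SELECT id FROM some_table;   -- hypothetical table and column
# run on each node.

def compare_keys(keys_a: set, keys_b: set) -> dict:
    """Summarise which primary keys exist on only one of the two nodes."""
    return {
        "only_on_a": sorted(keys_a - keys_b),
        "only_on_b": sorted(keys_b - keys_a),
        "on_both": len(keys_a & keys_b),
    }


# Node A never saw the delete of rows 6-10; node B did, and also got two inserts.
node_a = set(range(1, 11))        # 10 rows
node_b = {1, 2, 3, 4, 5, 11, 12}  # 7 rows

report = compare_keys(node_a, node_b)
print(len(node_a), len(node_b), report)
```

Here node A has the higher count, yet node B holds the legitimate delete and the only copies of rows 11 and 12.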

#11 Vikas Sharma
shavikas@gmail.com
In reply to: Adrian Klaver (#10)
Re: Postgresql Split Brain: Which one is latest

Max count is one way (a vague one, I agree). Before confirming, I will ask
the application owner to have a look at the data in the tables as well.

Regards

#12 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Vikas Sharma (#11)
Re: Postgresql Split Brain: Which one is latest

Along those lines, and depending on the size of the database and the
individual tables, you might try:

1) Do a plain text dump of the data from the same table from each master.

2) Diff the data dumps.

--
Adrian Klaver
adrian.klaver@aklaver.com
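
The dump-and-diff steps can be sketched with Python's standard `difflib`. The sketch assumes each dump is already available as text with one row per line (for example the tab-separated rows that `pg_dump --data-only --table=...` writes in its COPY section). It sorts the lines first, because the two nodes may emit rows in a different physical order. The sample rows and labels are made up.

```python
# Sketch: compare two plain-text table dumps line by line.
# Sorting first makes the comparison independent of row order.
import difflib


def diff_dumps(dump_a: str, dump_b: str) -> list:
    """Return '-row' lines found only in dump_a and '+row' lines only in dump_b."""
    a = sorted(dump_a.splitlines())
    b = sorted(dump_b.splitlines())
    lines = list(
        difflib.unified_diff(a, b, fromfile="master_a", tofile="master_b",
                             lineterm="", n=0)
    )
    # Skip the two file-header lines, then drop the "@@ ... @@" hunk markers.
    return [line for line in lines[2:] if not line.startswith("@@")]


# Made-up tab-separated rows: id, name, amount.
dump_a = "1\talice\t10\n2\tbob\t20\n3\tcarol\t30\n"
dump_b = "2\tbob\t25\n1\talice\t10\n4\tdave\t40\n"

for line in diff_dumps(dump_a, dump_b):
    print(line)
```

Rows that were updated on one side show up twice, once with `-` and once with `+`, so the output also reveals conflicting versions of the same row.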

#13 Jehan-Guillaume de Rorthais
In reply to: Vikas Sharma (#11)
Re: Postgresql Split Brain: Which one is latest

Maybe you could compare your tables on both sides using a tool like
pg_comparator? See:

https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html

By the way, what are you using for your auto-failover? What went wrong for
you to end up in a split-brain situation?

Regards,

--
Jehan-Guillaume de Rorthais
Dalibo