Thread: Postgresql Split Brain: Which one is latest

Postgresql Split Brain: Which one is latest

From
Vikas Sharma
Date:
Hi,

We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.

Regards
Vikas Sharma

Re: Postgresql Split Brain: Which one is latest

From
Achilleas Mantzios
Date:
On 10/04/2018 16:50, Vikas Sharma wrote:
> Hi,
>
> We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are
inmaster-master situation for quite some time. Please, could you advise best way 
 
> to confirm which node is latest in terms of updates to the postgres databases.
The one with the latest timeline.
>
> Regards
> Vikas Sharma


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: Postgresql Split Brain: Which one is latest

From
Adrian Klaver
Date:
On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> Hi,
> 
> We have postgresql 9.5 with streaming replication(Master-slave) and 
> automatic failover. Due to network glitch we are in master-master 
> situation for quite some time. Please, could you advise best way to 
> confirm which node is latest in terms of updates to the postgres databases.

It might help to know how the two masters received data when they where 
operating independently.

> 
> Regards
> Vikas Sharma


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Postgresql Split Brain: Which one is latest

From
Vikas Sharma
Date:
Hi Adrian,

This can be a good example: Application server e.g. tomcat having two entries to connect to databases, one for master and 2nd for Slave (ideally used when slave becomes master). If application is not able to connect to first, it will try to connect to 2nd.

Regards
Vikas

On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 06:50 AM, Vikas Sharma wrote:
Hi,

We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.

It might help to know how the two masters received data when they where operating independently.


Regards
Vikas Sharma


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgresql Split Brain: Which one is latest

From
Melvin Davidson
Date:


On Tue, Apr 10, 2018 at 11:04 AM, Vikas Sharma <shavikas@gmail.com> wrote:
Hi Adrian,

This can be a good example: Application server e.g. tomcat having two entries to connect to databases, one for master and 2nd for Slave (ideally used when slave becomes master). If application is not able to connect to first, it will try to connect to 2nd.

Regards
Vikas

On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 06:50 AM, Vikas Sharma wrote:
Hi,

We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.

It might help to know how the two masters received data when they where operating independently.


Regards
Vikas Sharma


--
Adrian Klaver
adrian.klaver@aklaver.com


Vikas,

Presuming the the real "master" will have additional records/rows inserted in the tables,
if you run ANALYZE on the database(s) in both "masters", then execute the following query
on both, whichever returns the highest count would be the real "master".

 SELECT sum(c.reltuples::bigint)
    FROM pg_stat_all_tables s
      JOIN pg_class c ON c.oid = s.relid
 WHERE s.relname NOT LIKE 'pg_%'
   AND s.relname NOT LIKE 'sql_%';



--
Melvin Davidson
Maj. Database & Exploration Specialist

Universe Exploration Command – UXC

Employment by invitation only!

Re: Postgresql Split Brain: Which one is latest

From
Edson Carlos Ericksson Richter
Date:
Em 10/04/2018 12:28, Melvin Davidson escreveu:
>
>
> On Tue, Apr 10, 2018 at 11:04 AM, Vikas Sharma <shavikas@gmail.com 
> <mailto:shavikas@gmail.com>> wrote:
>
>     Hi Adrian,
>
>     This can be a good example: Application server e.g. tomcat having
>     two entries to connect to databases, one for master and 2nd for
>     Slave (ideally used when slave becomes master). If application is
>     not able to connect to first, it will try to connect to 2nd.
>
>     Regards
>     Vikas
>
>     On 10 April 2018 at 15:26, Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>         On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>
>             Hi,
>
>             We have postgresql 9.5 with streaming
>             replication(Master-slave) and automatic failover. Due to
>             network glitch we are in master-master situation for quite
>             some time. Please, could you advise best way to confirm
>             which node is latest in terms of updates to the postgres
>             databases.
>
>
>         It might help to know how the two masters received data when
>         they where operating independently.
>
>
>             Regards
>             Vikas Sharma
>
>
>
>         -- 
>         Adrian Klaver
>         adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
>
> *Vikas,
>
> *
> *Presuming the the real "master" will have additional records/rows 
> inserted in the tables,
> *
> *if you run ANALYZE on the database(s) in both "masters", then execute 
> the following query
> *
> *on both, whichever returns the highest count would be the real "master".
>
>  SELECT sum(c.reltuples::bigint)
>     FROM pg_stat_all_tables s
>       JOIN pg_class c ON c.oid = s.relid
>  WHERE s.relname NOT LIKE 'pg_%'
>    AND s.relname NOT LIKE 'sql_%';*
>
>
> -- 
> *Melvin Davidson**
> Maj. Database & Exploration Specialist**
> Universe Exploration Command – UXC*
> Employment by invitation only!

I'm just trying to understand the scenario...

Correct me if I'm wrong, if you had two servers acting as master for a 
while, then you have inserted/updated records on both servers, and you 
will need some kind of "merge" of records into one of the databases, 
that will become the new updated master...

If you have "sequences" (or "serial" fields), then you will get a bit 
trouble in your hands.

Regards,

Edson



Re: Postgresql Split Brain: Which one is latest

From
Adrian Klaver
Date:
On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> Hi Adrian,
> 
> This can be a good example: Application server e.g. tomcat having two 
> entries to connect to databases, one for master and 2nd for Slave 
> (ideally used when slave becomes master). If application is not able to 
> connect to first, it will try to connect to 2nd.

So the application server had a way of seeing the new master(old slave), 
in spite of the network glitch, that the original master database did not?

If so and it was distributing data between the two masters on an unknown 
schedule, then as Edison pointed out in another post, you really have a 
split brain issue. Each master would have it's own view of the data and 
latest update would really only be relevant for that master.

> 
> Regards
> Vikas
> 
> On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> 
>         Hi,
> 
>         We have postgresql 9.5 with streaming replication(Master-slave)
>         and automatic failover. Due to network glitch we are in
>         master-master situation for quite some time. Please, could you
>         advise best way to confirm which node is latest in terms of
>         updates to the postgres databases.
> 
> 
>     It might help to know how the two masters received data when they
>     where operating independently.
> 
> 
>         Regards
>         Vikas Sharma
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Postgresql Split Brain: Which one is latest

From
Vikas Sharma
Date:
Thanks Adrian and Edison, I also think so. At the moment I have 2 masters, as soon as slave is promoted to master it starts its own timeline and application might have added data to either of them or both, only way to find out correct master now is the instance with max count of data in tables which could incur data loss as well. Correct me if wrong please?

Thanks and Regards
Vikas

On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> Hi Adrian,
>
> This can be a good example: Application server e.g. tomcat having two
> entries to connect to databases, one for master and 2nd for Slave
> (ideally used when slave becomes master). If application is not able to
> connect to first, it will try to connect to 2nd.

So the application server had a way of seeing the new master(old slave),
in spite of the network glitch, that the original master database did not?

If so and it was distributing data between the two masters on an unknown
schedule, then as Edison pointed out in another post, you really have a
split brain issue. Each master would have it's own view of the data and
latest update would really only be relevant for that master.

>
> Regards
> Vikas
>
> On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>
>         Hi,
>
>         We have postgresql 9.5 with streaming replication(Master-slave)
>         and automatic failover. Due to network glitch we are in
>         master-master situation for quite some time. Please, could you
>         advise best way to confirm which node is latest in terms of
>         updates to the postgres databases.
>
>
>     It might help to know how the two masters received data when they
>     where operating independently.
>
>
>         Regards
>         Vikas Sharma
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgresql Split Brain: Which one is latest

From
Ron
Date:
You need to find out when the split happened, and whether each new master have records since then.

On 04/10/2018 11:47 AM, Vikas Sharma wrote:
Thanks Adrian and Edison, I also think so. At the moment I have 2 masters, as soon as slave is promoted to master it starts its own timeline and application might have added data to either of them or both, only way to find out correct master now is the instance with max count of data in tables which could incur data loss as well. Correct me if wrong please?

Thanks and Regards
Vikas

On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> Hi Adrian,
>
> This can be a good example: Application server e.g. tomcat having two
> entries to connect to databases, one for master and 2nd for Slave
> (ideally used when slave becomes master). If application is not able to
> connect to first, it will try to connect to 2nd.

So the application server had a way of seeing the new master(old slave),
in spite of the network glitch, that the original master database did not?

If so and it was distributing data between the two masters on an unknown
schedule, then as Edison pointed out in another post, you really have a
split brain issue. Each master would have it's own view of the data and
latest update would really only be relevant for that master.

>
> Regards
> Vikas
>
> On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>
>         Hi,
>
>         We have postgresql 9.5 with streaming replication(Master-slave)
>         and automatic failover. Due to network glitch we are in
>         master-master situation for quite some time. Please, could you
>         advise best way to confirm which node is latest in terms of
>         updates to the postgres databases.
>
>
>     It might help to know how the two masters received data when they
>     where operating independently.
>
>
>         Regards
>         Vikas Sharma
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

--
Angular momentum makes the world go 'round.

Re: Postgresql Split Brain: Which one is latest

From
Adrian Klaver
Date:
On 04/10/2018 09:47 AM, Vikas Sharma wrote:
> Thanks Adrian and Edison, I also think so. At the moment I have 2 
> masters, as soon as slave is promoted to master it starts its own 
> timeline and application might have added data to either of them or 
> both, only way to find out correct master now is the instance with max 
> count of data in tables which could incur data loss as well. Correct me 
> if wrong please?

Not sure max count is necessarily a valid indicator:

1) What if there was a legitimate large delete process?

2) The application/end users where looking at two different views of the 
data at different points in time. Just because the count is higher does 
not mean the data is actually valid.

> 
> Thanks and Regards
> Vikas
> 
> On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 04/10/2018 08:04 AM, Vikas Sharma wrote:
>      > Hi Adrian,
>      >
>      > This can be a good example: Application server e.g. tomcat having two
>      > entries to connect to databases, one for master and 2nd for Slave
>      > (ideally used when slave becomes master). If application is not
>     able to
>      > connect to first, it will try to connect to 2nd.
> 
>     So the application server had a way of seeing the new master(old slave),
>     in spite of the network glitch, that the original master database
>     did not?
> 
>     If so and it was distributing data between the two masters on an unknown
>     schedule, then as Edison pointed out in another post, you really have a
>     split brain issue. Each master would have it's own view of the data and
>     latest update would really only be relevant for that master.
> 
>      >
>      > Regards
>      > Vikas
>      >
>      > On 10 April 2018 at 15:26, Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>      > <mailto:adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>> wrote:
>      >
>      >     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>      >
>      >         Hi,
>      >
>      >         We have postgresql 9.5 with streaming
>     replication(Master-slave)
>      >         and automatic failover. Due to network glitch we are in
>      >         master-master situation for quite some time. Please,
>     could you
>      >         advise best way to confirm which node is latest in terms of
>      >         updates to the postgres databases.
>      >
>      >
>      >     It might help to know how the two masters received data when they
>      >     where operating independently.
>      >
>      >
>      >         Regards
>      >         Vikas Sharma
>      >
>      >
>      >
>      >     --
>      >     Adrian Klaver
>      > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>     <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>>
>      >
>      >
> 
> 
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Postgresql Split Brain: Which one is latest

From
Vikas Sharma
Date:
Max count is one way (vague I agree), before confirming I will ask the application owner to have a look on data in tables as well. 

Regards


On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 09:47 AM, Vikas Sharma wrote:
> Thanks Adrian and Edison, I also think so. At the moment I have 2
> masters, as soon as slave is promoted to master it starts its own
> timeline and application might have added data to either of them or
> both, only way to find out correct master now is the instance with max
> count of data in tables which could incur data loss as well. Correct me
> if wrong please?

Not sure max count is necessarily a valid indicator:

1) What if there was a legitimate large delete process?

2) The application/end users where looking at two different views of the
data at different points in time. Just because the count is higher does
not mean the data is actually valid.

>
> Thanks and Regards
> Vikas
>
> On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 04/10/2018 08:04 AM, Vikas Sharma wrote:
>      > Hi Adrian,
>      >
>      > This can be a good example: Application server e.g. tomcat having two
>      > entries to connect to databases, one for master and 2nd for Slave
>      > (ideally used when slave becomes master). If application is not
>     able to
>      > connect to first, it will try to connect to 2nd.
>
>     So the application server had a way of seeing the new master(old slave),
>     in spite of the network glitch, that the original master database
>     did not?
>
>     If so and it was distributing data between the two masters on an unknown
>     schedule, then as Edison pointed out in another post, you really have a
>     split brain issue. Each master would have it's own view of the data and
>     latest update would really only be relevant for that master.
>
>      >
>      > Regards
>      > Vikas
>      >
>      > On 10 April 2018 at 15:26, Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>      > <mailto:adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>> wrote:
>      >
>      >     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>      >
>      >         Hi,
>      >
>      >         We have postgresql 9.5 with streaming
>     replication(Master-slave)
>      >         and automatic failover. Due to network glitch we are in
>      >         master-master situation for quite some time. Please,
>     could you
>      >         advise best way to confirm which node is latest in terms of
>      >         updates to the postgres databases.
>      >
>      >
>      >     It might help to know how the two masters received data when they
>      >     where operating independently.
>      >
>      >
>      >         Regards
>      >         Vikas Sharma
>      >
>      >
>      >
>      >     --
>      >     Adrian Klaver
>      > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>     <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>>
>      >
>      >
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgresql Split Brain: Which one is latest

From
Adrian Klaver
Date:
On 04/10/2018 10:02 AM, Vikas Sharma wrote:
> Max count is one way (vague I agree), before confirming I will ask the 
> application owner to have a look on data in tables as well.

Along that line and dependent on the size of the database and individual 
tables you might try:

1) Do a plain text dump of the data from the same table from each master.

2) Diff the data dumps.

> 
> Regards
> 
> 



-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Postgresql Split Brain: Which one is latest

From
"Jehan-Guillaume (ioguix) de Rorthais"
Date:
On Tue, 10 Apr 2018 17:02:39 +0000
Vikas Sharma <shavikas@gmail.com> wrote:

> Max count is one way (vague I agree), before confirming I will ask the
> application owner to have a look on data in tables as well.

Maybe you could compare your tables on both sides using a tool like
pg_comparator? See:

  https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html

By the way, what are you using for your auto-failover? What went wrong to
end-up with a split brain situation?

Regards,

> On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> 
> > On 04/10/2018 09:47 AM, Vikas Sharma wrote:  
> > > Thanks Adrian and Edison, I also think so. At the moment I have 2
> > > masters, as soon as slave is promoted to master it starts its own
> > > timeline and application might have added data to either of them or
> > > both, only way to find out correct master now is the instance with max
> > > count of data in tables which could incur data loss as well. Correct me
> > > if wrong please?  
> >
> > Not sure max count is necessarily a valid indicator:
> >
> > 1) What if there was a legitimate large delete process?
> >
> > 2) The application/end users where looking at two different views of the
> > data at different points in time. Just because the count is higher does
> > not mean the data is actually valid.
> >  
> > >
> > > Thanks and Regards
> > > Vikas
> > >
> > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com
> > > <mailto:adrian.klaver@aklaver.com>> wrote:
> > >
> > >     On 04/10/2018 08:04 AM, Vikas Sharma wrote:  
> > >      > Hi Adrian,
> > >      >
> > >      > This can be a good example: Application server e.g. tomcat having  
> > two  
> > >      > entries to connect to databases, one for master and 2nd for Slave
> > >      > (ideally used when slave becomes master). If application is not  
> > >     able to  
> > >      > connect to first, it will try to connect to 2nd.  
> > >
> > >     So the application server had a way of seeing the new master(old  
> > slave),  
> > >     in spite of the network glitch, that the original master database
> > >     did not?
> > >
> > >     If so and it was distributing data between the two masters on an  
> > unknown  
> > >     schedule, then as Edison pointed out in another post, you really  
> > have a  
> > >     split brain issue. Each master would have it's own view of the data  
> > and  
> > >     latest update would really only be relevant for that master.
> > >  
> > >      >
> > >      > Regards
> > >      > Vikas
> > >      >
> > >      > On 10 April 2018 at 15:26, Adrian Klaver  
> > >     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>  
> > >      > <mailto:adrian.klaver@aklaver.com  
> > >     <mailto:adrian.klaver@aklaver.com>>> wrote:  
> > >      >
> > >      >     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> > >      >
> > >      >         Hi,
> > >      >
> > >      >         We have postgresql 9.5 with streaming  
> > >     replication(Master-slave)  
> > >      >         and automatic failover. Due to network glitch we are in
> > >      >         master-master situation for quite some time. Please,  
> > >     could you  
> > >      >         advise best way to confirm which node is latest in terms  
> > of  
> > >      >         updates to the postgres databases.
> > >      >
> > >      >
> > >      >     It might help to know how the two masters received data when  
> > they  
> > >      >     where operating independently.
> > >      >
> > >      >
> > >      >         Regards
> > >      >         Vikas Sharma
> > >      >
> > >      >
> > >      >
> > >      >     --
> > >      >     Adrian Klaver
> > >      > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>  
> > >     <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com  
> > >>
> > >      >
> > >      >  
> > >
> > >
> > >     --
> > >     Adrian Klaver
> > >     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> > >  
> >
> >
> > --
> > Adrian Klaver
> > adrian.klaver@aklaver.com
> >  



-- 
Jehan-Guillaume de Rorthais
Dalibo