Re: Postgresql Split Brain: Which one is latest - Mailing list pgsql-general

From Jehan-Guillaume (ioguix) de Rorthais
Subject Re: Postgresql Split Brain: Which one is latest
Msg-id 20180410211930.10fa058f@firost
In response to Re: Postgresql Split Brain: Which one is latest  (Vikas Sharma <shavikas@gmail.com>)
List pgsql-general
On Tue, 10 Apr 2018 17:02:39 +0000
Vikas Sharma <shavikas@gmail.com> wrote:

> Max count is one way (vague, I agree); before confirming I will ask the
> application owner to have a look at the data in the tables as well.

Maybe you could compare your tables on both sides using a tool like
pg_comparator? See:

  https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html
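
To give a rough idea of what such a comparison involves, here is a minimal
Python sketch. It is not pg_comparator's actual algorithm, only a simplified
per-row checksum comparison keyed on the primary key. It assumes the rows of
one table have already been fetched from each node as lists of dicts; the
names diff_tables, row_digest, node_a and node_b, and the sample rows, are
hypothetical.

```python
import hashlib


def row_digest(row, key):
    """Checksum of the non-key columns of one row, in sorted column order."""
    payload = "|".join(
        f"{col}={row[col]!r}" for col in sorted(row) if col != key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_tables(rows_a, rows_b, key="id"):
    """Compare two copies of a table by primary key and row checksum.

    rows_a / rows_b: lists of dicts, one dict per row (assumed to be already
    fetched from each node). Returns the keys present on one side only and
    the keys whose non-key columns differ.
    """
    sums_a = {r[key]: row_digest(r, key) for r in rows_a}
    sums_b = {r[key]: row_digest(r, key) for r in rows_b}
    shared = sums_a.keys() & sums_b.keys()
    return {
        "only_a": sorted(sums_a.keys() - sums_b.keys()),
        "only_b": sorted(sums_b.keys() - sums_a.keys()),
        "changed": sorted(k for k in shared if sums_a[k] != sums_b[k]),
    }


# Hypothetical sample data standing in for the same table on both nodes.
node_a = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]
node_b = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "robert"},
    {"id": 4, "name": "dave"},
]

print(diff_tables(node_a, node_b))
```

Note that a report like this only tells you where the two sides diverged,
not which side is "right": rows found on only one node, or changed on both,
still have to be reviewed with the application owner.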

By the way, what are you using for your auto-failover? What went wrong for
you to end up in a split-brain situation?

Regards,

> On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> 
> > On 04/10/2018 09:47 AM, Vikas Sharma wrote:  
> > > > Thanks Adrian and Edison, I also think so. At the moment I have 2
> > > > masters. As soon as the slave is promoted to master it starts its own
> > > > timeline, and the application might have added data to either of them
> > > > or both. The only way to find the correct master now is to pick the
> > > > instance with the max count of data in tables, which could incur data
> > > > loss as well. Correct me if wrong, please?
> >
> > Not sure max count is necessarily a valid indicator:
> >
> > 1) What if there was a legitimate large delete process?
> >
> > 2) The application/end users were looking at two different views of the
> > data at different points in time. Just because the count is higher does
> > not mean the data is actually valid.
> >  
> > >
> > > Thanks and Regards
> > > Vikas
> > >
> > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> > >
> > >     On 04/10/2018 08:04 AM, Vikas Sharma wrote:  
> > >      > Hi Adrian,
> > >      >
> > >      > This can be a good example: Application server e.g. tomcat
> > >      > having two entries to connect to databases, one for master and
> > >      > 2nd for Slave (ideally used when slave becomes master). If
> > >      > application is not able to connect to first, it will try to
> > >      > connect to 2nd.
> > >
> > >     So the application server had a way of seeing the new master (old
> > >     slave), in spite of the network glitch, that the original master
> > >     database did not?
> > >
> > >     If so and it was distributing data between the two masters on an
> > >     unknown schedule, then as Edison pointed out in another post, you
> > >     really have a split brain issue. Each master would have its own
> > >     view of the data and latest update would really only be relevant
> > >     for that master.
> > >  
> > >      >
> > >      > Regards
> > >      > Vikas
> > >      >
> > >      > On 10 April 2018 at 15:26, Adrian Klaver
> > >      > <adrian.klaver@aklaver.com> wrote:
> > >      >
> > >      >     On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> > >      >
> > >      >         Hi,
> > >      >
> > >      >         We have postgresql 9.5 with streaming replication
> > >      >         (Master-slave) and automatic failover. Due to network
> > >      >         glitch we are in master-master situation for quite some
> > >      >         time. Please, could you advise best way to confirm which
> > >      >         node is latest in terms of updates to the postgres
> > >      >         databases.
> > >      >
> > >      >
> > >      >     It might help to know how the two masters received data when
> > >      >     they were operating independently.
> > >      >
> > >      >
> > >      >         Regards
> > >      >         Vikas Sharma
> > >      >
> > >      >
> > >      >
> > >      >     --
> > >      >     Adrian Klaver
> > >      >     adrian.klaver@aklaver.com
> > >      >
> > >      >  
> > >
> > >
> > >     --
> > >     Adrian Klaver
> > >     adrian.klaver@aklaver.com
> > >  
> >
> >
> > --
> > Adrian Klaver
> > adrian.klaver@aklaver.com
> >  



-- 
Jehan-Guillaume de Rorthais
Dalibo

