Thread: Postgresql Split Brain: Which one is latest
Hi,
We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.
Regards
Vikas Sharma
On 10/04/2018 16:50, Vikas Sharma wrote: > Hi, > > We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are inmaster-master situation for quite some time. Please, could you advise best way > to confirm which node is latest in terms of updates to the postgres databases. The one with the latest timeline. > > Regards > Vikas Sharma -- Achilleas Mantzios IT DEV Lead IT DEPT Dynacom Tankers Mgmt
On 04/10/2018 06:50 AM, Vikas Sharma wrote: > Hi, > > We have postgresql 9.5 with streaming replication(Master-slave) and > automatic failover. Due to network glitch we are in master-master > situation for quite some time. Please, could you advise best way to > confirm which node is latest in terms of updates to the postgres databases. It might help to know how the two masters received data when they where operating independently. > > Regards > Vikas Sharma -- Adrian Klaver adrian.klaver@aklaver.com
Hi Adrian,
This can be a good example: Application server e.g. tomcat having two entries to connect to databases, one for master and 2nd for Slave (ideally used when slave becomes master). If application is not able to connect to first, it will try to connect to 2nd.
Regards
Vikas
On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 06:50 AM, Vikas Sharma wrote:Hi,
We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.
It might help to know how the two masters received data when they where operating independently.
Regards
Vikas Sharma
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Apr 10, 2018 at 11:04 AM, Vikas Sharma <shavikas@gmail.com> wrote:
Hi Adrian,This can be a good example: Application server e.g. tomcat having two entries to connect to databases, one for master and 2nd for Slave (ideally used when slave becomes master). If application is not able to connect to first, it will try to connect to 2nd.RegardsVikasOn 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com> wrote:On 04/10/2018 06:50 AM, Vikas Sharma wrote:Hi,
We have postgresql 9.5 with streaming replication(Master-slave) and automatic failover. Due to network glitch we are in master-master situation for quite some time. Please, could you advise best way to confirm which node is latest in terms of updates to the postgres databases.
It might help to know how the two masters received data when they where operating independently.
Regards
Vikas Sharma
--
Adrian Klaver
adrian.klaver@aklaver.com
Vikas,
Presuming the the real "master" will have additional records/rows inserted in the tables,
if you run ANALYZE on the database(s) in both "masters", then execute the following query
on both, whichever returns the highest count would be the real "master".
SELECT sum(c.reltuples::bigint)
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname NOT LIKE 'pg_%'
AND s.relname NOT LIKE 'sql_%';
--
SELECT sum(c.reltuples::bigint)
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname NOT LIKE 'pg_%'
AND s.relname NOT LIKE 'sql_%';
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
Em 10/04/2018 12:28, Melvin Davidson escreveu: > > > On Tue, Apr 10, 2018 at 11:04 AM, Vikas Sharma <shavikas@gmail.com > <mailto:shavikas@gmail.com>> wrote: > > Hi Adrian, > > This can be a good example: Application server e.g. tomcat having > two entries to connect to databases, one for master and 2nd for > Slave (ideally used when slave becomes master). If application is > not able to connect to first, it will try to connect to 2nd. > > Regards > Vikas > > On 10 April 2018 at 15:26, Adrian Klaver > <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote: > > On 04/10/2018 06:50 AM, Vikas Sharma wrote: > > Hi, > > We have postgresql 9.5 with streaming > replication(Master-slave) and automatic failover. Due to > network glitch we are in master-master situation for quite > some time. Please, could you advise best way to confirm > which node is latest in terms of updates to the postgres > databases. > > > It might help to know how the two masters received data when > they where operating independently. > > > Regards > Vikas Sharma > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > > > *Vikas, > > * > *Presuming the the real "master" will have additional records/rows > inserted in the tables, > * > *if you run ANALYZE on the database(s) in both "masters", then execute > the following query > * > *on both, whichever returns the highest count would be the real "master". > > SELECT sum(c.reltuples::bigint) > FROM pg_stat_all_tables s > JOIN pg_class c ON c.oid = s.relid > WHERE s.relname NOT LIKE 'pg_%' > AND s.relname NOT LIKE 'sql_%';* > > > -- > *Melvin Davidson** > Maj. Database & Exploration Specialist** > Universe Exploration Command – UXC* > Employment by invitation only! I'm just trying to understand the scenario... Correct me if I'm wrong, if you had two servers acting as master for a while, then you have inserted/updated records on both servers, and you will need some kind of "merge" of records into one of the databases, that will become the new updated master... If you have "sequences" (or "serial" fields), then you will get a bit trouble in your hands. Regards, Edson
On 04/10/2018 08:04 AM, Vikas Sharma wrote: > Hi Adrian, > > This can be a good example: Application server e.g. tomcat having two > entries to connect to databases, one for master and 2nd for Slave > (ideally used when slave becomes master). If application is not able to > connect to first, it will try to connect to 2nd. So the application server had a way of seeing the new master(old slave), in spite of the network glitch, that the original master database did not? If so and it was distributing data between the two masters on an unknown schedule, then as Edison pointed out in another post, you really have a split brain issue. Each master would have it's own view of the data and latest update would really only be relevant for that master. > > Regards > Vikas > > On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com > <mailto:adrian.klaver@aklaver.com>> wrote: > > On 04/10/2018 06:50 AM, Vikas Sharma wrote: > > Hi, > > We have postgresql 9.5 with streaming replication(Master-slave) > and automatic failover. Due to network glitch we are in > master-master situation for quite some time. Please, could you > advise best way to confirm which node is latest in terms of > updates to the postgres databases. > > > It might help to know how the two masters received data when they > where operating independently. > > > Regards > Vikas Sharma > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > -- Adrian Klaver adrian.klaver@aklaver.com
Thanks Adrian and Edison, I also think so. At the moment I have 2 masters, as soon as slave is promoted to master it starts its own timeline and application might have added data to either of them or both, only way to find out correct master now is the instance with max count of data in tables which could incur data loss as well. Correct me if wrong please?
Thanks and Regards
Vikas
On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> Hi Adrian,
>
> This can be a good example: Application server e.g. tomcat having two
> entries to connect to databases, one for master and 2nd for Slave
> (ideally used when slave becomes master). If application is not able to
> connect to first, it will try to connect to 2nd.
So the application server had a way of seeing the new master(old slave),
in spite of the network glitch, that the original master database did not?
If so and it was distributing data between the two masters on an unknown
schedule, then as Edison pointed out in another post, you really have a
split brain issue. Each master would have it's own view of the data and
latest update would really only be relevant for that master.
>
> Regards
> Vikas
>
> On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
> On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>
> Hi,
>
> We have postgresql 9.5 with streaming replication(Master-slave)
> and automatic failover. Due to network glitch we are in
> master-master situation for quite some time. Please, could you
> advise best way to confirm which node is latest in terms of
> updates to the postgres databases.
>
>
> It might help to know how the two masters received data when they
> where operating independently.
>
>
> Regards
> Vikas Sharma
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
You need to find out when the split happened, and whether each new master have records since then.
On 04/10/2018 11:47 AM, Vikas Sharma wrote:
Thanks Adrian and Edison, I also think so. At the moment I have 2 masters, as soon as slave is promoted to master it starts its own timeline and application might have added data to either of them or both, only way to find out correct master now is the instance with max count of data in tables which could incur data loss as well. Correct me if wrong please?Thanks and RegardsVikasOn Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com> wrote:On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> Hi Adrian,
>
> This can be a good example: Application server e.g. tomcat having two
> entries to connect to databases, one for master and 2nd for Slave
> (ideally used when slave becomes master). If application is not able to
> connect to first, it will try to connect to 2nd.
So the application server had a way of seeing the new master(old slave),
in spite of the network glitch, that the original master database did not?
If so and it was distributing data between the two masters on an unknown
schedule, then as Edison pointed out in another post, you really have a
split brain issue. Each master would have it's own view of the data and
latest update would really only be relevant for that master.
>
> Regards
> Vikas
>
> On 10 April 2018 at 15:26, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
> On 04/10/2018 06:50 AM, Vikas Sharma wrote:
>
> Hi,
>
> We have postgresql 9.5 with streaming replication(Master-slave)
> and automatic failover. Due to network glitch we are in
> master-master situation for quite some time. Please, could you
> advise best way to confirm which node is latest in terms of
> updates to the postgres databases.
>
>
> It might help to know how the two masters received data when they
> where operating independently.
>
>
> Regards
> Vikas Sharma
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
On 04/10/2018 09:47 AM, Vikas Sharma wrote: > Thanks Adrian and Edison, I also think so. At the moment I have 2 > masters, as soon as slave is promoted to master it starts its own > timeline and application might have added data to either of them or > both, only way to find out correct master now is the instance with max > count of data in tables which could incur data loss as well. Correct me > if wrong please? Not sure max count is necessarily a valid indicator: 1) What if there was a legitimate large delete process? 2) The application/end users where looking at two different views of the data at different points in time. Just because the count is higher does not mean the data is actually valid. > > Thanks and Regards > Vikas > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com > <mailto:adrian.klaver@aklaver.com>> wrote: > > On 04/10/2018 08:04 AM, Vikas Sharma wrote: > > Hi Adrian, > > > > This can be a good example: Application server e.g. tomcat having two > > entries to connect to databases, one for master and 2nd for Slave > > (ideally used when slave becomes master). If application is not > able to > > connect to first, it will try to connect to 2nd. > > So the application server had a way of seeing the new master(old slave), > in spite of the network glitch, that the original master database > did not? > > If so and it was distributing data between the two masters on an unknown > schedule, then as Edison pointed out in another post, you really have a > split brain issue. Each master would have it's own view of the data and > latest update would really only be relevant for that master. > > > > > Regards > > Vikas > > > > On 10 April 2018 at 15:26, Adrian Klaver > <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > <mailto:adrian.klaver@aklaver.com > <mailto:adrian.klaver@aklaver.com>>> wrote: > > > > On 04/10/2018 06:50 AM, Vikas Sharma wrote: > > > > Hi, > > > > We have postgresql 9.5 with streaming > replication(Master-slave) > > and automatic failover. Due to network glitch we are in > > master-master situation for quite some time. Please, > could you > > advise best way to confirm which node is latest in terms of > > updates to the postgres databases. > > > > > > It might help to know how the two masters received data when they > > where operating independently. > > > > > > Regards > > Vikas Sharma > > > > > > > > -- > > Adrian Klaver > > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> > > > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > -- Adrian Klaver adrian.klaver@aklaver.com
Max count is one way (vague I agree), before confirming I will ask the application owner to have a look on data in tables as well.
Regards
On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/10/2018 09:47 AM, Vikas Sharma wrote:
> Thanks Adrian and Edison, I also think so. At the moment I have 2
> masters, as soon as slave is promoted to master it starts its own
> timeline and application might have added data to either of them or
> both, only way to find out correct master now is the instance with max
> count of data in tables which could incur data loss as well. Correct me
> if wrong please?
Not sure max count is necessarily a valid indicator:
1) What if there was a legitimate large delete process?
2) The application/end users where looking at two different views of the
data at different points in time. Just because the count is higher does
not mean the data is actually valid.
>
> Thanks and Regards
> Vikas
>
> On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
> On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> > Hi Adrian,
> >
> > This can be a good example: Application server e.g. tomcat having two
> > entries to connect to databases, one for master and 2nd for Slave
> > (ideally used when slave becomes master). If application is not
> able to
> > connect to first, it will try to connect to 2nd.
>
> So the application server had a way of seeing the new master(old slave),
> in spite of the network glitch, that the original master database
> did not?
>
> If so and it was distributing data between the two masters on an unknown
> schedule, then as Edison pointed out in another post, you really have a
> split brain issue. Each master would have it's own view of the data and
> latest update would really only be relevant for that master.
>
> >
> > Regards
> > Vikas
> >
> > On 10 April 2018 at 15:26, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> > <mailto:adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>>> wrote:
> >
> > On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> >
> > Hi,
> >
> > We have postgresql 9.5 with streaming
> replication(Master-slave)
> > and automatic failover. Due to network glitch we are in
> > master-master situation for quite some time. Please,
> could you
> > advise best way to confirm which node is latest in terms of
> > updates to the postgres databases.
> >
> >
> > It might help to know how the two masters received data when they
> > where operating independently.
> >
> >
> > Regards
> > Vikas Sharma
> >
> >
> >
> > --
> > Adrian Klaver
> > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>>
> >
> >
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
On 04/10/2018 10:02 AM, Vikas Sharma wrote: > Max count is one way (vague I agree), before confirming I will ask the > application owner to have a look on data in tables as well. Along that line and dependent on the size of the database and individual tables you might try: 1) Do a plain text dump of the data from the same table from each master. 2) Diff the data dumps. > > Regards > > -- Adrian Klaver adrian.klaver@aklaver.com
On Tue, 10 Apr 2018 17:02:39 +0000 Vikas Sharma <shavikas@gmail.com> wrote: > Max count is one way (vague I agree), before confirming I will ask the > application owner to have a look on data in tables as well. Maybe you could compare your tables on both sides using a tool like pg_comparator? See: https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html By the way, what are you using for your auto-failover? What went wrong to end-up with a split brain situation? Regards, > On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > > On 04/10/2018 09:47 AM, Vikas Sharma wrote: > > > Thanks Adrian and Edison, I also think so. At the moment I have 2 > > > masters, as soon as slave is promoted to master it starts its own > > > timeline and application might have added data to either of them or > > > both, only way to find out correct master now is the instance with max > > > count of data in tables which could incur data loss as well. Correct me > > > if wrong please? > > > > Not sure max count is necessarily a valid indicator: > > > > 1) What if there was a legitimate large delete process? > > > > 2) The application/end users where looking at two different views of the > > data at different points in time. Just because the count is higher does > > not mean the data is actually valid. > > > > > > > > Thanks and Regards > > > Vikas > > > > > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian.klaver@aklaver.com > > > <mailto:adrian.klaver@aklaver.com>> wrote: > > > > > > On 04/10/2018 08:04 AM, Vikas Sharma wrote: > > > > Hi Adrian, > > > > > > > > This can be a good example: Application server e.g. tomcat having > > two > > > > entries to connect to databases, one for master and 2nd for Slave > > > > (ideally used when slave becomes master). If application is not > > > able to > > > > connect to first, it will try to connect to 2nd. > > > > > > So the application server had a way of seeing the new master(old > > slave), > > > in spite of the network glitch, that the original master database > > > did not? > > > > > > If so and it was distributing data between the two masters on an > > unknown > > > schedule, then as Edison pointed out in another post, you really > > have a > > > split brain issue. Each master would have it's own view of the data > > and > > > latest update would really only be relevant for that master. > > > > > > > > > > > Regards > > > > Vikas > > > > > > > > On 10 April 2018 at 15:26, Adrian Klaver > > > <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > > > <mailto:adrian.klaver@aklaver.com > > > <mailto:adrian.klaver@aklaver.com>>> wrote: > > > > > > > > On 04/10/2018 06:50 AM, Vikas Sharma wrote: > > > > > > > > Hi, > > > > > > > > We have postgresql 9.5 with streaming > > > replication(Master-slave) > > > > and automatic failover. Due to network glitch we are in > > > > master-master situation for quite some time. Please, > > > could you > > > > advise best way to confirm which node is latest in terms > > of > > > > updates to the postgres databases. > > > > > > > > > > > > It might help to know how the two masters received data when > > they > > > > where operating independently. > > > > > > > > > > > > Regards > > > > Vikas Sharma > > > > > > > > > > > > > > > > -- > > > > Adrian Klaver > > > > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > > <mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com > > >> > > > > > > > > > > > > > > > > > -- > > > Adrian Klaver > > > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > > > > > > > > -- > > Adrian Klaver > > adrian.klaver@aklaver.com > > -- Jehan-Guillaume de Rorthais Dalibo