Infinite replication loop with pglogical - Mailing list pgsql-general

From Peter J. Holzer
Subject Infinite replication loop with pglogical
Date
Msg-id 20171207134131.grhh3h6udazmavb3@hjp.at
Whole thread Raw
Responses Re: Infinite replication loop with pglogical
List pgsql-general
[I'm not sure whether this is the right list to discuss third party
extensions like 2ndQuadrant's pglogical. If there is a better place,
feel free to direct me to it]

We are testing bidirectional replication with pglogical:

Node 1:
    postgresql-9.6           9.6.6-1.pgdg90+1
    postgresql-9.6-pglogical 2.1.0-1.jessie+1
Node 2:
    postgresql-10            10.1-1.pgdg90+1
    postgresql-10-pglogical  2.1.0-1.stretch+1

(They are on different versions because we wanted to test a staggered
upgrade)

We have configured bidirectional replication on several tables. In
general that seems to work fine. When a row is inserted/updated/
deleted on one node, it is replicated to the other.

But now I seem to have triggered a replication loop:

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          | start_ts | update_ts | finished_ts | module_id | status_id |
status_msg| pid | username  

----------+-----------+----------------------------+----------+-----------+-------------+-----------+-----------+------------+-----+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 |          |           |             |        80 |         0 |
    |   0 | hjp 
(1 row)

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          |          start_ts          |         update_ts          |
finished_ts| module_id | status_id | status_msg | pid  | username  

----------+-----------+----------------------------+----------------------------+----------------------------+-------------+-----------+-----------+------------+------+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 | 2017-12-07 11:39:09.914616 | 2017-12-07 11:39:09.914616 |
  |        80 |         0 |            | 6863 | hjp 
(1 row)

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          |          start_ts          |         update_ts         |
finished_ts       | module_id | status_id | status_msg | pid | username  

----------+-----------+----------------------------+----------------------------+---------------------------+---------------------------+-----------+-----------+------------+-----+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 | 2017-12-07 11:39:09.914616 | 2017-12-07 11:39:09.95925 |
2017-12-0711:39:09.95925 |        80 |         0 |            |  -1 | hjp 
(1 row)

Subsequent selects randomly return one of three states for this row
(other rows are stable). The order above is the "logical order", i.e.
the row was first inserted with the first state, then updated to the
second, and finally to the third.

I suspect that the updates are bounced between the two nodes eternally
overwriting each other and never reaching a stable state.

Has anybody seen this? If so, is there a way to reliably avoid this?
(Maybe with a different pglogical.conflict_resolution setting, but if it
was a conflict, I should see something in the logs, right?)

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

pgsql-general by date:

Previous
From: Ben Leslie
Date:
Subject: Enforcing adjacent ranges
Next
From: "Peter J. Holzer"
Date:
Subject: Re: Infinite replication loop with pglogical