Re: Logical WAL streaming & START_REPLICATION - Mailing list pgsql-jdbc

From Craig Ringer
Subject Re: Logical WAL streaming & START_REPLICATION
Date
Msg-id CAMsr+YFwdpLqsBuV39SjGwmk2ES87ZHdczyB-5hbbqOZHctQ-w@mail.gmail.com
In response to Re: Logical WAL streaming & START_REPLICATION  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Logical WAL streaming & START_REPLICATION  (Dave Cramer <pg@fastcrypt.com>)
List pgsql-jdbc
On 9 March 2018 at 09:50, Craig Ringer <craig@2ndquadrant.com> wrote:
 
> If your system will forget work on crash, it's not flushed, and you shouldn't report it flushed.


I haven't checked whether PgJDBC actually exposes separate control of the reported flush position, though. If it doesn't, it really must, because replication slots cannot work properly without it.

How this should work is:

- You receive a txn, and PgJDBC sends feedback updating the received position but NOT the flush position.

- You send that txn's changes on to wherever they're going and tag them with the txn's commit LSN.

- When the recipient of the changes confirms that it has stored them persistently and crash-safely, report that txn's commit LSN to PgJDBC so it can update the flush position sent in feedback on the replication connection.
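
To make the three steps above concrete, here is a minimal sketch of the bookkeeping on the client side. It is not PgJDBC code: `FlushTracker` and its method names are hypothetical, LSNs are plain `long` values, and it assumes txns arrive in commit order. The only point is that the flush position advances solely when the downstream confirms, and never past a txn that is still unconfirmed. I believe PgJDBC's `PGReplicationStream` has a `setFlushedLSN` method that the resulting value could be handed to, but check the driver documentation before relying on that.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper, not part of PgJDBC. Tracks which received txns the
 * downstream has durably stored and derives the flush position that is
 * safe to report to the server.
 */
class FlushTracker {
    // Commit LSNs of received txns not yet covered by the flush position,
    // oldest first (assumes txns are received in commit order).
    private final Deque<Long> pending = new ArrayDeque<>();
    // Commit LSNs confirmed by the downstream but still behind an
    // older, unconfirmed txn.
    private final Set<Long> confirmed = new HashSet<>();
    private long receivedLsn = 0L;
    private long flushedLsn = 0L;

    /** Step 1: a txn was received; only the received position moves. */
    void onReceived(long commitLsn) {
        pending.addLast(commitLsn);
        receivedLsn = commitLsn;
    }

    /** Step 3: the downstream reports this txn as durably stored. */
    void onDownstreamConfirmed(long commitLsn) {
        if (!pending.contains(commitLsn)) {
            return; // unknown or already flushed; nothing to do
        }
        confirmed.add(commitLsn);
        // Advance across the contiguous run of confirmed txns only.
        while (!pending.isEmpty() && confirmed.remove(pending.peekFirst())) {
            flushedLsn = pending.removeFirst();
        }
    }

    long receivedLsn() { return receivedLsn; }

    /** The value that is safe to report as flushed in feedback. */
    long flushedLsn() { return flushedLsn; }

    public static void main(String[] args) {
        FlushTracker t = new FlushTracker();
        t.onReceived(100L);
        t.onReceived(200L);
        t.onDownstreamConfirmed(200L); // 100 is still unconfirmed
        System.out.println(t.flushedLsn()); // prints 0
        t.onDownstreamConfirmed(100L);
        System.out.println(t.flushedLsn()); // prints 200
    }
}
```

If the downstream always confirms in order, the set is unnecessary and the tracker reduces to "flushed = last confirmed commit LSN". The out-of-order handling is only there in case confirmations come back asynchronously.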

If the upstream crashes, it will be able to restart from the last confirmed position. Because you only confirmed what was safely stored, you won't need anything older than that.

If the downstream crashes, you must either tolerate receiving duplicate transactions, or ensure that flushing to persistent storage on the downstream also atomically records the latest flushed upstream LSN. For example, you might put the LSN in your Kafka messages and, in crash recovery, find the highest LSN you successfully stored in Kafka. Then you send *that* LSN when starting replication. This cannot make replication go backwards to a position older than the one you previously confirmed to the server, but it *can* make the server skip over txns that you stored locally but had not yet confirmed to it.
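
As a rough illustration of that recovery step, the sketch below picks the restart LSN from whatever LSNs were stored downstream, and also shows a duplicate check for the "tolerate duplicates" option. `RecoveryStart` and its methods are hypothetical names, and how you read the stored LSNs back (from Kafka or anywhere else) is left out. The only PostgreSQL fact it relies on is the textual LSN form: two hexadecimal numbers separated by a slash, high 32 bits first. In PgJDBC I'd expect the result to be passed as the start position when opening the replication stream (I believe the stream builder has `withStartPosition`), but treat that as something to verify.

```java
import java.util.List;

/** Hypothetical helper, not part of PgJDBC. */
class RecoveryStart {
    /** Parses the textual LSN form "hi/lo" (hex) into a 64-bit value. */
    static long parseLsn(String text) {
        String[] parts = text.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("bad LSN: " + text);
        }
        long hi = Long.parseLong(parts[0], 16);
        long lo = Long.parseLong(parts[1], 16);
        return (hi << 32) | lo;
    }

    /** Formats a 64-bit LSN back into the "hi/lo" hex form. */
    static String formatLsn(long lsn) {
        return String.format("%X/%X", lsn >>> 32, lsn & 0xFFFFFFFFL);
    }

    /**
     * Returns the highest LSN found in downstream storage, or 0 if none.
     * LSNs are unsigned, so compare them as unsigned values.
     */
    static long startLsn(List<String> storedLsns) {
        long best = 0L;
        for (String s : storedLsns) {
            long v = parseLsn(s);
            if (Long.compareUnsigned(v, best) > 0) {
                best = v;
            }
        }
        return best;
    }

    /** Duplicate check: true if this txn is already covered by storage. */
    static boolean alreadyStored(long commitLsn, long lastStoredLsn) {
        return Long.compareUnsigned(commitLsn, lastStoredLsn) <= 0;
    }

    public static void main(String[] args) {
        // Made-up LSNs standing in for values read back from storage.
        List<String> stored = List.of("0/16B3748", "0/16B3900", "0/16B37A0");
        System.out.println(formatLsn(startLsn(stored))); // prints 0/16B3900
    }
}
```

This only works if the LSN is written in the same atomic step as the data it describes. If the two can diverge, the highest stored LSN no longer tells you what is actually safe.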

The client-supplied start LSN ensures that if you commit something to local storage but something crashes before the next feedback message reaches the server, the server won't send you that txn again next time.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
