Re: Logical Decoding Failover - Mailing list pgsql-general

From Colin Morelli
Subject Re: Logical Decoding Failover
Date
Msg-id CAPtU-UrJO6E+jHbTm3VTyONAgYZ0N4wujjJ9F9NqHoBAmhY=6A@mail.gmail.com
Whole thread Raw
In response to Re: Logical Decoding Failover  (Venkata Balaji N <nag1010@gmail.com>)
Responses Re: Logical Decoding Failover  (Venkata Balaji N <nag1010@gmail.com>)
List pgsql-general
Venkata,

No problem - thanks for replying. I'll try to clarify.

Say I have databases A and B, where A is primary and B is a replica. I have an application that executes 3 transactions against A, resulting in WAL log sequence numbers 1, 2, and 3. The WAL is successfully replicated to B.

I have a logical deciding application which is pulling the logical stream, and reads LSN 1 and 2 before the primary (A) fails. I decide to fail over to B, which has all data on it and things work great.

Now the logical decoding client connects to B (the new primary). The replication slot doesn't exist. So, it creates it and starts streaming. This is where the problem lies - as it would begin streaming from LSN 4 (anything after what has already been committed), because I have no way (that I can find) of restoring my "progress" through the WAL on a the replicas.

As a result, my application never sees the event at LSN 3. In fact, I'm not even sure how I could manually do this.

Hopefully that makes more sense.

Best,
Colin
On Mon, Aug 8, 2016 at 10:47 PM Venkata Balaji N <nag1010@gmail.com> wrote:
On Sun, Aug 7, 2016 at 9:29 PM, Colin Morelli <colin.morelli@gmail.com> wrote:
Venkata,

Thanks for the reply. Unfortunately something like PgPool still won't create the replication slots on all hosts, and record the LSN in a way that is reusable on the secondary.

Yes, thats correct, pgPool does not have anything to do with replication slots. That is something which you need to manually configure.
 
This sort of puts logical decoding at odds with HA, correct? In the case of master failover, there's no way to: a) know where in the stream you read to, or b) convert that to something that can be understood by a replica that's about to be promoted.

Can you please clarify, what did you exactly mean here ? are you referring to cascading standbys or something ? Failover is something which application has to do, to reconnect to the promoted standby. As far as logical decoding is concerned, you need to have a replication slot configured for the new master to ensure all the WALs which are not replicated to standby are retained. After promotion, the standby database becomes a standalone instance and the replication must be re-enabled. You can automate the standby promotion using tools like pgpool-II or pacemaker.

I am trying to differentiate standby-promotion and failover mechanism here.

Are there any scheduled improvements here either for 9.6 or 9.7?

I am not 100% sure, if there is anything developed from an failover mechanism perspective.

Regards,
Venkata B N

Fujitsu Australia

pgsql-general by date:

Previous
From: Venkata Balaji N
Date:
Subject: Re: Logical Decoding Failover
Next
From: "hari.prasath"
Date:
Subject: Jsonb extraction very slow