Re: Timeline following for logical slots - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Timeline following for logical slots
Date
Msg-id 20160331080907.GI13305@awork2.anarazel.de
In response to Re: Timeline following for logical slots  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Timeline following for logical slots  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2016-03-31 08:52:34 +0800, Craig Ringer wrote:
> On 31 March 2016 at 07:15, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> 
> 
> > > Available attached or at
> > >
> > https://github.com/2ndQuadrant/postgres/tree/dev/logical-decoding-timeline-following
> >
> > And pushed this too.
> >
> 
> Much appreciated. Marked as committed at
> https://commitfest.postgresql.org/9/568/ .
> 
> This gives us an option for failover of logical replication in 9.6, even if
> it's a bit cumbersome and complex for the client, in case failover slots
> don't make the cut. And, of course, it's a pre-req for failover slots,
> which I'll rebase on top of it shortly.

FWIW, I think it's dangerous to use this that way. If people manipulate
slots that way we'll have hellishly hard-to-debug issues. The test code
needs a big disclaimer to never ever be used in production, and we should
"disclaim any warranty" if somebody does that. To the point of not
fixing issues around it in back branches.

> Andres, I tried to address your comments as best I could. The main one that
> I think stayed open was about the loop that finds the last timeline on a
> segment. If you think that's better done by directly scanning the List* of
> timeline history entries I'm happy to prep a follow-up.

Have to look again.

+        * We start reading xlog from the restart lsn, even though in
+        * CreateDecodingContext we set the snapshot builder up using the
+        * slot's confirmed_flush. This means we might read xlog we don't
+        * actually decode rows from, but the snapshot builder might need it
+        * to get to a consistent point. The point we start returning data to
+        * *users* at is the confirmed_flush lsn set up in the decoding
+        * context.
+        */
still seems pretty misleading - and pretty much unrelated to the
callsite.
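
For readers following along, the distinction the quoted comment is trying
to draw can be sketched with a toy model. This is only an illustration
under loud assumptions: ToyLsn, ToyRecord and toy_decode are hypothetical
names, not PostgreSQL code, and the real decoding path is considerably
more involved. It shows reading starting at one position (restart_lsn)
while output to the user starts at a later one (confirmed_flush).

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; NOT PostgreSQL's XLogRecPtr definition. */
typedef uint64_t ToyLsn;

/* Hypothetical record: only its position in the toy WAL stream. */
typedef struct ToyRecord
{
    ToyLsn      lsn;
} ToyRecord;

/*
 * Walk records that are sorted by lsn.
 *
 * Every record at or after restart_lsn is counted in *nread, standing in
 * for WAL that is read so snapshot state could be built from it.  Only
 * records at or after confirmed_flush are counted as returned to the
 * user.  Returns the number of records "returned".
 */
static size_t
toy_decode(const ToyRecord *recs, size_t nrecs,
           ToyLsn restart_lsn, ToyLsn confirmed_flush,
           size_t *nread)
{
    size_t      emitted = 0;

    *nread = 0;
    for (size_t i = 0; i < nrecs; i++)
    {
        if (recs[i].lsn < restart_lsn)
            continue;           /* before the point where reading starts */

        (*nread)++;             /* read, possibly only for snapshot state */

        if (recs[i].lsn >= confirmed_flush)
            emitted++;          /* old enough to read, new enough to return */
    }
    return emitted;
}
```

With toy records at positions 10, 20, 30, 40 and 50, a restart_lsn of 20
and a confirmed_flush of 40, four records are read but only two are
returned. The gap between the two counts is the "xlog we don't actually
decode rows from" that the comment mentions.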


Greetings,

Andres Freund


