Re: pg_replication_slot_advance to return NULL instead of 0/0 if slotnot advanced - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: pg_replication_slot_advance to return NULL instead of 0/0 if slotnot advanced
Date
Msg-id e02bd75a-025f-2484-b461-3889ae069704@2ndquadrant.com
Whole thread Raw
In response to Re: pg_replication_slot_advance to return NULL instead of 0/0 ifslot not advanced  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_replication_slot_advance to return NULL instead of 0/0 ifslot not advanced
List pgsql-hackers
On 05/06/18 06:28, Michael Paquier wrote:
> On Mon, Jun 04, 2018 at 11:51:35AM +0200, Petr Jelinek wrote:
>> On 01/06/18 21:13, Michael Paquier wrote:
>>> -    startlsn =3D MyReplicationSlot->data.confirmed_flush;
>>> +    if (OidIsValid(MyReplicationSlot->data.database))
>>> +            startlsn =3D MyReplicationSlot->data.confirmed_flush;
>>> +    else
>>> +            startlsn =3D MyReplicationSlot->data.restart_lsn;
>>> +
>>>      if (moveto < startlsn)
>>>      {
>>>              ReplicationSlotRelease();
>>
>> This part looks correct for the checking that we are not moving
>> backwards. However, there is another existing issue with this code
>> which
>> is that we are later using the confirmed_flush (via startlsn) as start
>> point of logical decoding (XLogReadRecord parameter in
>> pg_logical_replication_slot_advance) which is not correct. The
>> restart_lsn should be used for that. I think it would make sense to
>> fix
>> that as part of this patch as well.
> 
> I am not sure I understand what you are coming at here.  Could you
> explain your point a bit more please as 9c7d06d is yours?

As I said, it's an existing issue. I just had chance to reread the code
when looking and your patch.

> When creating
> the decoding context, all other code paths use the confirmed LSN as a
> start point if not explicitely defined by the caller of
> CreateDecodingContext, as it points to the last LSN where a transaction
> has been committed and decoded.

I didn't say anything about CreateDecodingContext though. I am talking
about the fact that we then use the same variable as input to
XLogReadRecord later in the logical slot code path. The whole point of
having restart_lsn for logical slot is to have correct start point for
XLogReadRecord so using the confirmed_flush there is wrong (and has been
wrong since the original commit).


-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: why partition pruning doesn't work?
Next
From: Amit Khandekar
Date:
Subject: AtEOXact_ApplyLauncher() and subtransactions