Re: pg_replication_slot_advance to return NULL instead of 0/0 if slotnot advanced - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: pg_replication_slot_advance to return NULL instead of 0/0 if slotnot advanced
Date
Msg-id 4ff6b139-1647-cad9-1477-3b6aa44c666d@2ndquadrant.com
Whole thread Raw
In response to Re: pg_replication_slot_advance to return NULL instead of 0/0 ifslot not advanced  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_replication_slot_advance to return NULL instead of 0/0 ifslot not advanced
List pgsql-hackers
On 06/06/18 04:04, Michael Paquier wrote:
> On Tue, Jun 05, 2018 at 01:00:30PM +0200, Petr Jelinek wrote:
>> I didn't say anything about CreateDecodingContext though. I am talking
>> about the fact that we then use the same variable as input to
>> XLogReadRecord later in the logical slot code path. The whole point of
>> having restart_lsn for logical slot is to have correct start point for
>> XLogReadRecord so using the confirmed_flush there is wrong (and has been
>> wrong since the original commit).
> 
> OK, I finally see the point you are coming at after a couple of hours
> brainstorming about it, and indeed using confirmed_lsn is logically
> incorrect so the current code is inconsistent with what
> DecodingContextFindStartpoint() does, so we rely on restart_lsn to scan
> for all the records and the decoder state is here to make sure that the
> slot's confirmed_lsn is updated to a consistent position.  I have added
> your feedback in the attached (indented and such), which results in some
> simplifications with a couple of routines.
> 
> I am attaching as well the patch I sent yesterday.  0001 is candidate
> for a back-patch, 0002 is for HEAD to fix the slot advance stuff.
> 
> There is another open item related to slot advancing here:
> https://www.postgresql.org/message-id/2840048a-1184-417a-9da8-3299d207a1d7%40postgrespro.ru
> And per my tests the patch is solving this item as well.  I have just
> used the test mentioned in the thread which:
> 1) creates a slot
> 2) does some activity to generate a couple of WAL pages
> 3) advances the slot at page boundary
> 4) Moves again the slot.
> This test crashes on HEAD at step 4, and not with the attached.
> 
> What do you think?

Seems reasonable to me.

I think the only thing to note about the patches from my side is that we
probably don't want to default to restart_lsn for the
pg_logical_replication_slot_advance() return value (when nothing was
done) but rather the confirmed_lsn. As it is in current patch if we call
the function repeatedly and one call moved slot forward but the next one
didn't the return value will go backwards as restart_lsn tends to be
behind the confirmed one.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?
Next
From: Peter Eisentraut
Date:
Subject: Re: PATCH pass PGOPTIONS to pg_regress