Re: Physical replication slot advance is not persistent - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Physical replication slot advance is not persistent
Date
Msg-id 20200121.093940.420134805758355363.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Physical replication slot advance is not persistent  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Thanks for looking this.

At Mon, 20 Jan 2020 11:00:14 -0800, Andres Freund <andres@anarazel.de> wrote in 
> > Here is a summary of the points raised (please correct me if that
> > does not sound right to you, Andres):
> 
> > 1) The slot advancing has to mark the slot as dirty, but should we
> > make the change persistent at the end of the function or should we
> > wait for a checkpoint to do the work, meaning that any update done to
> > the slot would be lost if a crash occurs in-between?  Note that we
> > have this commit in slotfuncs.c for
> > pg_logical_replication_slot_advance():
> >  * Dirty the slot so it's written out at the next checkpoint.
> >  * We'll still lose its position on crash, as documented, but it's
> >  * better than always losing the position even on clean restart.
> > 
> > This comment refers to the documentation for the logical decoding
> > section (see logicaldecoding-replication-slots in
> > logicaldecoding.sgml), and even if nothing can be done until the slot
> > advance function reaches its hand, we ought to make the data
> > persistent if we can.
> 
> That doesn't really seem like a meaningful reference, because the
> concerns between constantly streaming out changes (where we don't want
> to fsync every single transaction), and doing so in a manual advance
> through an sql function, seem different.

Yes, that is the reason I didn't suggest not to save the file there.
I don't have a clear opinion on it but I agree that users expect that
any changes they made from SQL interface should survive a
crash-recovery.

> > 3) The amount of testing related to slot advancing could be better
> > with cluster-wide operations.
> > 
> > @@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr
> > moveto)
> >     MyReplicationSlot->data.restart_lsn = moveto;
> > 
> >     SpinLockRelease(&MyReplicationSlot->mutex);
> >     retlsn = moveto;
> > +
> > +   ReplicationSlotMarkDirty();
> > +
> > +   /* We moved retart_lsn, update the global value. */
> > +   ReplicationSlotsComputeRequiredLSN();
> > I think that the proposed patch is missing a call to
> > ReplicationSlotsComputeRequiredXmin() here for physical slots.
> 
> Hm. It seems ok to include, but I don't think omitting it currently has
> negative effects?

I think no. It is updated sooner or later when replication proceeds
and received a reply message.

> > So, I have been looking at this patch by myself, and updated it so as
> > the extra slot save is done only if any advancing has been done, on
> > top of the other computations that had better be around for
> > consistency.
> 
> Hm, I don't necessarily what that's necessary.

On the other hand, no negitve effect by the extra saving of the file
as far as the SQL function itself is not called extremely
frequently. If I read Andres's comment above correctly, I agree not to
add complexity to supress the "needless" saving of the file.

> > The patch includes TAP tests for physical and logical slots'
> > durability across restarts.
> 
> Cool!
> 
> 
> > diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
> > index bb69683e2a..af3e114fc9 100644
> > --- a/src/backend/replication/slotfuncs.c
> > +++ b/src/backend/replication/slotfuncs.c
> > @@ -359,17 +359,20 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> >   * checkpoints.
> >   */
> >  static XLogRecPtr
> > -pg_physical_replication_slot_advance(XLogRecPtr moveto)
> > +pg_physical_replication_slot_advance(XLogRecPtr moveto, bool *advance_done)
> >  {
> >      XLogRecPtr    startlsn = MyReplicationSlot->data.restart_lsn;
> >      XLogRecPtr    retlsn = startlsn;
> >  
> > +    *advance_done = false;
> > +
> >      if (startlsn < moveto)
> >      {
> >          SpinLockAcquire(&MyReplicationSlot->mutex);
> >          MyReplicationSlot->data.restart_lsn = moveto;
> >          SpinLockRelease(&MyReplicationSlot->mutex);
> >          retlsn = moveto;
> > +        *advance_done = true;
> >      }
> >  
> >      return retlsn;
> 
> Hm. Why did you choose not to use endlsn as before (except being
> broken), or something? It seems quite conceivable somebody is using
> these functions in an extension.
> 
> 
> 
> 
> > +# Test physical slot advancing and its durability.  Create a new slot on
> > +# the primary, not used by any of the standbys. This reserves WAL at creation.
> > +my $phys_slot = 'phys_slot';
> > +$node_master->safe_psql('postgres',
> > +    "SELECT pg_create_physical_replication_slot('$phys_slot', true);");
> > +$node_master->psql('postgres', "
> > +    CREATE TABLE tab_phys_slot (a int);
> > +    INSERT INTO tab_phys_slot VALUES (generate_series(1,10));");
> > +my $psql_rc = $node_master->psql('postgres',
> > +    "SELECT pg_replication_slot_advance('$phys_slot', 'FF/FFFFFFFF');");
> > +is($psql_rc, '0', 'slot advancing works with physical slot');
> 
> Hm. I'm think testing this with real LSNs is a better idea. What if the
> node actually already is past FF/FFFFFFFF at this point? Quite unlikely,
> I know, but still. I.e. why not get the current LSN after the INSERT,
> and assert that the slot, after the restart, is that?

+1.


(continuation of (3))
At Mon, 20 Jan 2020 15:45:40 +0900, Michael Paquier <michael@paquier.xyz> wrote in
(@@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr moveto))
> +   /* We moved retart_lsn, update the global value. */
> +   ReplicationSlotsComputeRequiredLSN();
> I think that the proposed patch is missing a call to
> ReplicationSlotsComputeRequiredXmin() here for physical slots.

No. pg_physical_replication_slot_advance doesn't make an advance of
effective_(catalog)_xmin so it is just useless. It would be necessary
if it were in pg_replication_slot_advance, its caller.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: PATCH: standby crashed when replay block which truncated instandby but failed to truncate in master node
Next
From: Thomas Munro
Date:
Subject: Reducing WaitEventSet syscall churn