Re: Physical replication slot advance is not persistent - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Physical replication slot advance is not persistent
Date
Msg-id 20200120190014.h4pvoep2t44foy4y@alap3.anarazel.de
Whole thread Raw
In response to Re: Physical replication slot advance is not persistent  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Physical replication slot advance is not persistent  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: Physical replication slot advance is not persistent  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

On 2020-01-20 15:45:40 +0900, Michael Paquier wrote:
> On Thu, Jan 16, 2020 at 08:09:09PM +0300, Alexey Kondratov wrote:
> > OK, I have definitely overthought that, thanks. This looks like a minimal
> > subset of changes that actually solves the bug. I would only prefer to keep
> > some additional comments (something like the attached), otherwise after half
> > a year it will be unclear again, why we save slot unconditionally here.
> 
> Since this email, Andres has sent an email that did not reach the
> community lists, but where all the participants of this thread were in
> CC.

Ugh, that was an accident.


> Here is a summary of the points raised (please correct me if that
> does not sound right to you, Andres):

> 1) The slot advancing has to mark the slot as dirty, but should we
> make the change persistent at the end of the function or should we
> wait for a checkpoint to do the work, meaning that any update done to
> the slot would be lost if a crash occurs in-between?  Note that we
> have this commit in slotfuncs.c for
> pg_logical_replication_slot_advance():
>  * Dirty the slot so it's written out at the next checkpoint.
>  * We'll still lose its position on crash, as documented, but it's
>  * better than always losing the position even on clean restart.
> 
> This comment refers to the documentation for the logical decoding
> section (see logicaldecoding-replication-slots in
> logicaldecoding.sgml), and even if nothing can be done until the slot
> advance function reaches its hand, we ought to make the data
> persistent if we can.

That doesn't really seem like a meaningful reference, because the
concerns between constantly streaming out changes (where we don't want
to fsync every single transaction), and doing so in a manual advance
through an sql function, seem different.


> 3) The amount of testing related to slot advancing could be better
> with cluster-wide operations.
> 
> @@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr
> moveto)
>     MyReplicationSlot->data.restart_lsn = moveto;
> 
>     SpinLockRelease(&MyReplicationSlot->mutex);
>     retlsn = moveto;
> +
> +   ReplicationSlotMarkDirty();
> +
> +   /* We moved retart_lsn, update the global value. */
> +   ReplicationSlotsComputeRequiredLSN();
> I think that the proposed patch is missing a call to
> ReplicationSlotsComputeRequiredXmin() here for physical slots.

Hm. It seems ok to include, but I don't think omitting it currently has
negative effects?


> So, I have been looking at this patch by myself, and updated it so as
> the extra slot save is done only if any advancing has been done, on
> top of the other computations that had better be around for
> consistency.

Hm, I don't necessarily what that's necessary.


> The patch includes TAP tests for physical and logical slots'
> durability across restarts.

Cool!


> diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
> index bb69683e2a..af3e114fc9 100644
> --- a/src/backend/replication/slotfuncs.c
> +++ b/src/backend/replication/slotfuncs.c
> @@ -359,17 +359,20 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
>   * checkpoints.
>   */
>  static XLogRecPtr
> -pg_physical_replication_slot_advance(XLogRecPtr moveto)
> +pg_physical_replication_slot_advance(XLogRecPtr moveto, bool *advance_done)
>  {
>      XLogRecPtr    startlsn = MyReplicationSlot->data.restart_lsn;
>      XLogRecPtr    retlsn = startlsn;
>  
> +    *advance_done = false;
> +
>      if (startlsn < moveto)
>      {
>          SpinLockAcquire(&MyReplicationSlot->mutex);
>          MyReplicationSlot->data.restart_lsn = moveto;
>          SpinLockRelease(&MyReplicationSlot->mutex);
>          retlsn = moveto;
> +        *advance_done = true;
>      }
>  
>      return retlsn;

Hm. Why did you choose not to use endlsn as before (except being
broken), or something? It seems quite conceivable somebody is using
these functions in an extension.




> +# Test physical slot advancing and its durability.  Create a new slot on
> +# the primary, not used by any of the standbys. This reserves WAL at creation.
> +my $phys_slot = 'phys_slot';
> +$node_master->safe_psql('postgres',
> +    "SELECT pg_create_physical_replication_slot('$phys_slot', true);");
> +$node_master->psql('postgres', "
> +    CREATE TABLE tab_phys_slot (a int);
> +    INSERT INTO tab_phys_slot VALUES (generate_series(1,10));");
> +my $psql_rc = $node_master->psql('postgres',
> +    "SELECT pg_replication_slot_advance('$phys_slot', 'FF/FFFFFFFF');");
> +is($psql_rc, '0', 'slot advancing works with physical slot');

Hm. I'm think testing this with real LSNs is a better idea. What if the
node actually already is past FF/FFFFFFFF at this point? Quite unlikely,
I know, but still. I.e. why not get the current LSN after the INSERT,
and assert that the slot, after the restart, is that?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Vik Fearing
Date:
Subject: Re: Greatest Common Divisor
Next
From: Jesper Pedersen
Date:
Subject: Re: Index Skip Scan