Re: replication slot restart_lsn initialization - Mailing list pgsql-hackers

From Andres Freund
Subject Re: replication slot restart_lsn initialization
Date
Msg-id 20150810141254.GF16192@awork2.anarazel.de
In response to Re: replication slot restart_lsn initialization  (Gurjeet Singh <gurjeet@singh.im>)
Responses Re: replication slot restart_lsn initialization  (Gurjeet Singh <gurjeet@singh.im>)
List pgsql-hackers
On 2015-07-07 09:42:54 -0700, Gurjeet Singh wrote:
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -40,10 +40,10 @@
>  #include <sys/stat.h>
>
>  #include "access/transam.h"
> +#include "access/xlog_internal.h"
>  #include "common/string.h"
>  #include "miscadmin.h"
>  #include "replication/slot.h"
> -#include "storage/fd.h"
>  #include "storage/proc.h"
>  #include "storage/procarray.h"

Why did you remove fd.h? The file definitely uses infrastructure from
there. We're not terribly consistent about that but I'd rather not rely
on it being included via xlog_internal.h -> xlog.h.

>  /*
> + * Grab and save an LSN value to prevent WAL recycling past that point.
> + */
> +void
> +ReplicationSlotRegisterRestartLSN()
> +{

Didn't like that description and function name very much. What does
'grabbing' mean here? Should probably mention that it works on the
currently active slot and modifies it.

It's now ReplicationSlotReserveWal()

> +    ReplicationSlot *slot = MyReplicationSlot;
> +
> +    Assert(slot != NULL);
> +    Assert(slot->data.restart_lsn == InvalidXLogRecPtr);
> +
> +    /*
> +     * The replication slot mechanism is used to prevent removal of required
> +     * WAL. As there is no interlock between this and checkpoints required, WAL
> +     * segment could be removed before ReplicationSlotsComputeRequiredLSN() has
> +     * been called to prevent that. In the very unlikely case that this happens
> +     * we'll just retry.
> +     */

You removed some punctuation in that sentence, converting a sentence in
bad English into one without the original meaning ;). See the attached
for a new version.

> +    while (true)
> +    {
> +        XLogSegNo    segno;
> +
> +        /*
> +         * Let's start with enough information if we can, so log a standby
> +         * snapshot and start logical decoding at exactly that position.
> +         */
> +        if (!RecoveryInProgress())
> +        {
> +            XLogRecPtr    flushptr;
> +
> +            /* start at current insert position */
> +            slot->data.restart_lsn = GetXLogInsertRecPtr();
> +
> +            /*
> +             * Log an xid snapshot for logical replication. This snapshot is not
> +             * needed for physical replication, as it relies on the snapshot
> +             * created by checkpoint when the base backup starts.
> +             */
> +            if (slot->data.database != InvalidOid)
> +            {
> +                /* make sure we have enough information to start */
> +                flushptr = LogStandbySnapshot();
> +
> +                /* and make sure it's fsynced to disk */
> +                XLogFlush(flushptr);
> +            }
> +        }
> +        else
> +            slot->data.restart_lsn = GetRedoRecPtr();
> +
> +        /* prevent WAL removal as fast as possible */
> +        ReplicationSlotsComputeRequiredLSN();
> +
> +        /*
> +         * If all required WAL is still there, great, otherwise retry. The
> +         * slot should prevent further removal of WAL, unless there's a
> +         * concurrent ReplicationSlotsComputeRequiredLSN() after we've written
> +         * the new restart_lsn above, so normally we should never need to loop
> +         * more than twice.
> +         */
> +        XLByteToSeg(slot->data.restart_lsn, segno);
> +        if (XLogGetLastRemovedSegno() < segno)
> +            break;
> +    }
> +}

The way you added the check for logical vs. physical slots in there
looks wrong to me. For a physical slot created while !InRecovery we'll
possibly return an xlog position from the future (it's the insert
position *and* not flushed to disk), which then cannot be received.

> +/*
>   * Flush all replication slots to disk.
>   *
>   * This needn't actually be part of a checkpoint, but it's a convenient
> @@ -876,7 +942,7 @@ StartupReplicationSlots(void)
>  }
>
>  /* ----
> - * Manipulation of ondisk state of replication slots
> + * Manipulation of on-disk state of replication slots
>   *
>   * NB: none of the routines below should take any notice whether a slot is the
>   * current one or not, that's all handled a layer above.
> diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
> index 9a2793f..01b376a 100644
> --- a/src/backend/replication/slotfuncs.c
> +++ b/src/backend/replication/slotfuncs.c
> @@ -40,6 +40,7 @@ Datum
>  pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
>  {
>      Name        name = PG_GETARG_NAME(0);
> +    bool         immediately_reserve = PG_GETARG_BOOL(1);
>      Datum        values[2];
>      bool        nulls[2];
>      TupleDesc    tupdesc;
> @@ -58,10 +59,28 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
>      /* acquire replication slot, this will check for conflicting names */
>      ReplicationSlotCreate(NameStr(*name), false, RS_PERSISTENT);
>
> -    values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> +    if (immediately_reserve)
> +    {
> +        /* Allocate restart-LSN, if the user asked for it */
> +        ReplicationSlotRegisterRestartLSN();
> +
> +        /* Write this slot to disk */
> +        ReplicationSlotMarkDirty();
> +        ReplicationSlotSave();
>
> -    nulls[0] = false;
> -    nulls[1] = true;
> +        values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> +        values[1] = LSNGetDatum(MyReplicationSlot->data.restart_lsn);
> +
> +        nulls[0] = false;
> +        nulls[1] = false;
> +    }
> +    else
> +    {
> +        values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> +
> +        nulls[0] = false;
> +        nulls[1] = true;
> +    }

I moved
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
nulls[0] = false;
outside the conditional block; there seems to be no reason to have it in
there?

I also removed a bunch of unrelated minor cleanups that I plan to commit
& backpatch separately.

What do you think?

Andres

Attachment
