Re: pg_receivewal starting position - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pg_receivewal starting position
Date
Msg-id YXSWdC0At7JBwoC+@paquier.xyz
Whole thread Raw
In response to Re: pg_receivewal starting position  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: pg_receivewal starting position  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sat, Oct 23, 2021 at 10:46:30PM +0530, Bharath Rupireddy wrote:
> 1) It's better to initialize nulls with false, we can avoid setting
> them to true. The instances where the columns are not nulls is going
> to be more than the columns with null values, so we could avoid some
> of nulls[i] = false; instructions.

I don't think that this is an improvement in this case as in the
default case we'd return a tuple full of NULL values if the slot does
not exist, so the existing code is simpler when we don't look at the
slot contents.

> 2) As I previously mentioned, we are not copying the slot contents
> while holding the spinlock, it's just we are taking the memory address
> and releasing the lock, so there is a chance that the memory we are
> looking at can become unavailable or stale while we access
> slot_contents. So, I suggest we do the memcpy of the *slot to
> slot_contents. I'm sure the memcpy ing the entire ReplicationSlot
> contents will be costlier, so let's just take the info that we need
> (data.database, data.restart_lsn) into local variables while we hold
> the spin lock

The style of the patch is consistent with what we do in other areas
(see pg_get_replication_slots as one example).

> + /* Copy slot contents while holding spinlock */
> + SpinLockAcquire(&slot->mutex);
> + slot_contents = *slot;

And what this does is to copy the contents of the slot into a local
area (note that we use a NameData pointing to an area with
NAMEDATALEN).  Aka if the contents of *slot are changed by whatever
reason (this cannot change as of the LWLock acquired), what we have
saved is unchanged as of this command's context.

> 3) The data that the new command returns to the client can actually
> become stale while it is captured and in transit to the client as we
> release the spinlock and other backends can drop or alter the info.
> So, it's better we talk about this in the documentation of the new
> command and also in the comments saying "clients will have to deal
> with it."

The same can be said with IDENTIFY_SYSTEM when the flushed location
becomes irrelevant.  I am not sure that this has any need to apply
here.  We could add that this is useful to get a streaming start
position though.

> 4) How about we be more descriptive about the error added? This will
> help identify for which replication slot the command has failed from
> tons of server logs which really help in debugging and analysis.
> I suggest we have this:
> errmsg("cannot use \"%s\" command with logical replication slot
> \"%s\"", "READ_REPLICATION_SLOT", cmd->slotname);
> instead of just a plain, non-informative, generic message:
> errmsg("cannot use \"%s\" with logical replication slots",

Yeah.  I don't mind adding the slot name in this string.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Predefined role pg_maintenance for VACUUM, ANALYZE, CHECKPOINT.
Next
From: Michael Paquier
Date:
Subject: Re: Refactoring: join MakeSingleTupleTableSlot() and MakeTupleTableSlot()