Re: Logical decoding on standby - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Logical decoding on standby
Date
Msg-id CAMsr+YHC541OTrZCvUZynSF9Mt3jbjr+yEwWwUZG5Rk4+S0vnQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Logical decoding on standby  (Andres Freund <andres@anarazel.de>)
Responses Re: Logical decoding on standby  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
On 20 March 2017 at 17:33, Andres Freund <andres@anarazel.de> wrote:

> Have you checked how high the overhead of XLogReadDetermineTimeline is?
> A non-local function call, especially into a different translation-unit
> (no partial inlining), for every single page might end up being
> noticeable.  That's fine in the cases it actually adds functionality,
> but for a master streaming out data, that's not actually adding
> anything.

I haven't been able to measure any difference. But, since we require
the caller to ensure a reasonably up to date ThisTimeLineID, maybe
it's worth adding an inlineable function for the fast-path that tests
the "page cached" and "timeline is current and unchanged" conditions?


//xlogutils.h:
static inline void XLogReadDetermineTimeline(...)
{  ... first test for page already read-in and valid ...  ... second test for ThisTimeLineId ...
XLogReadCheckTimeLineChange(...)
}

XLogReadCheckTimeLineChange(...)
{  ... rest of function
}

(Yes, I know "inline" means little, but it's a hint for readers)

I'd rather avoid using a macro since it'd be pretty ugly, but it's
also an option if an inline func is undesirable.

#define XLOG_READ_DETERMINE_TIMELINE \ do { \   ... same as above ... } while (0);


Can be done after CF if needed anyway, it's just fiddling some code
around. Figured I'd mention though.

>> +             /*
>> +              * To avoid largely duplicating ReplicationSlotDropAcquired() or
>> +              * complicating it with already_locked flags for ProcArrayLock,
>> +              * ReplicationSlotControlLock and ReplicationSlotAllocationLock, we
>> +              * just release our ReplicationSlotControlLock to drop the slot.
>> +              *
>> +              * There's no race here: we acquired this slot, and no slot "behind"
>> +              * our scan can be created or become active with our target dboid due
>> +              * to our exclusive lock on the DB.
>> +              */
>> +             LWLockRelease(ReplicationSlotControlLock);
>> +             ReplicationSlotDropAcquired();
>> +             LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
>
> I don't see much problem with this, but I'd change the code so you
> simply do a goto restart; if you released the slot.  Then there's a lot
> less chance / complications around temporarily releasing
> ReplicationSlotControlLock

I don't quite get this. I suspect I'm just not seeing the implications
as clearly as you do.

Do you mean we should restart the whole scan of the slot array if we
drop any slot? That'll be O(n log m) but since we don't expect to be
working on a big array or a lot of slots it's unlikely to matter.

The patch coming soon will assume we'll restart the whole scan, anyway.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: WIP: [[Parallel] Shared] Hash
Next
From: Robert Haas
Date:
Subject: Re: comments in hash_alloc_buckets