Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Minimal logical decoding on standbys
Date
Msg-id CA+TgmobgOLH-JpBoBSdu4i+sjRdgwmDEZGECkmowXqQgQL6WhQ@mail.gmail.com
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Responses Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Oct 27, 2021 at 2:56 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> So you have in mind to check for XLogLogicalInfoActive() first, and if true, then open the relation and call
> RelationIsAccessibleInLogicalDecoding()?

I think 0001 is utterly unacceptable. We cannot add calls to
table_open() in low-level functions like this. Suppose for example
that _bt_getbuf() calls _bt_log_reuse_page() which with 0001 applied
would call get_rel_logical_catalog(). _bt_getbuf() will have acquired
a buffer lock on the page. The idea that it's safe to call
table_open() while holding a buffer lock cannot be taken seriously.
That could do arbitrary amounts of work taking any number of other
buffer locks, which could easily deadlock (and the deadlock detector
wouldn't help, since these are lwlocks). Even if that were no issue,
we really, really do not want to write code that could result in large
numbers of additional calls to table_open() -- and _bt_getbuf() is
certainly a frequently-used function. I think that, in order to have
any chance of being acceptable, this would need to be restructured so
that it pulls data from an existing relcache entry that is known to be
valid, without attempting to create a new one. That is,
get_rel_logical_decoding() would need to take a Relation argument, not
an OID.

I also think it's super-weird that the value being logged is computed
using RelationIsAccessibleInLogicalDecoding(). That means that if
wal_level < logical, we'll set onCatalogTable = false in the xlog
record, regardless of whether that's true or not. Now I suppose it
won't matter, because presumably this field is only going to be
consulted for whatever purpose when logical replication is active, but
I object on principle to the idea of a field whose name suggests that
it means one thing and whose value is inconsistent with that
interpretation.

Regarding 0003, I notice that GetXLogReplayRecPtr() gets an extra
argument that is set to false everywhere except one place that is
inside the new code. That suggests to me that putting logic that the
other 15 callers don't need is not the right approach here. It also
looks like, in the one place where that argument does get passed as
true, LogStandbySnapshot() moves outside the retry loop. I think
that's unlikely to be correct.

I also notice that 0003 deletes a comment that says "We need to force
hot_standby_feedback to be enabled at all times so the primary cannot
remove rows we need," but also that this is the only mention of
hot_standby_feedback in the entire patch set. If the existing comment
that we need to do something about that is incorrect, we should update
it independently of this patch set to be correct. But if the existing
comment is correct then there ought to be something in the patch that
deals with it.

Another part of that same deleted comment says "We need to be able to
correctly and quickly identify the timeline LSN belongs to," but I
don't see what the patch does about that, either. I'm actually not
sure exactly what that's talking about, but today for unrelated
reasons I happened to be looking at logical_read_xlog_page(), which is
actually what caused me to look at this thread. In that function we
have, as the first two lines of executable code:

     XLogReadDetermineTimeline(state, targetPagePtr, reqLen);
     sendTimeLineIsHistoric = (state->currTLI != ThisTimeLineID);

The second line of code depends on the value of ThisTimeLineID. The
first line of code does too, because XLogReadDetermineTimeline() uses
that variable internally. If logical decoding is only allowed on a
primary, then there can't really be an issue here, because we will
have checked RecoveryInProgress() in
CheckLogicalDecodingRequirements() and ThisTimeLineID will have its
final value. But on a standby, I'm not sure that ThisTimeLineID even
has to be initialized here, and I really can't see any reason at all
why the value it contains is necessarily still current. This
function's sister, read_local_xlog_page(), contains a bunch of logic
that tries to make sure that we're always reading every record from
the right timeline, but there's nothing similar here. I think that
would likely have to be fixed in order for decoding to work on
standbys, but maybe I'm missing something.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: "Bossart, Nathan"
Date:
Subject: Re: Correct error message for end-of-recovery record TLI
Next
From: Andres Freund
Date:
Subject: Re: [PATCH v2] src/port/snprintf.c: Optimize the common base=10 case in fmtint