Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Minimal logical decoding on standbys
Date
Msg-id 20190521161937.tpepqkiatw6kv35v@alap3.anarazel.de
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Minimal logical decoding on standbys
Re: Minimal logical decoding on standbys
List pgsql-hackers
Hi,

Sorry for the late response.

On 2019-04-16 12:27:46 +0530, Amit Khandekar wrote:
> On Sat, 13 Apr 2019 at 00:57, Andres Freund <andres@anarazel.de> wrote:
> > > Not sure why this is happening. On slave, wal_level is logical, so
> > > logical records should have tuple data. Not sure what does that have
> > > to do with wal_level of master. Everything should be there on slave
> > > after it replays the inserts; and also slave wal_level is logical.
> >
> > The standby doesn't write its own WAL, only primaries do. I thought we
> > forbade running with wal_level=logical on a standby, when the primary is
> > only set to replica.  But that's not what we do, see
> > CheckRequiredParameterValues().
> >
> > I've not yet thought this through, but I think we'll have to somehow
> > error out in this case.  I guess we could just check at the start of
> > decoding what ControlFile->wal_level is set to,
> 
> By "start of decoding", I didn't get where exactly. Do you mean
> CheckLogicalDecodingRequirements() ?

Right.


> > and then raise an error
> > in decode.c when we pass an XLOG_PARAMETER_CHANGE record that sets
> > wal_level to something lower?
> 
> Didn't get where exactly we should error out. We don't do
> XLOG_PARAMETER_CHANGE handling in decode.c , so obviously you meant
> something else, which I didn't understand.

I was indeed thinking of checking XLOG_PARAMETER_CHANGE in
decode.c. Adding handling for that, and just checking wal_level, ought
to be fairly doable? But, see below:


> What I am thinking is :
> In CheckLogicalDecodingRequirements(), besides checking wal_level,
> also check ControlFile->wal_level when InHotStandby. I mean, when we
> are InHotStandby, both wal_level and ControlFile->wal_level should be
> >= WAL_LEVEL_LOGICAL. This will allow us to error out when using logical
> slot when master has incompatible wal_level.

That still allows the primary to change wal_level after logical decoding
has started, so we need the additional checks.

I'm not yet sure how to best deal with the fact that wal_level might be
changed by the primary at basically all times. We would eventually get
an error when logical decoding reaches the XLOG_PARAMETER_CHANGE. But
that's not necessarily sufficient - if a primary changes its wal_level
to lower, it could remove information logical decoding needs *before*
logical decoding reaches the XLOG_PARAMETER_CHANGE record.

So I suspect we need conflict handling in xlog_redo's
XLOG_PARAMETER_CHANGE case. If we there check against existing logical
slots, we ought to be safe.

Therefore I think the check in CheckLogicalDecodingRequirements() needs
to be something like:

if (RecoveryInProgress())
{
    if (!InHotStandby)
        ereport(ERROR, "logical decoding on a standby required hot_standby to be enabled");
    /*
     * This check is racy, but whenever XLOG_PARAMETER_CHANGE indicates that
     * wal_level has changed, we verify that there are no existin glogical
     * replication slots. And to avoid races around creating a new slot,
     * CheckLogicalDecodingRequirements() is called once before creating the slot,
     * andd once when logical decoding is initially starting up.
     */
    if (ControlFile->wal_level != LOGICAL)
        ereport(ERROR, "...");
}

And then add a second CheckLogicalDecodingRequirements() call into
CreateInitDecodingContext().

What do you think?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: A few more opportunities to use TupleTableSlotOps fields
Next
From: Robert Haas
Date:
Subject: Re: New EXPLAIN option: ALL