Thread: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
Hi All,

Today, while exploring logical replication in PostgreSQL, I noticed
that logical replication from PG version 13 and below to PG v14
(development version) is not working. It has stopped working from the
following git commit onwards:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date:   Thu Sep 3 07:54:07 2020 +0530

    Add support for streaming to built-in logical replication.

...
...

Here is the experiment that I performed to verify this:

Publisher (PG-v12/13):
==================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;

SELECT * FROM pg2pg;

CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;

Subscriber (PG-v14 HEAD or commit 46482432):
=====================================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
dbname=postgres user=ashu' PUBLICATION pg2pg_pub;

SELECT * FROM pg2pg;

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Bharath Rupireddy
Date:
On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> commit 464824323e57dc4b397e8b05854d779908b55304
> Author: Amit Kapila <akapila@postgresql.org>
> Date:   Thu Sep 3 07:54:07 2020 +0530
>
> Above select query produces no result. When this experiment is
> performed below the mentioned git commit, it works fine.
>

I encountered the same issue. We get to see below error during table
sync phase in the subscriber server logs.

2020-09-21 16:01:45.794 IST [782428] LOG:  logical replication apply
worker for subscription "pg2pg_sub" has started
2020-09-21 16:01:45.799 IST [782428] ERROR:  could not start WAL
streaming: ERROR:  client sent proto_version=2 but we only support
protocol 1 or lower
    CONTEXT:  slot "pg2pg_sub", output plugin "pgoutput", in the
startup callback
2020-09-21 16:01:45.800 IST [781607] LOG:  background worker "logical
replication worker" (PID 782428) exited with exit code 1

>
> After spending some time looking into this issue, I observed that
> above git commit has bumped the logical replication protocol version.
> Due to this, the logical replication apply worker process is unable to
> do WAL streaming which causes it to terminate. Therefore, the data
> inserted in the publication table is not applied on the subscription
> table (i.e. no data replication happens)
>

That's because the above commit changed LOGICALREP_PROTO_VERSION_NUM
to 2 from 1.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi All,
>
> Today, while exploring logical replication in PostgreSQL, I noticed
> that logical replication from PG version 13 and below to PG v14
> (development version) is not working. It has stopped working from the
> following git commit onwards:
>
> commit 464824323e57dc4b397e8b05854d779908b55304
> Author: Amit Kapila <akapila@postgresql.org>
> Date:   Thu Sep 3 07:54:07 2020 +0530
>
>     Add support for streaming to built-in logical replication.
>
> ...
> ...
>
> Here is the experiment that I performed to verify this:
>
> Publisher (PG-v12/13):
> ==================
> CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
>
> INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;
>
> SELECT * FROM pg2pg;
>
> CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;
>
> Subscriber (PG-v14 HEAD or commit 46482432):
> =====================================
> CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
>
> CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
> dbname=postgres user=ashu' PUBLICATION pg2pg_pub;
>
> SELECT * FROM pg2pg;
>
> Above select query produces no result. When this experiment is
> performed below the mentioned git commit, it works fine.
>
> After spending some time looking into this issue, I observed that
> above git commit has bumped the logical replication protocol version.
> Due to this, the logical replication apply worker process is unable to
> do WAL streaming which causes it to terminate. Therefore, the data
> inserted in the publication table is not applied on the subscription
> table (i.e. no data replication happens)

Seems like this commit, should have only set the
LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Mon, Sep 21, 2020 at 4:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Hi All,
> >
> > Today, while exploring logical replication in PostgreSQL, I noticed
> > that logical replication from PG version 13 and below to PG v14
> > (development version) is not working. It has stopped working from the
> > following git commit onwards:
> >
> > commit 464824323e57dc4b397e8b05854d779908b55304
> > Author: Amit Kapila <akapila@postgresql.org>
> > Date:   Thu Sep 3 07:54:07 2020 +0530
> >
> >     Add support for streaming to built-in logical replication.
> >
> > ...
> > ...
> >
> > Here is the experiment that I performed to verify this:
> >
> > Publisher (PG-v12/13):
> > ==================
> > CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
> >
> > INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;
> >
> > SELECT * FROM pg2pg;
> >
> > CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;
> >
> > Subscriber (PG-v14 HEAD or commit 46482432):
> > =====================================
> > CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
> >
> > CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
> > dbname=postgres user=ashu' PUBLICATION pg2pg_pub;
> >
> > SELECT * FROM pg2pg;
> >
> > Above select query produces no result. When this experiment is
> > performed below the mentioned git commit, it works fine.
> >
> > After spending some time looking into this issue, I observed that
> > above git commit has bumped the logical replication protocol version.
> > Due to this, the logical replication apply worker process is unable to
> > do WAL streaming which causes it to terminate. Therefore, the data
> > inserted in the publication table is not applied on the subscription
> > table (i.e. no data replication happens)
>
> Seems like this commit, should have only set the
> LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
> LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.

I think if we don't increase the LOGICALREP_PROTO_VERSION_NUM, then
streaming will not work in the latest version.  So what I feel is that
we can keep the LOGICALREP_PROTO_VERSION_NUM as 1 only add one more
parameter say LOGICALREP_PROTO_MAX_VERSION_NUM.  So now from the
latest subscriber if streaming is enabled then we can send
LOGICALREP_PROTO_STREAM_VERSION_NUM and LOGICALREP_PROTO_VERSION_NUM
otherwise.  And on publisher side we can check with the
max_protocol_version and min_protocol version.  I have attached a
patch for the changes I have explained.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

In the error message we are still referring to the native protocol
version number. Shouldn't it be replaced with the greatest protocol
version number we support now (i.e. LOGICALREP_PROTO_MAX_VERSION_NUM)?

-       if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+       if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("client sent proto_version=%d but we only
support protocol %d or lower",
    data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));

Other than this, I don't have any comments.

On Mon, Sep 21, 2020 at 5:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Sep 21, 2020 at 4:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > Hi All,
> > >
> > > Today, while exploring logical replication in PostgreSQL, I noticed
> > > that logical replication from PG version 13 and below to PG v14
> > > (development version) is not working. It has stopped working from the
> > > following git commit onwards:
> > >
> > > commit 464824323e57dc4b397e8b05854d779908b55304
> > > Author: Amit Kapila <akapila@postgresql.org>
> > > Date:   Thu Sep 3 07:54:07 2020 +0530
> > >
> > >     Add support for streaming to built-in logical replication.
> > >
> > > ...
> > > ...
> > >
> > > Here is the experiment that I performed to verify this:
> > >
> > > Publisher (PG-v12/13):
> > > ==================
> > > CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
> > >
> > > INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;
> > >
> > > SELECT * FROM pg2pg;
> > >
> > > CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;
> > >
> > > Subscriber (PG-v14 HEAD or commit 46482432):
> > > =====================================
> > > CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);
> > >
> > > CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
> > > dbname=postgres user=ashu' PUBLICATION pg2pg_pub;
> > >
> > > SELECT * FROM pg2pg;
> > >
> > > Above select query produces no result. When this experiment is
> > > performed below the mentioned git commit, it works fine.
> > >
> > > After spending some time looking into this issue, I observed that
> > > above git commit has bumped the logical replication protocol version.
> > > Due to this, the logical replication apply worker process is unable to
> > > do WAL streaming which causes it to terminate. Therefore, the data
> > > inserted in the publication table is not applied on the subscription
> > > table (i.e. no data replication happens)
> >
> > Seems like this commit, should have only set the
> > LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
> > LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.
>
> I think if we don't increase the LOGICALREP_PROTO_VERSION_NUM, then
> streaming will not work in the latest version.  So what I feel is that
> we can keep the LOGICALREP_PROTO_VERSION_NUM as 1 only add one more
> parameter say LOGICALREP_PROTO_MAX_VERSION_NUM.  So now from the
> latest subscriber if streaming is enabled then we can send
> LOGICALREP_PROTO_STREAM_VERSION_NUM and LOGICALREP_PROTO_VERSION_NUM
> otherwise.  And on publisher side we can check with the
> max_protocol_version and min_protocol version.  I have attached a
> patch for the changes I have explained.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
>
> In the error message we are still referring to the native protocol
> version number. Shouldn't it be replaced with the greatest protocol
> version number we support now (i.e. LOGICALREP_PROTO_MAX_VERSION_NUM)?
>
> -       if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
> +       if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
>             ereport(ERROR,
>                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                      errmsg("client sent proto_version=%d but we only
> support protocol %d or lower",
>     data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));
>
> Other than this, I don't have any comments.

Thanks for the review.  I have fixed it the attached patch.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
>

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

-- 
With Regards,
Amit Kapila.



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
> >
>
> Thanks Ashutosh and Dilip for working on this. I'll look into it in a
> day or two.
>

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM  LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
> > >
> >
> > Thanks Ashutosh and Dilip for working on this. I'll look into it in a
> > day or two.
> >
>
> Just a thought:
>
> Should we change the sequence of these #define in
> src/include/replication/logicalproto.h?
>
> #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> #define LOGICALREP_PROTO_VERSION_NUM 1
> #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
>
> I would have changed above to something like this:
>
> #define LOGICALREP_PROTO_VERSION_NUM 1
> #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
>
> #define LOGICALREP_PROTO_MIN_VERSION_NUM  LOGICALREP_PROTO_VERSION_NUM
> #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
>

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Another thing is can we also test by having a publisher of v14 and
subscriber of v13 or prior version, just reverse of what Ashutosh has
tested?

-- 
With Regards,
Amit Kapila.



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > >
> > > > Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
> > > >
> > >
> > > Thanks Ashutosh and Dilip for working on this. I'll look into it in a
> > > day or two.
> > >
> >
> > Just a thought:
> >
> > Should we change the sequence of these #define in
> > src/include/replication/logicalproto.h?
> >
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
> > I would have changed above to something like this:
> >
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> >
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM  LOGICALREP_PROTO_VERSION_NUM
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
>
> I am not sure if this suggestion makes it better than what is purposed
> by Dilip but I think we can declare them in define number order like
> below:
> #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> #define LOGICALREP_PROTO_VERSION_NUM 1
> #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
>

The only reason I proposed that was because for the *_MAX_VERSION_NUM
we are using the latest PROTOCOL version name in its definition so why
not to do the same for defining *_MIN_VERSION_NUM as well. Other than
that, I also wanted to correct the sequence so that they are defined
in the increasing order which you have already done here.

> Another thing is can we also test by having a publisher of v14 and
> subscriber of v13 or prior version, just reverse of what Ashutosh has
> tested?
>

I've tested LR from PGv12 (Publisher) to PGv14 (Subscriber) and it works fine.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
On Tue, Sep 22, 2020 at 12:22 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > > >
> > > > > Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
> > > > >
> > > >
> > > > Thanks Ashutosh and Dilip for working on this. I'll look into it in a
> > > > day or two.
> > > >
> > >
> > > Just a thought:
> > >
> > > Should we change the sequence of these #define in
> > > src/include/replication/logicalproto.h?
> > >
> > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> > >
> > > I would have changed above to something like this:
> > >
> > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > >
> > > #define LOGICALREP_PROTO_MIN_VERSION_NUM  LOGICALREP_PROTO_VERSION_NUM
> > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> > >
> >
> > I am not sure if this suggestion makes it better than what is purposed
> > by Dilip but I think we can declare them in define number order like
> > below:
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
>
> The only reason I proposed that was because for the *_MAX_VERSION_NUM
> we are using the latest PROTOCOL version name in its definition so why
> not to do the same for defining *_MIN_VERSION_NUM as well. Other than
> that, I also wanted to correct the sequence so that they are defined
> in the increasing order which you have already done here.
>
> > Another thing is can we also test by having a publisher of v14 and
> > subscriber of v13 or prior version, just reverse of what Ashutosh has
> > tested?
> >
>
> I've tested LR from PGv12 (Publisher) to PGv14 (Subscriber) and it works fine.
>

I meant LR from PGv14 (Publisher) to PGv12 (Subscriber) not the other way.

> --
> With Regards,
> Ashutosh Sharma
> EnterpriseDB:http://www.enterprisedb.com



On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > >
> > > > Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:
> > > >
> > >
> > > Thanks Ashutosh and Dilip for working on this. I'll look into it in a
> > > day or two.
> > >
> >
> > Just a thought:
> >
> > Should we change the sequence of these #define in
> > src/include/replication/logicalproto.h?
> >
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
> > I would have changed above to something like this:
> >
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> >
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM  LOGICALREP_PROTO_VERSION_NUM
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
>
> I am not sure if this suggestion makes it better than what is purposed
> by Dilip but I think we can declare them in define number order like
> below:
> #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> #define LOGICALREP_PROTO_VERSION_NUM 1
> #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

> Another thing is can we also test by having a publisher of v14 and
> subscriber of v13 or prior version, just reverse of what Ashutosh has
> tested?

I have also tested this and working fine.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > I am not sure if this suggestion makes it better than what is purposed
> > by Dilip but I think we can declare them in define number order like
> > below:
> > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > #define LOGICALREP_PROTO_VERSION_NUM 1
> > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
>
> Done this way.
>

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.
Because say if we decide to add something additional for decode of
prepared xacts in PG14 then this check needs to be extended for stream
and prepared where as server version number check won't need to be
changed.

-- 
With Regards,
Amit Kapila.



On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > I am not sure if this suggestion makes it better than what is purposed
> > > by Dilip but I think we can declare them in define number order like
> > > below:
> > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
> > Done this way.
> >
>
> - options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
> + options.proto.logical.proto_version = MySubscription->stream ?
> + LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
>
> Here, I think instead of using MySubscription->stream, we should use
> server/walrecv version number as we used at one place in tablesync.c.

I am not sure how can we do this?  If PG version number is 14 then we
will always sent the LOGICALREP_PROTO_STREAM_VERSION_NUM? then again
we will face the same error right?  I think it should be strictly
based on whether we have enabled the streaming or not.  Because
logical replication protocol is still the same, only if the streaming
is enabled we expect the streaming protocol otherwise not.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Thu, Sep 24, 2020 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > I am not sure if this suggestion makes it better than what is purposed
> > > > by Dilip but I think we can declare them in define number order like
> > > > below:
> > > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> > >
> > > Done this way.
> > >
> >
> > - options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
> > + options.proto.logical.proto_version = MySubscription->stream ?
> > + LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
> >
> > Here, I think instead of using MySubscription->stream, we should use
> > server/walrecv version number as we used at one place in tablesync.c.
>
> I am not sure how can we do this?
>

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.


-- 
With Regards,
Amit Kapila.



On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > I am not sure if this suggestion makes it better than what is purposed
> > > > > by Dilip but I think we can declare them in define number order like
> > > > > below:
> > > > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> > > >
> > > > Done this way.
> > > >
> > >
> > > - options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
> > > + options.proto.logical.proto_version = MySubscription->stream ?
> > > + LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
> > >
> > > Here, I think instead of using MySubscription->stream, we should use
> > > server/walrecv version number as we used at one place in tablesync.c.
> >
> > I am not sure how can we do this?
> >
>
> Have you checked what will function walrcv_server_version() will
> return? I was thinking that if we know that subscriber is connected to
> Publisher version < 14 then we can send the right value.

But, suppose we know the publisher version is < 14 and user has set
streaming on while  creating subscriber then still we will send the
right version?  I think tablesync we are forming a query so we are
forming as per the publisher's version.  But here we are sending which
protocol version we are expecting for the data transfer so I feel it
should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
streaming transfer.  Now let the publisher decide whether it supports
the protocol we have asked for or not and if it doesn't support then
let it give error.  Am I missing something here?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Have you checked what will function walrcv_server_version() will
> > return? I was thinking that if we know that subscriber is connected to
> > Publisher version < 14 then we can send the right value.
>
> But, suppose we know the publisher version is < 14 and user has set
> streaming on while  creating subscriber then still we will send the
> right version?
>

Yeah we can send the version depending on which server we are talking to?

>  I think tablesync we are forming a query so we are
> forming as per the publisher's version.  But here we are sending which
> protocol version we are expecting for the data transfer so I feel it
> should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
> transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
> streaming transfer.
>

I am not sure if this is the right strategy. See
libpqrcv_startstreaming, even if the user asked for streaming unless
the server supports it we don't send the streaming option to the user.
Another thing is this check will become more complicated if we need
another feature like decode prepared to send different version or even
if it is the same version, we might need additional checks. Why do you
want to send a streaming protocol version when we know the server
doesn't support it, lets behave similar to what we do in
libpqrcv_startstreaming.

-- 
With Regards,
Amit Kapila.



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
Hi Amit,

On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > I am not sure if this suggestion makes it better than what is purposed
> > > by Dilip but I think we can declare them in define number order like
> > > below:
> > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > #define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
> >
> > Done this way.
> >
>
> - options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
> + options.proto.logical.proto_version = MySubscription->stream ?
> + LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
>
> Here, I think instead of using MySubscription->stream, we should use
> server/walrecv version number as we used at one place in tablesync.c.

Should subscribers be setting the LR protocol value based on what is
the publisher version it is communicating with or should it be set
based on whether streaming was enabled or not while creating that
subscription? AFAIU if we set this value based on the publisher
version (which is lets say >= 14), then it's quite possible that the
subscriber will start streaming changes for the in-progress
transactions even if the streaming was disabled while creating the
subscription, won't it?

Please let me know if I am missing something here.

Thanks,

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



On Thu, Sep 24, 2020 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Have you checked what will function walrcv_server_version() will
> > > return? I was thinking that if we know that subscriber is connected to
> > > Publisher version < 14 then we can send the right value.
> >
> > But, suppose we know the publisher version is < 14 and user has set
> > streaming on while  creating subscriber then still we will send the
> > right version?
> >
>
> Yeah we can send the version depending on which server we are talking to?

Ok

> >  I think tablesync we are forming a query so we are
> > forming as per the publisher's version.  But here we are sending which
> > protocol version we are expecting for the data transfer so I feel it
> > should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
> > transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
> > streaming transfer.
> >
>
> I am not sure if this is the right strategy. See
> libpqrcv_startstreaming, even if the user asked for streaming unless
> the server supports it we don't send the streaming option to the user.
> Another thing is this check will become more complicated if we need
> another feature like decode prepared to send different version or even
> if it is the same version, we might need additional checks. Why do you
> want to send a streaming protocol version when we know the server
> doesn't support it, lets behave similar to what we do in
> libpqrcv_startstreaming.

Okay, so even if the streaming is enabled we are sending the streaming
on to the server versions which are >= 14 which make sense.  But I
still doubt that it is right thing to send the protocol version based
on the server version.   For example suppose in future PG20 we change
the streaming protocol version to 3,  that mean from PG14 to PG20 we
may not support the streaming transmission but we still be able to
support the normal transamission.  Now if streaming is off then
ideally we should send the LOGICALREP_PROTO_VERSION_NUM and that is
still supporetd by the publisher but if we send the
LOGICALREP_PROTO_STREAM_VERSION_NUM then it will error out because the
streaming protocol changed in the latest subscriber and the same is
not supported by the older publisher (14).  So now if we want non
streaming transmission from 14 to 20 and that should be allowed but if
based on the server version we always send the
LOGICALREP_PROTO_STREAM_VERSION_NUM then that will be a problem
because our streaming version has changed.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Fri, Sep 25, 2020 at 7:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Have you checked what will function walrcv_server_version() will
> > > > return? I was thinking that if we know that subscriber is connected to
> > > > Publisher version < 14 then we can send the right value.
> > >
> > > But, suppose we know the publisher version is < 14 and user has set
> > > streaming on while  creating subscriber then still we will send the
> > > right version?
> > >
> >
> > Yeah we can send the version depending on which server we are talking to?
>
> Ok
>
> > >  I think tablesync we are forming a query so we are
> > > forming as per the publisher's version.  But here we are sending which
> > > protocol version we are expecting for the data transfer so I feel it
> > > should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
> > > transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
> > > streaming transfer.
> > >
> >
> > I am not sure if this is the right strategy. See
> > libpqrcv_startstreaming, even if the user asked for streaming unless
> > the server supports it we don't send the streaming option to the user.
> > Another thing is this check will become more complicated if we need
> > another feature like decode prepared to send different version or even
> > if it is the same version, we might need additional checks. Why do you
> > want to send a streaming protocol version when we know the server
> > doesn't support it, lets behave similar to what we do in
> > libpqrcv_startstreaming.
>
> Okay, so even if the streaming is enabled we are sending the streaming
> on to the server versions which are >= 14 which make sense.  But I
> still doubt that it is right thing to send the protocol version based
> on the server version.   For example suppose in future PG20 we change
> the streaming protocol version to 3,  that mean from PG14 to PG20 we
> may not support the streaming transmission but we still be able to
> support the normal transamission.  Now if streaming is off then
> ideally we should send the LOGICALREP_PROTO_VERSION_NUM and that is
> still supporetd by the publisher but if we send the
> LOGICALREP_PROTO_STREAM_VERSION_NUM then it will error out because the
> streaming protocol changed in the latest subscriber and the same is
> not supported by the older publisher (14).
>

No that won't happen, we will send the version based on what the
server version supports (in this case it would be 2 when we are
talking to publisher 14). We can define a new macro or something like
that. I feel the protocol should depend on whether the server supports
it or not rather than based on some user specified option because it
will make our life easier if tomorrow we want to enable streaming by
default rather than based on option specified by the user. You can
consider it this way that how would you handle this if we wouldn't
have a user specified option (streaming) because that is generally the
way it should work.

-- 
With Regards,
Amit Kapila.



On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi Amit,
>
> > Here, I think instead of using MySubscription->stream, we should use
> > server/walrecv version number as we used at one place in tablesync.c.
>
> Should subscribers be setting the LR protocol value based on what is
> the publisher version it is communicating with or should it be set
> based on whether streaming was enabled or not while creating that
> subscription? AFAIU if we set this value based on the publisher
> version (which is lets say >= 14), then it's quite possible that the
> subscriber will start streaming changes for the in-progress
> transactions even if the streaming was disabled while creating the
> subscription, won't it?
>

No that won't happen because we send this option to the server
(publisher in this case) only when version is >=14 and user has
specified this option. See the below check in function
libpqrcv_startstreaming()
{
..
if (options->proto.logical.streaming &&
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfo(&cmd, ", streaming 'on'");
..
}

-- 
With Regards,
Amit Kapila.



Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

From
Ashutosh Sharma
Date:
On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Hi Amit,
> >
> > > Here, I think instead of using MySubscription->stream, we should use
> > > server/walrecv version number as we used at one place in tablesync.c.
> >
> > Should subscribers be setting the LR protocol value based on what is
> > the publisher version it is communicating with or should it be set
> > based on whether streaming was enabled or not while creating that
> > subscription? AFAIU if we set this value based on the publisher
> > version (which is lets say >= 14), then it's quite possible that the
> > subscriber will start streaming changes for the in-progress
> > transactions even if the streaming was disabled while creating the
> > subscription, won't it?
> >
>
> No that won't happen because we send this option to the server
> (publisher in this case) only when version is >=14 and user has
> specified this option. See the below check in function
> libpqrcv_startstreaming()
> {
> ..
> if (options->proto.logical.streaming &&
> PQserverVersion(conn->streamConn) >= 140000)
> appendStringInfo(&cmd, ", streaming 'on'");
> ..
> }
>

Ah, okay, understood, thanks. So, that means we can use the streaming
protocol version if the server (publisher) supports it, regardless of
whether the client (subscriber) has the streaming option enabled or
not.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Hi Amit,
> >
> > > Here, I think instead of using MySubscription->stream, we should use
> > > server/walrecv version number as we used at one place in tablesync.c.
> >
> > Should subscribers be setting the LR protocol value based on what is
> > the publisher version it is communicating with or should it be set
> > based on whether streaming was enabled or not while creating that
> > subscription? AFAIU if we set this value based on the publisher
> > version (which is lets say >= 14), then it's quite possible that the
> > subscriber will start streaming changes for the in-progress
> > transactions even if the streaming was disabled while creating the
> > subscription, won't it?
> >
>
> No that won't happen because we send this option to the server
> (publisher in this case) only when version is >=14 and user has
> specified this option. See the below check in function
> libpqrcv_startstreaming()
> {
> ..
> if (options->proto.logical.streaming &&
> PQserverVersion(conn->streamConn) >= 140000)
> appendStringInfo(&cmd, ", streaming 'on'");
> ..
> }

Ok, I have modified as per your suggestion.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Fri, Sep 25, 2020 at 1:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > No that won't happen because we send this option to the server
> > (publisher in this case) only when version is >=14 and user has
> > specified this option. See the below check in function
> > libpqrcv_startstreaming()
> > {
> > ..
> > if (options->proto.logical.streaming &&
> > PQserverVersion(conn->streamConn) >= 140000)
> > appendStringInfo(&cmd, ", streaming 'on'");
> > ..
> > }
>
> Ok, I have modified as per your suggestion.
>

Pushed, thanks!

-- 
With Regards,
Amit Kapila.