Thread: Allow logical replication to copy tables in binary format

Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hey hackers,

I see that logical replication subscriptions have an option to enable binary [1]. 
When it's enabled, subscription requests publisher to send data in binary format. 
But this is only the case for apply phase. In tablesync, tables are still copied as text.

To copy tables, COPY command is used and that command supports copying in binary. So it seemed to me possible to copy in binary for tablesync too.
I'm not sure if there is a reason to always copy tables in text format. But I couldn't see why not to do it in binary if it's enabled.

You can find the small patch that only enables binary copy attached.  

What do you think about this change? Does it make sense? Am I missing something?


Best,
Melih
Attachment

Re: Allow logical replication to copy tables in binary format

From
"Euler Taveira"
Date:
On Wed, Aug 10, 2022, at 12:03 PM, Melih Mutlu wrote:
I see that logical replication subscriptions have an option to enable binary [1]. 
When it's enabled, subscription requests publisher to send data in binary format. 
But this is only the case for apply phase. In tablesync, tables are still copied as text.
This option could have been included in the commit 9de77b54531; it wasn't.
Maybe it wasn't considered because the initial table synchronization can be a
separate step in your logical replication setup idk. I agree that the binary
option should be available for the initial table synchronization.

To copy tables, COPY command is used and that command supports copying in binary. So it seemed to me possible to copy in binary for tablesync too.
I'm not sure if there is a reason to always copy tables in text format. But I couldn't see why not to do it in binary if it's enabled.
The reason to use text format is that it is error prone. There are restrictions
while using the binary format. For example, if your schema has different data
types for a certain column, the copy will fail. Even with such restrictions, I
think it is worth adding it.

You can find the small patch that only enables binary copy attached.  
I have a few points about your implementation.

* Are we considering to support prior Postgres versions too? These releases
  support binary mode but it could be an unexpected behavior (initial sync in
  binary mode) for a publisher using 14 or 15 and a subscriber using 16. IMO
  you should only allow it for publisher on 16 or later.
* Docs should say that the binary option also applies to initial table
  synchronization and possibly emphasize some of the restrictions.
* Tests. Are the current tests enough? 014_binary.pl.


--
Euler Taveira

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Thu, Aug 11, 2022 at 7:34 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Wed, Aug 10, 2022, at 12:03 PM, Melih Mutlu wrote:
>
> I see that logical replication subscriptions have an option to enable binary [1].
> When it's enabled, subscription requests publisher to send data in binary format.
> But this is only the case for apply phase. In tablesync, tables are still copied as text.
>
> This option could have been included in the commit 9de77b54531; it wasn't.
> Maybe it wasn't considered because the initial table synchronization can be a
> separate step in your logical replication setup idk. I agree that the binary
> option should be available for the initial table synchronization.
>
> To copy tables, COPY command is used and that command supports copying in binary. So it seemed to me possible to copy
inbinary for tablesync too.
 
> I'm not sure if there is a reason to always copy tables in text format. But I couldn't see why not to do it in binary
ifit's enabled.
 
>
> The reason to use text format is that it is error prone. There are restrictions
> while using the binary format. For example, if your schema has different data
> types for a certain column, the copy will fail.
>

Won't such restrictions hold true even during replication?

-- 
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
"Euler Taveira"
Date:
On Thu, Aug 11, 2022, at 8:04 AM, Amit Kapila wrote:
On Thu, Aug 11, 2022 at 7:34 AM Euler Taveira <euler@eulerto.com> wrote:
>
> The reason to use text format is that it is error prone. There are restrictions
> while using the binary format. For example, if your schema has different data
> types for a certain column, the copy will fail.
>

Won't such restrictions hold true even during replication?
I expect that the COPY code matches the proto.c code. The point is that table
sync is decoupled from the logical replication. Hence, we should emphasize in
the documentation that the restrictions *also* apply to the initial table
synchronization.


--
Euler Taveira

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:

Euler Taveira <euler@eulerto.com>, 11 Ağu 2022 Per, 16:27 tarihinde şunu yazdı:
On Thu, Aug 11, 2022, at 8:04 AM, Amit Kapila wrote:
On Thu, Aug 11, 2022 at 7:34 AM Euler Taveira <euler@eulerto.com> wrote:
>
> The reason to use text format is that it is error prone. There are restrictions
> while using the binary format. For example, if your schema has different data
> types for a certain column, the copy will fail.
>

Won't such restrictions hold true even during replication?
I expect that the COPY code matches the proto.c code. The point is that table
sync is decoupled from the logical replication. Hence, we should emphasize in
the documentation that the restrictions *also* apply to the initial table
synchronization.

If such restrictions are already the case for replication phase after initial table sync, then it shouldn't prevent us from enabling binary option for table sync. Right?
But yes, it needs to be stated somewhere.

Euler Taveira <euler@eulerto.com>, 11 Ağu 2022 Per, 05:03 tarihinde şunu yazdı
I have a few points about your implementation.

* Are we considering to support prior Postgres versions too? These releases
  support binary mode but it could be an unexpected behavior (initial sync in
  binary mode) for a publisher using 14 or 15 and a subscriber using 16. IMO
  you should only allow it for publisher on 16 or later.

How is any issue that might occur due to version mismatch being handled right now in repliaction after table sync?
What I understand from the documentation is if replication can fail due to using different pg versions, it just fails. So binary option cannot be used. [1]
Do you think that this is more serious for table sync and we need to restrict binary option with different publisher and subscriber versions? But not for replication?

* Docs should say that the binary option also applies to initial table
  synchronization and possibly emphasize some of the restrictions.
* Tests. Are the current tests enough? 014_binary.pl.

You're right on both points. I just wanted to know your opinions on this first. Then the patch will need some tests and proper documentation.  


Best,
Melih
  

Re: Allow logical replication to copy tables in binary format

From
"Euler Taveira"
Date:
On Thu, Aug 11, 2022, at 10:46 AM, Melih Mutlu wrote:
If such restrictions are already the case for replication phase after initial table sync, then it shouldn't prevent us from enabling binary option for table sync. Right?
I didn't carefully examine the COPY code but I won't expect significant
differences (related to text vs binary mode) from the logical replication
protocol. After inspecting the git history, I took my argument back after
checking the commit 670c0a1d474. The initial commit 9de77b54531 imposes some
restrictions (user-defined arrays and composite types) as mentioned in the
commit message but it was removed in 670c0a1d474. My main concern is to break a
scenario that was previously working (14 -> 15) but after a subscriber upgrade
it won't (14 -> 16). I would say that you should test some scenarios:
014_binary.pl and also custom data types, same column with different data
types, etc.

How is any issue that might occur due to version mismatch being handled right now in repliaction after table sync?
What I understand from the documentation is if replication can fail due to using different pg versions, it just fails. So binary option cannot be used. [1]
Do you think that this is more serious for table sync and we need to restrict binary option with different publisher and subscriber versions? But not for replication?
It is a conservative argument. If we didn't allow a publisher to run COPY in
binary mode while using previous Postgres versions, we know that it works. (At
least there aren't bug reports for logical replication using binary option.)
Since one of the main use cases for logical replication is migration, I'm
concerned that it may not work (even if the binary option defaults to false,
someone can decide to use it for performance reasons).

I did a quick test and the failure while using binary mode is not clear. Since
you are modifying this code, you could probably provide additional patch(es) to
make it clear that there is an error (due to some documented restriction).


--
Euler Taveira

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:


Euler Taveira <euler@eulerto.com>, 11 Ağu 2022 Per, 20:16 tarihinde şunu yazdı:
My main concern is to break a scenario that was previously working (14 -> 15) but after a subscriber upgrade
it won't (14 -> 16).
Fair concern. Some cases that might break the logical replication with version upgrade would be:
1- Usage of different binary formats between publisher and subscriber. As stated here [1], binary format has been changed after v7.4.
But I don't think this would be a concern, since we wouldn't even have logical replication with 7.4 and earlier versions.
2- Lack (or mismatch) of binary send/receive functions for custom data types would cause failures. This case can already cause failures with current logical replication, regardless of binary copy. Stated here [2].
3- Copying in binary format would work with the same schemas. Currently, logical replication does not require the exact same schemas in publisher and subscriber. 
This is an additional restriction that comes with the COPY command.

If a logical replication has been set up with different schemas and subscription is created with the binary option, then yes this would break things. 
This restriction can be clearly stated and wouldn't be unexpected though.

I'm also okay with allowing binary copy only for v16 or later, if you think it would be safer and no one disagrees with that. 
What are your thoughts?

I would say that you should test some scenarios:
014_binary.pl and also custom data types, same column with different data
types, etc.
I added scenarios in two tests to test binary copy:
014_binary.pl: This was already testing subscriptions with binary option enabled. I added an extra step to insert initial data before creating the subscription.
So that we can test initial table sync with binary copy.

002_types.pl: This file was already testing more complex data types. I added an extra subscriber node to create a subscription with binary option enabled.
This way, it now tests binary copy with different custom types.

Do you think these would be enough in terms of testing?  
 
Attached patch also includes some additions to the doc along with the tests. 
 
Thanks,
Melih


Attachment

Re: Allow logical replication to copy tables in binary format

From
Andres Freund
Date:
Hi,

On 2022-08-10 18:03:56 +0300, Melih Mutlu wrote:
> To copy tables, COPY command is used and that command supports copying in
> binary. So it seemed to me possible to copy in binary for tablesync too.
> I'm not sure if there is a reason to always copy tables in text format.

It'd be good to collect some performance numbers justifying this. I'd expect
decent gains if there's e.g. a bytea or timestamptz column involved.

Greetings,

Andres Freund



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hello,

Andres Freund <andres@anarazel.de>, 2 Eyl 2022 Cum, 01:25 tarihinde şunu yazdı:
It'd be good to collect some performance numbers justifying this. I'd expect
decent gains if there's e.g. a bytea or timestamptz column involved.

Experimented the binary copy with a quick setup.

- Created a "temp" table with bytea and timestamptz columns
postgres=# \d temp
                        Table "public.temp"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 i      | integer                  |           |          |
 b      | bytea                    |           |          |
 t      | timestamp with time zone |           |          |
 
- Loaded with ~1GB data
postgres=# SELECT pg_size_pretty( pg_total_relation_size('temp') );
 pg_size_pretty
----------------
 1137 MB
(1 row)

- Created a publication with only this "temp" table. 
- Created a subscription with binary enabled on instances from master branch and this patch.
- Timed the tablesync process by calling the following procedure:
CREATE OR REPLACE PROCEDURE wait_for_rep() LANGUAGE plpgsql AS $$BEGIN WHILE (SELECT count(*) != 0 FROM pg_subscription_rel WHERE srsubstate <> 'r') LOOP COMMIT; END LOOP; END; $$;

Hera are averaged results of multiple consecutive runs from both master branch and the patch:

master (binary enabled but no binary copy): 20007.7948 ms
the patch (allows binary copy): 8874,869 ms 

Seems like a good improvement.
What are your thoughts on this patch?

Best,
Melih

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi hackers,

I just wanted to gently ping to hear what you all think about this patch.

Appreciate any feedback/thougths.

Thanks,
Melih 

RE: Allow logical replication to copy tables in binary format

From
"osumi.takamichi@fujitsu.com"
Date:
On Tuesday, August 16, 2022 2:04 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Attached patch also includes some additions to the doc along with the tests.

Hi, thank you for updating the patch. Minor review comments for the v2.


(1) whitespace issues

Please fix below whitespace errors.

$ git apply v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch
v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:39: trailing whitespace.
          binary format.(See <xref linkend="sql-copy"/>.)
v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:120: trailing whitespace.

v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:460: trailing whitespace.
);
warning: 3 lines add whitespace errors.


(2) Suggestion to update another general description about the subscription

Kindly have a look at doc/src/sgml/logical-replication.sgml.

"The data types of the columns do not need to match,
as long as the text representation of the data can be converted to the target type.
For example, you can replicate from a column of type integer to a column of type bigint."

With the patch, I think we have an impact about those descriptions
since enabling the binary option for a subscription and executing the
initial synchronization requires the same data types for binary format.

I suggest that we update those descriptions as well.


(3) shouldn't test that we fail expectedly with binary copy for different types ?

How about having a test that we correctly fail with different data types
between the publisher and the subscriber, for instance ?


(4) Error message of the binary format copy

I've gotten below message from data type contradiction (between integer and bigint).
Probably, this is unclear for the users to understand the direct cause 
and needs to be adjusted ?
This might be a similar comment Euler mentioned in [1].

2022-09-16 11:54:54.835 UTC [4570] ERROR:  insufficient data left in message
2022-09-16 11:54:54.835 UTC [4570] CONTEXT:  COPY tab, line 1, column id


(5) Minor adjustment of the test comment in 002_types.pl.

+is( $result, $sync_result, 'check initial sync on subscriber');
+is( $result_binary, $sync_result, 'check initial sync on subscriber in binary');

 # Insert initial test data

There are two same comments which say "Insert initial test data" in this file.
We need to update them, one for the initial table sync and
the other for the application of changes.

[1] - https://www.postgresql.org/message-id/f1d58324-8df4-4bb5-a546-8c741c2e6fa8%40www.fastmail.com



Best Regards,
    Takamichi Osumi


RE: Allow logical replication to copy tables in binary format

From
"osumi.takamichi@fujitsu.com"
Date:
Hi


Few more minor comments.

On Tuesday, August 16, 2022 2:04 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> 
> 
>     My main concern is to break a scenario that was previously working (14
> -> 15) but after a subscriber upgrade
>     it won't (14 -> 16).
> 
> Fair concern. Some cases that might break the logical replication with version
> upgrade would be:
...
> 3- Copying in binary format would work with the same schemas. Currently,
> logical replication does not require the exact same schemas in publisher and
> subscriber.
> This is an additional restriction that comes with the COPY command.
> 
> If a logical replication has been set up with different schemas and subscription
> is created with the binary option, then yes this would break things.
> This restriction can be clearly stated and wouldn't be unexpected though.
> 
> I'm also okay with allowing binary copy only for v16 or later, if you think it would
> be safer and no one disagrees with that.
> What are your thoughts?
I agree with the direction to support binary copy for v16 and later.

IIUC, the binary format replication with different data types fails even during apply phase on HEAD.
I thought that means, the upgrade concern only applies to a scenario that the user executes
only initial table synchronizations between the publisher and subscriber
and doesn't replicate any data at apply phase after that. I would say
this isn't a valid scenario and your proposal makes sense.


Best Regards,
    Takamichi Osumi


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi Takamichi,

Thanks for your reviews. 

I addressed your reviews, please find the attached patch. 

osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com>, 16 Eyl 2022 Cum, 16:51 tarihinde şunu yazdı:
(1) whitespace issues

Fixed 

(2) Suggestion to update another general description about the subscription

Kindly have a look at doc/src/sgml/logical-replication.sgml.

"The data types of the columns do not need to match,
as long as the text representation of the data can be converted to the target type.
For example, you can replicate from a column of type integer to a column of type bigint."

With the patch, I think we have an impact about those descriptions
since enabling the binary option for a subscription and executing the
initial synchronization requires the same data types for binary format.

I suggest that we update those descriptions as well.

You're right, this needs to be stated in the docs. Modified descriptions accordingly.
 
(3) shouldn't test that we fail expectedly with binary copy for different types ?

How about having a test that we correctly fail with different data types
between the publisher and the subscriber, for instance ?

Modified 002_types.pl test such that it now tests the replication between different data types. 
It's expected to fail if the binary is enabled, and succeed if not.
 
(4) Error message of the binary format copy

I've gotten below message from data type contradiction (between integer and bigint).
Probably, this is unclear for the users to understand the direct cause
and needs to be adjusted ?
This might be a similar comment Euler mentioned in [1].

2022-09-16 11:54:54.835 UTC [4570] ERROR:  insufficient data left in message
2022-09-16 11:54:54.835 UTC [4570] CONTEXT:  COPY tab, line 1, column id

It's already unclear for users to understand what's the issue if they're copying data between different column types via the COPY command.
This issue comes from COPY, and logical replication just utilizes COPY.
I don't think it would make sense to adjust an error message from a functionality which logical replication only uses and has no direct impact on.
It might be better to do this in a separate patch. What do you think?
 
(5) Minor adjustment of the test comment in 002_types.pl.

+is( $result, $sync_result, 'check initial sync on subscriber');
+is( $result_binary, $sync_result, 'check initial sync on subscriber in binary');

 # Insert initial test data

There are two same comments which say "Insert initial test data" in this file.
We need to update them, one for the initial table sync and
the other for the application of changes.

Fixed. 

I agree with the direction to support binary copy for v16 and later.

IIUC, the binary format replication with different data types fails even during apply phase on HEAD.
I thought that means, the upgrade concern only applies to a scenario that the user executes
only initial table synchronizations between the publisher and subscriber
and doesn't replicate any data at apply phase after that. I would say
this isn't a valid scenario and your proposal makes sense.
 
No, logical replication in binary does not fail on apply phase if data types are different.  
The concern with upgrade (if data types are not the same) would be not being able to create a new subscription with binary enabled or replicate new tables added into publication.
Replication of tables from existing subscriptions would not be affected by this change since they will already be in the apply phase, not tablesync.
Do you think this would still be an issue?


Thanks,
Melih
  

Attachment

RE: Allow logical replication to copy tables in binary format

From
"osumi.takamichi@fujitsu.com"
Date:
Hi,


On Monday, October 3, 2022 8:50 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>     (4) Error message of the binary format copy
>
>     I've gotten below message from data type contradiction (between
> integer and bigint).
>     Probably, this is unclear for the users to understand the direct cause
>     and needs to be adjusted ?
>     This might be a similar comment Euler mentioned in [1].
>
>     2022-09-16 11:54:54.835 UTC [4570] ERROR:  insufficient data left in
> message
>     2022-09-16 11:54:54.835 UTC [4570] CONTEXT:  COPY tab, line 1,
> column id
>
> It's already unclear for users to understand what's the issue if they're copying
> data between different column types via the COPY command.
> This issue comes from COPY, and logical replication just utilizes COPY.
> I don't think it would make sense to adjust an error message from a functionality
> which logical replication only uses and has no direct impact on.
> It might be better to do this in a separate patch. What do you think?
Yes, makes sense. It should be a separate patch.


>     I agree with the direction to support binary copy for v16 and later.
>
>     IIUC, the binary format replication with different data types fails even
> during apply phase on HEAD.
>     I thought that means, the upgrade concern only applies to a scenario
> that the user executes
>     only initial table synchronizations between the publisher and subscriber
>     and doesn't replicate any data at apply phase after that. I would say
>     this isn't a valid scenario and your proposal makes sense.
>
> No, logical replication in binary does not fail on apply phase if data types are
> different.
With HEAD, I observe in some case we fail at apply phase because of different data types like
integer vs. bigint as written scenario in [1]. In short, I think we can slightly
adjust your documentation and make it more general so that the description applies to
both table sync phase and apply phase.

I'll suggest a below change for your sentence of logical-replication.sgml.
FROM:
In binary case, it is not allowed to replicate data between different types due to restrictions inherited from COPY.
TO:
Binary format is type specific and does not allow to replicate data between different types according to its
restrictions.


If my idea above is correct, then I feel we can remove all the fixes for create_subscription.sgml.
I'm not sure if I should pursue this perspective of the document improvement
any further after this email, since this isn't essentially because of this patch.



> The concern with upgrade (if data types are not the same) would be not being
> able to create a new subscription with binary enabled or replicate new tables
> added into publication.
> Replication of tables from existing subscriptions would not be affected by this
> change since they will already be in the apply phase, not tablesync.
> Do you think this would still be an issue?
Okay, thanks for explaining this. I understand that
the upgrade concern applies to the table sync that is executed
between text format (before the patch) and binary format (after the patch).




[1] - binary format test that we fail for different types on apply phase on HEAD

<publisher>
create table tab (id integer);
insert into tab values (100);
create publication mypub for table tab;

<subscriber>
create table tab (id bigint);
create subscription mysub connection '...' publication mypub with (copy_data = false, binary = true, disable_on_error =
true);

-- wait for several seconds

<subscriber>
select srsubid, srrelid, srrelid::regclass, srsubstate, srsublsn from pg_subscription_rel; -- check the status as 'r'
forthe relation 
select * from tab; -- confirm we don't copy the initial data on the pub

<publisher>
insert into tab values (1), (2);

-- wait for several seconds

<subscriber>
select subname, subenabled from pg_subscription; -- shows 'f' for the 2nd column because of an error
select * from tab -- no records

This error doesn't happen when we adopt 'integer' on the subscriber aligned with the publisher
and we can see the two records on the subscriber.



Best Regards,
    Takamichi Osumi




Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hello,

osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com>, 12 Eki 2022 Çar, 04:36 tarihinde şunu yazdı:
>       I agree with the direction to support binary copy for v16 and later.
>
>       IIUC, the binary format replication with different data types fails even
> during apply phase on HEAD.
>       I thought that means, the upgrade concern only applies to a scenario
> that the user executes
>       only initial table synchronizations between the publisher and subscriber
>       and doesn't replicate any data at apply phase after that. I would say
>       this isn't a valid scenario and your proposal makes sense.
>
> No, logical replication in binary does not fail on apply phase if data types are
> different.
With HEAD, I observe in some case we fail at apply phase because of different data types like
integer vs. bigint as written scenario in [1]. In short, I think we can slightly
adjust your documentation and make it more general so that the description applies to
both table sync phase and apply phase.

Yes, you're right. I somehow had the impression that HEAD supports replication between different types in binary. 
But as can be shown in the scenario you mentioned, it does not work.

I'll suggest a below change for your sentence of logical-replication.sgml.
FROM:
In binary case, it is not allowed to replicate data between different types due to restrictions inherited from COPY.
TO:
Binary format is type specific and does not allow to replicate data between different types according to its
restrictions.

In this case, this change makes sense since this patch does actually not introduce this issue. It already exists in HEAD too. 
 
If my idea above is correct, then I feel we can remove all the fixes for create_subscription.sgml.
I'm not sure if I should pursue this perspective of the document improvement
any further after this email, since this isn't essentially because of this patch.
 
I'm only keeping the following change in create_subscription.sgml to indicate binary option copies in binary format now.
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
+          Specifies whether the subscription will copy the initial data to
+          synchronize relations in binary format and also request the publisher
+          to send the data in binary format too (as opposed to text).

 
> The concern with upgrade (if data types are not the same) would be not being
> able to create a new subscription with binary enabled or replicate new tables
> added into publication.
> Replication of tables from existing subscriptions would not be affected by this
> change since they will already be in the apply phase, not tablesync.
> Do you think this would still be an issue?
Okay, thanks for explaining this. I understand that
the upgrade concern applies to the table sync that is executed
between text format (before the patch) and binary format (after the patch).

I was thinking apply would work with different types in binary format.
Since apply also would not work, then the scenario that I tried to explain earlier is not a concern anymore. 


Attached patch with updated version of this patch.

Thanks,
Melih
 
Attachment

RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Mon, Nov 14, 2022 8:08 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Attached patch with updated version of this patch.

Thanks for your patch.

I tried to do a performance test for this patch, the result looks good to me.
(The steps are similar to what Melih shared [1].)

The time to synchronize about 1GB data in binary (the average of the middle
eight was taken):
HEAD: 16.854 s
Patched: 6.805 s

Besides, here are some comments.

1.
+# Binary enabled subscription should fail
+$node_subscriber_binary->wait_for_log("ERROR:  insufficient data left in message");

Should it be changed to "ERROR: ( [A-Z0-9]+:)? ", like other subscription tests.

2.
+# Binary disabled subscription should succeed
+$node_publisher->wait_for_catchup('tap_sub');

If we want to wait for table synchronization to finish, should we call
wait_for_subscription_sync()?

3.
I also think it might be better to support copy binary only for publishers of
v16 or later. Do you plan to implement it in the patch?
  
[1] https://www.postgresql.org/message-id/CAGPVpCQEKDVKQPf6OFQ-9WiRYB1YRejm--YJTuwgzuvj1LEJ2A%40mail.gmail.com

Regards,
Shi yu


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Thanks for your review.

shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 11 Oca 2023 Çar, 11:56 tarihinde şunu yazdı:
On Mon, Nov 14, 2022 8:08 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
1.
+# Binary enabled subscription should fail
+$node_subscriber_binary->wait_for_log("ERROR:  insufficient data left in message");

Should it be changed to "ERROR: ( [A-Z0-9]+:)? ", like other subscription tests.

Done.
 
2.
+# Binary disabled subscription should succeed
+$node_publisher->wait_for_catchup('tap_sub');

If we want to wait for table synchronization to finish, should we call
wait_for_subscription_sync()?

Done.
 
3.
I also think it might be better to support copy binary only for publishers of
v16 or later. Do you plan to implement it in the patch?

Done.

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
vignesh C
Date:
On Wed, 11 Jan 2023 at 16:14, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Thanks for your review.
>
> shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 11 Oca 2023 Çar, 11:56 tarihinde şunu yazdı:
>>
>> On Mon, Nov 14, 2022 8:08 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>> 1.
>> +# Binary enabled subscription should fail
>> +$node_subscriber_binary->wait_for_log("ERROR:  insufficient data left in message");
>>
>> Should it be changed to "ERROR: ( [A-Z0-9]+:)? ", like other subscription tests.
>
>
> Done.
>
>>
>> 2.
>> +# Binary disabled subscription should succeed
>> +$node_publisher->wait_for_catchup('tap_sub');
>>
>> If we want to wait for table synchronization to finish, should we call
>> wait_for_subscription_sync()?
>
>
> Done.
>
>>
>> 3.
>> I also think it might be better to support copy binary only for publishers of
>> v16 or later. Do you plan to implement it in the patch?
>
>
> Done.

For some reason CFBot is not able to apply the patch as in [1], Could
you have a look and post an updated patch if required:
=== Applying patches on top of PostgreSQL commit ID
c96de2ce1782116bd0489b1cd69ba88189a495e8 ===
=== applying patch
./v5-0001-Allow-logical-replication-to-copy-table-in-binary.patch
gpatch: **** Only garbage was found in the patch input.

[1] - http://cfbot.cputube.org/patch_41_3840.log

Regards,
Vignesh



RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
On Wednesday, January 11, 2023 7:45 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Thanks for your review.
> Done.
Hi, minor comments on v5.


(1) publisher's version check

+       /* If the publisher is v16 or later, copy in binary format.*/
+       server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
+       if (server_version >=160000 && MySubscription->binary)
+       {
+               appendStringInfoString(&cmd, "  WITH (FORMAT binary)");
+               options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
+       }
+
+       elog(LOG, "version: %i, %s", server_version, cmd.data);

(1-1) There is no need to log the version and the query by elog here.
(1-2) Also, I suggest we can remove the server_version variable itself,
      because we have only one actual reference for it.
      There is a style that we call walrcv_server_version in the
      'if condition' directly like existing codes in fetch_remote_table_info().
(1-3) Suggestion to improve comments.
      FROM:
      /* If the publisher is v16 or later, copy in binary format.*/
      TO:
      /* If the publisher is v16 or later, copy data in the required data format. */


(2) Minor suggestion for some test code alignment.

 $result =
   $node_subscriber->safe_psql('postgres',
        "SELECT sum(a) FROM tst_dom_constr");
-is($result, '21', 'sql-function constraint on domain');
+is($result, '33', 'sql-function constraint on domain');
+
+$result_binary =
+  $node_subscriber->safe_psql('postgres',
+       "SELECT sum(a) FROM tst_dom_constr");
+is($result_binary, '33', 'sql-function constraint on domain');


I think if we change the order of this part of check like below, then
it would look more aligned with other existing test codes introduced by this patch.

---
my $domain_check = 'SELECT sum(a) FROM tst_dom_constr';
$result = $node_subscriber->safe_psql('postgres', $domain_check);
$result_binary = $node_subscriber->safe_psql('postgres', $domain_check);
is($result, '33', 'sql-function constraint on domain');
is($result_binary, '33', 'sql-function constraint on domain in binary');
---



Best Regards,
    Takamichi Osumi




Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Thanks for your reviews.

Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com>, 12 Oca 2023 Per, 06:07 tarihinde şunu yazdı:
On Wednesday, January 11, 2023 7:45 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
(1-1) There is no need to log the version and the query by elog here.
(1-2) Also, I suggest we can remove the server_version variable itself,
      because we have only one actual reference for it.
      There is a style that we call walrcv_server_version in the
      'if condition' directly like existing codes in fetch_remote_table_info().
(1-3) Suggestion to improve comments.
      FROM:
      /* If the publisher is v16 or later, copy in binary format.*/
      TO:
      /* If the publisher is v16 or later, copy data in the required data format. */

Forgot to remove that LOG line. Removed it now and applied other suggestions too. 
 
I think if we change the order of this part of check like below, then
it would look more aligned with other existing test codes introduced by this patch.

Right. Changed it to make it more aligned with the rest.

Thanks, 
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Thu, Jan 12, 2023 at 1:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Thanks for your reviews.

Thanks. I have some comments:

1. The performance numbers posted upthread [1] look impressive for the
use-case tried, that's a 2.25X improvement or 55.6% reduction in
execution times. However, it'll be great to run a few more varied
tests to confirm the benefit.

2. It'll be great to capture the perf report to see if the time spent
in copy_table() is reduced with the patch.

3. I think blending initial table sync's binary copy option with
data-type level binary send/receive is not good. Moreover, data-type
level binary send/receive has its own restrictions per 9de77b5453.
IMO, it'll be good to have a new option, say copy_data_format synonyms
with COPY command's FORMAT option but with only text (default) and
binary values.

4. Why to add tests to existing 002_types.pl? Can we add a new file
with all the data types covered?

5. It's not clear to me as to why these rows were removed in the patch?
-        (1, '{1, 2, 3}'),
-        (2, '{2, 3, 1}'),
         (3, '{3, 2, 1}'),
         (4, '{4, 3, 2}'),
         (5, '{5, NULL, 3}');

     -- test_tbl_arrays
     INSERT INTO tst_arrays (a, b, c, d) VALUES
-        ('{1, 2, 3}', '{"a", "b", "c"}', '{1.1, 2.2, 3.3}', '{"1
day", "2 days", "3 days"}'),
-        ('{2, 3, 1}', '{"b", "c", "a"}', '{2.2, 3.3, 1.1}', '{"2
minutes", "3 minutes", "1 minute"}'),

6. BTW, the subbinary description is missing in pg_subscription docs
https://www.postgresql.org/docs/devel/catalog-pg-subscription.html?
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
+          Specifies whether the subscription will copy the initial data to
+          synchronize relations in binary format and also request the publisher
+          to send the data in binary format too (as opposed to text).

7. A nitpick - space is needed after >= before 160000.
+    if (walrcv_server_version(LogRepWorkerWalRcvConn) >=160000 &&

8. Note that the COPY binary format isn't portable across platforms
(Windows to Linux for instance) or major versions
https://www.postgresql.org/docs/devel/sql-copy.html, whereas, logical
replication is https://www.postgresql.org/docs/devel/logical-replication.html.
I don't see any handling of such cases in copy_table but only a check
for the publisher version. I think we need to account for all the
cases - allow binary COPY only when publisher and subscriber are of
same versions, architectures, platforms. The trick here is how we
identify if the subscriber is of the same type and taste
(architectures and platforms) as the publisher. Looking for
PG_VERSION_STR of publisher and subscriber might be naive, but I'm not
sure if there's a better way to do it.

Also, the COPY binary format is very data type specific - per the docs
"for example it will not work to output binary data from a smallint
column and read it into an integer column, even though that would work
fine in text format.". I did a small experiment [2], the logical
replication works with compatible data types (int -> smallint, even
int -> text), whereas the COPY binary format doesn't.

I think it'll complicate things a bit to account for the above cases
and allow COPY with binary format for logical replication.

[1] https://www.postgresql.org/message-id/CAGPVpCQEKDVKQPf6OFQ-9WiRYB1YRejm--YJTuwgzuvj1LEJ2A%40mail.gmail.com
[2]
DROP TABLE foo;
DROP PUBLICATION mypub;
CREATE TABLE foo(c1 bigint, c2 int, c3 smallint);
INSERT INTO foo SELECT i , i+1, i+2 FROM generate_series(1, 5) i;
CREATE PUBLICATION mypub FOR TABLE foo;
SELECT COUNT(*) FROM foo;
SELECT * FROM foo;

DROP SUBSCRIPTION mysub;
DROP TABLE foo;
CREATE TABLE foo(c1 smallint, c2 smallint, c3 smallint); -- works
without any problem
-- OR
CREATE TABLE foo(c1 smallint, c2 text, c3 smallint); -- works without
any problem
CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 dbname=postgres
user=ubuntu' PUBLICATION mypub;
SELECT COUNT(*) FROM foo;
SELECT * FROM foo;

drop table foo;
create table foo(c1 bigint, c2 int, c3 smallint);
insert into foo select i, i+1, i+2 from generate_series(1, 10) i;
copy foo(c1, c2, c3) to '/home/ubuntu/postgres/inst/bin/data/foo.text'
with (format 'text');
copy foo(c1, c2, c3) to
'/home/ubuntu/postgres/inst/bin/data/foo.binary' with (format
'binary');
drop table bar;
create table bar(c1 smallint, c2 smallint, c3 smallint);
-- or
create table bar(c1 smallint, c2 text, c3 smallint);
copy bar(c1, c2, c3) from
'/home/ubuntu/postgres/inst/bin/data/foo.text' with (format 'text');
copy bar(c1, c2, c3) from
'/home/ubuntu/postgres/inst/bin/data/foo.binary' with (format
'binary'); -- produces "ERROR:  incorrect binary data format"

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi Bharath,

Thanks for reviewing. 

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 18 Oca 2023 Çar, 10:17 tarihinde şunu yazdı:
On Thu, Jan 12, 2023 at 1:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
1. The performance numbers posted upthread [1] look impressive for the
use-case tried, that's a 2.25X improvement or 55.6% reduction in
execution times. However, it'll be great to run a few more varied
tests to confirm the benefit.

Sure, do you have any specific test case or suggestion in mind? 
 
2. It'll be great to capture the perf report to see if the time spent
in copy_table() is reduced with the patch.

Will provide that in another email soon.
 
3. I think blending initial table sync's binary copy option with
data-type level binary send/receive is not good. Moreover, data-type
level binary send/receive has its own restrictions per 9de77b5453.
IMO, it'll be good to have a new option, say copy_data_format synonyms
with COPY command's FORMAT option but with only text (default) and
binary values.

Added a "copy_format" option for subscriptions with text as default value. So it would be possible to create a binary subscription but copy tables in text format to avoid restrictions that you're concerned about.
 
4. Why to add tests to existing 002_types.pl? Can we add a new file
with all the data types covered?

Since 002_types.pl is where the replication of data types are covered. I thought it would be okay to test replication with the binary option in that file.
Sure, we can add a new file with different data types for testing subscriptions with binary option. But then I feel like it would have too many duplicated lines with 002_types.pl.
If you think that 002_types.pl lacks some data types needs to be tested, then we should add those into 002_types.pl too whether we test subscription with binary option in that file or some other place, right?  
 
5. It's not clear to me as to why these rows were removed in the patch?
-        (1, '{1, 2, 3}'),
-        (2, '{2, 3, 1}'),
         (3, '{3, 2, 1}'),
         (4, '{4, 3, 2}'),
         (5, '{5, NULL, 3}');

     -- test_tbl_arrays
     INSERT INTO tst_arrays (a, b, c, d) VALUES
-        ('{1, 2, 3}', '{"a", "b", "c"}', '{1.1, 2.2, 3.3}', '{"1
day", "2 days", "3 days"}'),
-        ('{2, 3, 1}', '{"b", "c", "a"}', '{2.2, 3.3, 1.1}', '{"2
minutes", "3 minutes", "1 minute"}'),

Previously, it wasn't actually testing the initial table sync since all tables were empty when subscription was created.
I just simply split the data initially inserted to test initial table sync.

With this patch, it inserts the first two rows for all data types before subscriptions get created. 
You can see these lines:
+# Insert initial test data
+$node_publisher->safe_psql(
+       'postgres', qq(
+       -- test_tbl_one_array_col
+       INSERT INTO tst_one_array (a, b) VALUES
+               (1, '{1, 2, 3}'),
+               (2, '{2, 3, 1}');
+
+       -- test_tbl_arrays
+       INSERT INTO tst_arrays (a, b, c, d) VALUES

 
6. BTW, the subbinary description is missing in pg_subscription docs
https://www.postgresql.org/docs/devel/catalog-pg-subscription.html?
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
+          Specifies whether the subscription will copy the initial data to
+          synchronize relations in binary format and also request the publisher
+          to send the data in binary format too (as opposed to text).

Done.
 
7. A nitpick - space is needed after >= before 160000.
+    if (walrcv_server_version(LogRepWorkerWalRcvConn) >=160000 &&

Done.
 
8. Note that the COPY binary format isn't portable across platforms
(Windows to Linux for instance) or major versions
https://www.postgresql.org/docs/devel/sql-copy.html, whereas, logical
replication is https://www.postgresql.org/docs/devel/logical-replication.html.
I don't see any handling of such cases in copy_table but only a check
for the publisher version. I think we need to account for all the
cases - allow binary COPY only when publisher and subscriber are of
same versions, architectures, platforms. The trick here is how we
identify if the subscriber is of the same type and taste
(architectures and platforms) as the publisher. Looking for
PG_VERSION_STR of publisher and subscriber might be naive, but I'm not
sure if there's a better way to do it.

I think having the "copy_format" option with text as default, like I replied to your 3rd review above, will keep things as they are now.
The patch now will only allow users to choose binary copy as well, if they want it and acknowledge the limitations that come with binary copy.
COPY command's portability issues shouldn't be an issue right now, since the patch still supports text format. Right?
 
Also, the COPY binary format is very data type specific - per the docs
"for example it will not work to output binary data from a smallint
column and read it into an integer column, even though that would work
fine in text format.". I did a small experiment [2], the logical
replication works with compatible data types (int -> smallint, even
int -> text), whereas the COPY binary format doesn't.

Logical replication between different types like int and smallint is already not working properly on HEAD too.
Yes, the scenario you shared looks like working. But you didn't create the subscription with binary=true. The patch did not change subscription with binary=false case. I believe what you should experiment is binary=true case which already fails in the apply phase on HEAD.

Well, with this patch, it will begin to fail in the table copy phase. But I don't think this is a problem because logical replication in binary format is already broken for replications between different data types.

Please see [1] and you'll get the following error in your case:
"ERROR:  incorrect binary data format in logical replication column 1"
    
[1]
Publisher:
DROP TABLE foo;
DROP PUBLICATION mypub;
CREATE TABLE foo(c1 bigint, c2 int, c3 smallint);
INSERT INTO foo SELECT i , i+1, i+2 FROM generate_series(1, 5) i;
CREATE PUBLICATION mypub FOR TABLE foo;
SELECT COUNT(*) FROM foo;
SELECT * FROM foo;

Subscriber:
DROP SUBSCRIPTION mysub;
DROP TABLE foo;
CREATE TABLE foo(c1 smallint, c2 smallint, c3 smallint);
CREATE SUBSCRIPTION mysub CONNECTION 'port=5432 dbname=postgres
user=ubuntu' PUBLICATION mypub WITH(binary); -- table sync will be successful since they're copied in text even though set binary=true
SELECT COUNT(*) FROM foo;
SELECT * FROM foo;

Back to publisher:
INSERT INTO foo SELECT i , i+1, i+2 FROM generate_series(1, 5) i; -- insert more rows to see whether the apply also works
SELECT COUNT(*) FROM foo; -- you'll see that new rows does not get replicated

In subscriber logs:
LOG:  logical replication apply worker for subscription "mysub" has started
ERROR:  incorrect binary data format in logical replication column 1
CONTEXT:  processing remote data for replication origin "pg_16395" during message type "INSERT" for replication target relation "public.foo" column "c1" in transaction 747, finished at 0/157F3E0
LOG:  background worker "logical replication worker" (PID 16903) exited with exit code 1

Best,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>

Thanks for providing an updated patch.

>> On Thu, Jan 12, 2023 at 1:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>> 1. The performance numbers posted upthread [1] look impressive for the
>> use-case tried, that's a 2.25X improvement or 55.6% reduction in
>> execution times. However, it'll be great to run a few more varied
>> tests to confirm the benefit.
>
> Sure, do you have any specific test case or suggestion in mind?

Have a huge amount of publisher's table (with mix of columns like int,
text, double, bytea and so on) prior data so that the subscriber's
table sync workers have to do a "good" amount of work to copy, then
measure the copy_table time with and without patch.

>> 2. It'll be great to capture the perf report to see if the time spent
>> in copy_table() is reduced with the patch.
>
> Will provide that in another email soon.

Thanks.

>> 4. Why to add tests to existing 002_types.pl? Can we add a new file
>> with all the data types covered?
>
> Since 002_types.pl is where the replication of data types are covered. I thought it would be okay to test replication
withthe binary option in that file.
 
> Sure, we can add a new file with different data types for testing subscriptions with binary option. But then I feel
likeit would have too many duplicated lines with 002_types.pl.
 
> If you think that 002_types.pl lacks some data types needs to be tested, then we should add those into 002_types.pl
toowhether we test subscription with binary option in that file or some other place, right?
 
>
> Previously, it wasn't actually testing the initial table sync since all tables were empty when subscription was
created.
> I just simply split the data initially inserted to test initial table sync.
>
> With this patch, it inserts the first two rows for all data types before subscriptions get created.
> You can see these lines:

It'd be better and clearer to have a separate TAP test file IMO since
the focus of the feature here isn't to just test for data types. With
separate tests, you can verify "ERROR:  incorrect binary data format
in logical replication column 1" cases.

>> 8. Note that the COPY binary format isn't portable across platforms
>> (Windows to Linux for instance) or major versions
>> https://www.postgresql.org/docs/devel/sql-copy.html, whereas, logical
>> replication is https://www.postgresql.org/docs/devel/logical-replication.html.
>> I don't see any handling of such cases in copy_table but only a check
>> for the publisher version. I think we need to account for all the
>> cases - allow binary COPY only when publisher and subscriber are of
>> same versions, architectures, platforms. The trick here is how we
>> identify if the subscriber is of the same type and taste
>> (architectures and platforms) as the publisher. Looking for
>> PG_VERSION_STR of publisher and subscriber might be naive, but I'm not
>> sure if there's a better way to do it.
>
> I think having the "copy_format" option with text as default, like I replied to your 3rd review above, will keep
thingsas they are now.
 
> The patch now will only allow users to choose binary copy as well, if they want it and acknowledge the limitations
thatcome with binary copy.
 
> COPY command's portability issues shouldn't be an issue right now, since the patch still supports text format.
Right?

With the above said, do you think checking for publisher versions is
needed? The user can decide to enable/disable binary COPY based on the
publisher's version no?
+    /* If the publisher is v16 or later, specify the format to copy data. */
+    if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000)
+    {

Few more comments on v7:
1.
+          Specifies the format in which pre-existing data on the publisher will
+          copied to the subscriber. Supported formats are
+          <literal>text</literal> and <literal>binary</literal>. The default is
+          <literal>text</literal>.
It'll be good to call out the cases in the documentation as to where
copy_format can be enabled and needs to be disabled.

2.
+             errmsg("%s value should be either \"text\" or \"binary\"",
How about "value must be either ...."?

3.
+    if (!opts->binary &&
+        opts->copy_format == LOGICALREP_COPY_AS_BINARY)
+    {
+        ereport(ERROR,
+                        (errcode(ERRCODE_SYNTAX_ERROR),
+                         errmsg("%s and %s are mutually exclusive options",
+                                "binary = false", "copy_format = binary")));

+    "CREATE SUBSCRIPTION tap_sub_binary CONNECTION
'$publisher_connstr' PUBLICATION tap_pub WITH (slot_name =
tap_sub_binary_slot, binary = true, copy_format = 'binary')"
Why should the subscription's binary option and copy_format option be
tied at all? Tying these two options hurts usability. Is there a
fundamental reason? I think they both are/must be independent. One
deals with data types and another deals with how initial table data is
copied.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
On Monday, January 30, 2023 7:50 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Thanks for reviewing.
...
> Please see [1] and you'll get the following error in your case:
> "ERROR:  incorrect binary data format in logical replication column 1"
Hi, thanks for sharing v7.

(1) general comment

I wondered if the addition of the new option/parameter can introduce some confusion to the users.

case 1. When binary = true and copy_format = text, the table sync is conducted by text.
case 2. When binary = false and copy_format = binary, the table sync is conducted by binary.
(Note that the case 2 can be accepted after addressing the 3rd comment of Bharath-san in [1].
I agree with the 3rd comment by itself.)

The name of the new subscription parameter looks confusing.
How about "init_sync_format" or something ?

(2) The commit message doesn't get updated.

The commit message needs to mention the new subscription option.

(3) whitespace errors.

$ git am v7-0001-Allow-logical-replication-to-copy-table-in-binary.patch
Applying: Allow logical replication to copy table in binary
.git/rebase-apply/patch:95: trailing whitespace.
          copied to the subscriber. Supported formats are
.git/rebase-apply/patch:101: trailing whitespace.
          that data will not be copied if <literal>copy_data = false</literal>.
warning: 2 lines add whitespace errors.

(4) pg_dump.c

        if (fout->remoteVersion >= 160000)
-               appendPQExpBufferStr(query, " s.suborigin\n");
+               appendPQExpBufferStr(query, " s.suborigin,\n");
        else
-               appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+               appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+
+       if (fout->remoteVersion >= 160000)
+               appendPQExpBufferStr(query, " s.subcopyformat\n");
+       else
+               appendPQExpBuffer(query, " '%c' AS subcopyformat\n", LOGICALREP_COPY_AS_TEXT);

This new branch for v16 can be made together with the previous same condition.

(5) describe.c

+
+               /* Copy format is only supported in v16 and higher */
+               if (pset.sversion >= 160000)
+                       appendPQExpBuffer(&buf,
+                                                         ", subcopyformat AS \"%s\"\n",
+                                                         gettext_noop("Copy Format"));


Similarly to (4), this creates a new branch for v16. Please see the above codes of this part.

(6)

+ * Extract the copy format value from a DefElem.
+ */
+char
+defGetCopyFormat(DefElem *def)

Shouldn't this function be static and remove the change of subscriptioncmds.h ?

(7) catalogs.sgml

The subcopyformat should be mentioned here and the current description for subbinary
should be removed.

(8) create_subscription.sgml

+          <literal>text</literal>.
+
+          <literal>binary</literal> format can be selected only if

Unnecessary blank line.

[1] - https://www.postgresql.org/message-id/CALj2ACW5Oa7_v25iZb326UXvtM_tjQfw0Tc3hPJ8zN4FZqc9cw%40mail.gmail.com


Best Regards,
    Takamichi Osumi




RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
Hi, Melih


On Monday, January 30, 2023 7:50 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Thanks for reviewing.
...
> Well, with this patch, it will begin to fail in the table copy phase...
The latest patch doesn't get updated for more than two weeks
after some review comments. If you don't have time,
I would like to help updating the patch for you and other reviewers.

Kindly let me know your status.


Best Regards,
    Takamichi Osumi




Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Please see the attached patch for following changes. 

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 30 Oca 2023 Pzt, 15:34 tarihinde şunu yazdı:
On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote: 
It'd be better and clearer to have a separate TAP test file IMO since
the focus of the feature here isn't to just test for data types. With
separate tests, you can verify "ERROR:  incorrect binary data format
in logical replication column 1" cases.

Moved some tests from 002_types.pl to 014_binary.pl since this is where most binary features are tested. It covers now "incorrect data format" cases too.
Also added some regression tests for copy_format parameter.
 
With the above said, do you think checking for publisher versions is
needed? The user can decide to enable/disable binary COPY based on the
publisher's version no?
+    /* If the publisher is v16 or later, specify the format to copy data. */
+    if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000)
+    {

If the user decides to enable it, then it might be nice to not allow it for previous versions. 
But I'm not sure. I'm okay to remove it if you all agree.
 
Few more comments on v7:
1.
+          Specifies the format in which pre-existing data on the publisher will
+          copied to the subscriber. Supported formats are
+          <literal>text</literal> and <literal>binary</literal>. The default is
+          <literal>text</literal>.
It'll be good to call out the cases in the documentation as to where
copy_format can be enabled and needs to be disabled.

Modified that description a bit. Can you check if that's okay now?
 
2.
+             errmsg("%s value should be either \"text\" or \"binary\"",
How about "value must be either ...."?
 
Done
 
3.
Why should the subscription's binary option and copy_format option be
tied at all? Tying these two options hurts usability. Is there a
fundamental reason? I think they both are/must be independent. One
deals with data types and another deals with how initial table data is
copied.

My initial purpose with this patch was just making subscriptions with binary option enabled fully binary from initial copy to apply. Things have changed a bit when we decided to move binary copy behind a parameter.
I didn't actually think there would be any use case where a user wants the initial copy to be in binary format for a sub with binary = false. Do you think it would be useful to copy in binary even for a sub with binary disabled?

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Thanks for reviewing. Please see the v8 here [1]

Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com>, 1 Şub 2023 Çar, 06:05 tarihinde şunu yazdı:
(1) general comment

I wondered if the addition of the new option/parameter can introduce some confusion to the users.

case 1. When binary = true and copy_format = text, the table sync is conducted by text.
case 2. When binary = false and copy_format = binary, the table sync is conducted by binary.
(Note that the case 2 can be accepted after addressing the 3rd comment of Bharath-san in [1].
I agree with the 3rd comment by itself.)

I replied to Bharath's comment [1], can you please check to see if that makes sense?
 
The name of the new subscription parameter looks confusing.
How about "init_sync_format" or something ?

The option to enable initial sync is named "copy_data", so I named the new parameter as "copy_format" to refer to that copy meant by "copy_data". Maybe "copy_data_format" would be better. I can change it if you think it's better.
 
(2) The commit message doesn't get updated.
 
Done 
 
(3) whitespace errors.

Done
 
(4) pg_dump.c

Done
 
(5) describe.c

Done
 
(6)

+ * Extract the copy format value from a DefElem.
+ */
+char
+defGetCopyFormat(DefElem *def)

Shouldn't this function be static and remove the change of subscriptioncmds.h ?

I wanted to make "defGetCopyFormat" be consistent with "defGetStreamingMode" since they're basically doing the same work for different parameters. And that function isn't static, so I didn't make "defGetCopyFormat" static too.
 
(7) catalogs.sgml
 
Done 

(8) create_subscription.sgml

Done
 
Also;

The latest patch doesn't get updated for more than two weeks
after some review comments. If you don't have time,
I would like to help updating the patch for you and other reviewers.

Kindly let me know your status.

 Sorry for the delay. This patch is currently one of my priorities. Hopefully I will share quicker updates from now on.


Thanks,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi Bharath,
>
> Thanks for reviewing.
>
> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 18 Oca 2023 Çar, 10:17 tarihinde şunu yazdı:
>>
>> On Thu, Jan 12, 2023 at 1:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>> 1. The performance numbers posted upthread [1] look impressive for the
>> use-case tried, that's a 2.25X improvement or 55.6% reduction in
>> execution times. However, it'll be great to run a few more varied
>> tests to confirm the benefit.
>
>
> Sure, do you have any specific test case or suggestion in mind?
>
>>
>> 2. It'll be great to capture the perf report to see if the time spent
>> in copy_table() is reduced with the patch.
>
>
> Will provide that in another email soon.
>
>>
>> 3. I think blending initial table sync's binary copy option with
>> data-type level binary send/receive is not good. Moreover, data-type
>> level binary send/receive has its own restrictions per 9de77b5453.
>> IMO, it'll be good to have a new option, say copy_data_format synonyms
>> with COPY command's FORMAT option but with only text (default) and
>> binary values.
>
>
> Added a "copy_format" option for subscriptions with text as default value. So it would be possible to create a binary
subscriptionbut copy tables in text format to avoid restrictions that you're concerned about. 
>
>>
>> 4. Why to add tests to existing 002_types.pl? Can we add a new file
>> with all the data types covered?
>
>
> Since 002_types.pl is where the replication of data types are covered. I thought it would be okay to test replication
withthe binary option in that file. 
> Sure, we can add a new file with different data types for testing subscriptions with binary option. But then I feel
likeit would have too many duplicated lines with 002_types.pl. 
> If you think that 002_types.pl lacks some data types needs to be tested, then we should add those into 002_types.pl
toowhether we test subscription with binary option in that file or some other place, right? 
>
>>
>> 5. It's not clear to me as to why these rows were removed in the patch?
>> -        (1, '{1, 2, 3}'),
>> -        (2, '{2, 3, 1}'),
>>          (3, '{3, 2, 1}'),
>>          (4, '{4, 3, 2}'),
>>          (5, '{5, NULL, 3}');
>>
>>      -- test_tbl_arrays
>>      INSERT INTO tst_arrays (a, b, c, d) VALUES
>> -        ('{1, 2, 3}', '{"a", "b", "c"}', '{1.1, 2.2, 3.3}', '{"1
>> day", "2 days", "3 days"}'),
>> -        ('{2, 3, 1}', '{"b", "c", "a"}', '{2.2, 3.3, 1.1}', '{"2
>> minutes", "3 minutes", "1 minute"}'),
>
>
> Previously, it wasn't actually testing the initial table sync since all tables were empty when subscription was
created.
> I just simply split the data initially inserted to test initial table sync.
>
> With this patch, it inserts the first two rows for all data types before subscriptions get created.
> You can see these lines:
>>
>> +# Insert initial test data
>> +$node_publisher->safe_psql(
>> +       'postgres', qq(
>> +       -- test_tbl_one_array_col
>> +       INSERT INTO tst_one_array (a, b) VALUES
>> +               (1, '{1, 2, 3}'),
>> +               (2, '{2, 3, 1}');
>> +
>> +       -- test_tbl_arrays
>> +       INSERT INTO tst_arrays (a, b, c, d) VALUES
>
>
>
>>
>> 6. BTW, the subbinary description is missing in pg_subscription docs
>> https://www.postgresql.org/docs/devel/catalog-pg-subscription.html?
>> -          Specifies whether the subscription will request the publisher to
>> -          send the data in binary format (as opposed to text).
>> +          Specifies whether the subscription will copy the initial data to
>> +          synchronize relations in binary format and also request the publisher
>> +          to send the data in binary format too (as opposed to text).
>
>
> Done.
>
>>
>> 7. A nitpick - space is needed after >= before 160000.
>> +    if (walrcv_server_version(LogRepWorkerWalRcvConn) >=160000 &&
>
>
> Done.
>
>>
>> 8. Note that the COPY binary format isn't portable across platforms
>> (Windows to Linux for instance) or major versions
>> https://www.postgresql.org/docs/devel/sql-copy.html, whereas, logical
>> replication is https://www.postgresql.org/docs/devel/logical-replication.html.
>> I don't see any handling of such cases in copy_table but only a check
>> for the publisher version. I think we need to account for all the
>> cases - allow binary COPY only when publisher and subscriber are of
>> same versions, architectures, platforms. The trick here is how we
>> identify if the subscriber is of the same type and taste
>> (architectures and platforms) as the publisher. Looking for
>> PG_VERSION_STR of publisher and subscriber might be naive, but I'm not
>> sure if there's a better way to do it.
>
>
> I think having the "copy_format" option with text as default, like I replied to your 3rd review above, will keep
thingsas they are now. 
> The patch now will only allow users to choose binary copy as well, if they want it and acknowledge the limitations
thatcome with binary copy. 
> COPY command's portability issues shouldn't be an issue right now, since the patch still supports text format. Right?
>

One thing that is not completely clear from above is whether we will
have any problem if the subscription uses binary mode for copying
across the server versions. Do we use binary file during the copy used
in logical replication?

>>
>> Also, the COPY binary format is very data type specific - per the docs
>> "for example it will not work to output binary data from a smallint
>> column and read it into an integer column, even though that would work
>> fine in text format.". I did a small experiment [2], the logical
>> replication works with compatible data types (int -> smallint, even
>> int -> text), whereas the COPY binary format doesn't.
>
>
> Logical replication between different types like int and smallint is already not working properly on HEAD too.
> Yes, the scenario you shared looks like working. But you didn't create the subscription with binary=true. The patch
didnot change subscription with binary=false case. I believe what you should experiment is binary=true case which
alreadyfails in the apply phase on HEAD. 
>
> Well, with this patch, it will begin to fail in the table copy phase. But I don't think this is a problem because
logicalreplication in binary format is already broken for replications between different data types. 
>

So, doesn't this mean that there is no separate failure mode during
the initial copy? I am clarifying this to see if the patch really
needs a separate copy_format option for initial sync?

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Melih,

Thank you for updating the patch. Before reviewing, I found that
cfbot have not accepted v8 patch [1].

IIUC src/psql/describe.c has been modified in v8, but src/test/regress/expected/subscription.out
has not been changed accordingly.

[1]: https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/42/3840

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Thu, Feb 16, 2023 8:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com>
> wrote:
> >
> > Logical replication between different types like int and smallint is already not
> working properly on HEAD too.
> > Yes, the scenario you shared looks like working. But you didn't create the
> subscription with binary=true. The patch did not change subscription with
> binary=false case. I believe what you should experiment is binary=true case
> which already fails in the apply phase on HEAD.
> >
> > Well, with this patch, it will begin to fail in the table copy phase. But I don't
> think this is a problem because logical replication in binary format is already
> broken for replications between different data types.
> >
> 
> So, doesn't this mean that there is no separate failure mode during
> the initial copy? I am clarifying this to see if the patch really
> needs a separate copy_format option for initial sync?
> 

In the case that the data type doesn't have binary output function, for apply
phase, the column will be sent in text format (see logicalrep_write_tuple()) and
it works fine. But with copy_format = binary, the walsender exits with an
error.

For example:
-- create table on publisher and subscriber
CREATE TYPE myvarchar;
CREATE FUNCTION myvarcharin(cstring, oid, integer) RETURNS myvarchar
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharin';
CREATE FUNCTION myvarcharout(myvarchar) RETURNS cstring
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharout';
CREATE TYPE myvarchar (
    input = myvarcharin,
    output = myvarcharout,
    alignment = integer,
    storage = main
);
CREATE TABLE tbl1 (a myvarchar);

-- create publication and insert some data on publisher
create publication pub for table tbl1;
INSERT INTO tbl1 values ('a');

-- create subscription on subscriber
create subscription sub connection 'dbname=postgres port=5432' publication pub with(binary, copy_format = binary);

Then I got the following error in the publisher log.

walsender ERROR:  no binary output function available for type public.myvarchar
walsender STATEMENT:  COPY public.tbl1 (a) TO STDOUT  WITH (FORMAT binary)

Regards,
Shi Yu

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 20 Şub 2023 Pzt, 10:12 tarihinde şunu yazdı:
Dear Melih,

Thank you for updating the patch. Before reviewing, I found that
cfbot have not accepted v8 patch [1].

Thanks for letting me know. 
Attached the fixed version of the patch.

Best,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:


Amit Kapila <amit.kapila16@gmail.com>, 16 Şub 2023 Per, 15:47 tarihinde şunu yazdı:
On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>> 8. Note that the COPY binary format isn't portable across platforms
>> (Windows to Linux for instance) or major versions
>> https://www.postgresql.org/docs/devel/sql-copy.html, whereas, logical
>> replication is https://www.postgresql.org/docs/devel/logical-replication.html.
>> I don't see any handling of such cases in copy_table but only a check
>> for the publisher version. I think we need to account for all the
>> cases - allow binary COPY only when publisher and subscriber are of
>> same versions, architectures, platforms. The trick here is how we
>> identify if the subscriber is of the same type and taste
>> (architectures and platforms) as the publisher. Looking for
>> PG_VERSION_STR of publisher and subscriber might be naive, but I'm not
>> sure if there's a better way to do it.
>
>
> I think having the "copy_format" option with text as default, like I replied to your 3rd review above, will keep things as they are now.
> The patch now will only allow users to choose binary copy as well, if they want it and acknowledge the limitations that come with binary copy.
> COPY command's portability issues shouldn't be an issue right now, since the patch still supports text format. Right?
>

One thing that is not completely clear from above is whether we will
have any problem if the subscription uses binary mode for copying
across the server versions. Do we use binary file during the copy used
in logical replication?

Since binary copy relies on COPY command, we may have problems across different server versions in cases where COPY is not portable.
What I understand from this [1], COPY works across server versions later than 7.4. This shouldn't be a problem for logical replication.
Currently the patch also does not allow binary copy if the publisher version is older than 16.
 
> > Logical replication between different types like int and smallint is already not
> working properly on HEAD too.
> > Yes, the scenario you shared looks like working. But you didn't create the
> subscription with binary=true. The patch did not change subscription with
> binary=false case. I believe what you should experiment is binary=true case
> which already fails in the apply phase on HEAD.
> >
> > Well, with this patch, it will begin to fail in the table copy phase. But I don't
> think this is a problem because logical replication in binary format is already
> broken for replications between different data types.
> >

So, doesn't this mean that there is no separate failure mode during
the initial copy? I am clarifying this to see if the patch really
needs a separate copy_format option for initial sync?

It will fail in a case such as [2] while it would work on HEAD. 
What I meant by my above comment was that binary enabled subscriptions are not already working properly if they replicate between different types. So, the failure caused by replicating, for example, from smallint to int is not really introduced by this patch. Such subscriptions would fail in apply phase anyway. With this patch they will fail while binary copy. 
 

Best,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Mon, Feb 20, 2023 at 5:17 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Thanks for letting me know.
> Attached the fixed version of the patch.

Thanks. I have few comments on v9 patch:

1.
+                    /* Do not allow binary = false with copy_format = binary */
+                    if (!opts.binary &&
+                        sub->copyformat == LOGICALREP_COPY_AS_BINARY &&
+                        !IsSet(opts.specified_opts, SUBOPT_COPY_FORMAT))
+                        ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                 errmsg("cannot set %s for a
subscription with %s",
+                                        "binary = false",
"copy_format = binary")));

I don't understand why we'd need to tie an option (binary) that deals
with data types at column-level with another option (copy_format) that
requests the entire table data to be in binary. This'd essentially
make one to set binary = true to use copy_format = binary, no? IMHO,
this inter-dependency is not good for better usability.

2. Why can't the tests that this patch adds be simple? Why would it
need to change the existing tests at all? I'm thinking to create a new
00X_binary_copy_format.pl or such and setting up logical replication
with copy_format = binary and letting table sync worker request
publisher in binary format - you can verify this via publisher server
logs - look for COPY with BINARY option. If required, have the table
with different data types. This greatly reduces the patch's footprint.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Melih,

Thank you for updating the patch! Followings are my comments.

01. catalogs.sgml

```
        If true, the subscription will request that the publisher send data
-       in binary format
```

I'm not clear here. subbinary does not directly mean that whether the worker
requests to send data or not. How about:


If true, the subscription will request that the publisher send data in binary
format, except initial data synchronization

02. create_subscription.sgml

```
+          the binary format is very data type specific, it will not allow copying
+          between different column types as opposed to text format. Note that
```

The name of formats are not specified as <literal>, whereas in previous sentence
they are. We can use format either of them.


03. parse_subscription_options()

I'm not sure the combination of "copy_format = binary" and "copy_data = false"
should be accepted or not. How do you think?

04. parse_subscription_options()

```
+                               (errcode(ERRCODE_SYNTAX_ERROR),
+                                       errmsg("%s and %s are mutually exclusive options",
+                                               "binary = false", "copy_format = binary")));
+       }
```

A comment for translator seemed to be missed.

05. CreateSubscription()

```
        values[Anum_pg_subscription_suborigin - 1] =
                CStringGetTextDatum(opts.origin);
+       values[Anum_pg_subscription_subcopyformat - 1] = CharGetDatum(opts.copy_format);
```

I think this should be done the same ordering with FormData_pg_subscription.
Maybe after the line?

```
    values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
```

06. AlterSubscription()

If we decided not to accept "copy_format = binary" and "copy_data = false", here
should be also fixed.

07. AlterSubscription()

```
+                                               ereport(ERROR,
+                                                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                                                errmsg("cannot set %s for a subscription with %s",
+                                                                               "binary = false", "copy_format =
binary")));
...
+                                               ereport(ERROR,
+                                                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                                                errmsg("cannot set %s for a subscription with %s",
+                                                                               "copy_format = binary", "binary =
false")));
```

Comments for translator seemed to be missed.

08. defGetCopyFormat()

```
+       /*
+        * If no parameter value given, set it to text format.
+        */
+       if (!def->arg)
+               return LOGICALREP_COPY_AS_TEXT;
```

I think the case no parameter is given should be rejected. It should be accepted
only when the parameter has boolean data type. Note that defGetStreamingMode()
is accepted no-parameter style due to the compatibility. At first streaming is
boolean, and then "parallel" is added.

09. describeSubscriptions

```
+                       /* Copy format is only supported in v16 and higher */
```

I think this comment should be atop of if statement, and it can mention about Origin too.


10. pg_subscription.h

```
+       char            subcopyformat BKI_DEFAULT(LOGICALREP_COPY_AS_TEXT); /* Copy format *
```

I'm not sure whether BKI_DEFAULT() is needed or not. Other options like twophase
does not have the default value as catalog level. The default is set in
parse_subscription_options() and then the value will be set to catalog.

11. typedef struct Subscription

In catalog entry the subcopyformat is aligned just after subdisableonerr, but in
struct Subscription, copyformat is added at the end. Can we place just after disableonerr?

12. Reply

> Since binary copy relies on COPY command, we may have problems across
> different server versions in cases where COPY is not portable.
> What I understand from this [1], COPY works across server versions later
> than 7.4. This shouldn't be a problem for logical replication.
> Currently the patch also does not allow binary copy if the publisher
> version is older than 16.

If in future version the general data type is added, the copy command in binary
format will not work well, right? It is because the inferior version does not have
recv/send functions for added type. It means that there is a possibility that
replication between different versions may be failed if binary format is specified.
Therefore, I think we should accept copy_format = binary only when the major
version of servers are the same.

Note that this comments is not the request to the patch. Maybe the modification
should be done not only for copy_format but also binary, and it may be out of scope
the patch.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow logical replication to copy tables in binary format

From
Jelte Fennema
Date:
> If in future version the general data type is added, the copy command in binary
> format will not work well, right? It is because the inferior version does not have
> recv/send functions for added type. It means that there is a possibility that
> replication between different versions may be failed if binary format is specified.
> Therefore, I think we should accept copy_format = binary only when the major
> version of servers are the same.

I don't think it's necessary to check versions. Yes, there are
situations where binary will fail across major versions. But in many
cases it does not. To me it seems the responsibility of the operator
to evaluate this risk. And if the operator chooses wrong and uses
binary copy across incompatible versions, then it will still fail hard
in that case during the copy phase (so still a very early error). So I
don't see a reason to check pre-emptively, afaict it will only
disallow some valid usecases and introduce more code.

Furthermore no major version check is done for "binary = true" either
(afaik). The only additional failure scenario that copy_format=binary
introduces is when one of the types does not implement a send function
on the source. With binary=true, this would continue to work, but with
copy_format=binary this stops working. All other failure scenarios
that binary encoding of types introduces apply to both binary=true and
copy_format=binary (the only difference being in which phase of the
replication these failures happen, the apply or the copy phase).

> I'm not sure the combination of "copy_format = binary" and "copy_data = false"
> should be accepted or not. How do you think?

It seems quite useless indeed to specify the format of a copy that won't happen.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Jelte,

> I don't think it's necessary to check versions. Yes, there are
> situations where binary will fail across major versions. But in many
> cases it does not. To me it seems the responsibility of the operator
> to evaluate this risk. And if the operator chooses wrong and uses
> binary copy across incompatible versions, then it will still fail hard
> in that case during the copy phase (so still a very early error). So I
> don't see a reason to check pre-emptively, afaict it will only
> disallow some valid usecases and introduce more code.
> 
> Furthermore no major version check is done for "binary = true" either
> (afaik). The only additional failure scenario that copy_format=binary
> introduces is when one of the types does not implement a send function
> on the source. With binary=true, this would continue to work, but with
> copy_format=binary this stops working. All other failure scenarios
> that binary encoding of types introduces apply to both binary=true and
> copy_format=binary (the only difference being in which phase of the
> replication these failures happen, the apply or the copy phase).

I thought that current specification was lack of consideration, but you meant to
say that it is intentional one to keep the availability, right? 
Indeed my suggestion seems to be too pessimistic, but I want to listen to other
opinions more...

> > I'm not sure the combination of "copy_format = binary" and "copy_data = false"
> > should be accepted or not. How do you think?
> 
> It seems quite useless indeed to specify the format of a copy that won't happen.

I understood that the conbination of "copy_format = binary" and "copy_data = false"
should be rejected in parse_subscription_options() and AlterSubscription(). Is it right?
I'm expecting that is done in next version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Thu, Feb 23, 2023 12:40 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> 
> > > I'm not sure the combination of "copy_format = binary" and "copy_data =
> false"
> > > should be accepted or not. How do you think?
> >
> > It seems quite useless indeed to specify the format of a copy that won't
> happen.
> 
> I understood that the conbination of "copy_format = binary" and "copy_data =
> false"
> should be rejected in parse_subscription_options() and AlterSubscription(). Is it
> right?
> I'm expecting that is done in next version.
> 

The copy_data option only takes effect once in CREATE SUBSCIPTION or ALTER
SUBSCIPTION REFRESH PUBLICATION command, but the copy_format option can take
affect multiple times if the subscription is refreshed multiple times. Even if
the subscription is created with copy_date=false, copy_format can take affect
when executing ALTER SUBSCIPTION REFRESH PUBLICATION. So, I am not sure we want
to reject this usage.

Besides, here are my comments on the v9 patch.
1.
src/bin/pg_dump/pg_dump.c
    if (fout->remoteVersion >= 160000)
-        appendPQExpBufferStr(query, " s.suborigin\n");
+    {
+        appendPQExpBufferStr(query, " s.suborigin,\n");
+        appendPQExpBufferStr(query, " s.subcopyformat\n");
+    }
     else
-        appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+    {
+        appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+        appendPQExpBuffer(query, " '%c' AS subcopyformat\n", LOGICALREP_COPY_AS_TEXT);
+    }

src/bin/psql/describe.c
        if (pset.sversion >= 160000)
+        {
             appendPQExpBuffer(&buf,
                               ", suborigin AS \"%s\"\n",
                               gettext_noop("Origin"));
+            /* Copy format is only supported in v16 and higher */
+            appendPQExpBuffer(&buf,
+                              ", subcopyformat AS \"%s\"\n",
+                              gettext_noop("Copy Format"));
+        }

I think we can call only once appendPQExpBuffer() for the two options which are supported in v16.
For example,
        if (pset.sversion >= 160000)
        {
            appendPQExpBuffer(&buf,
                              ", suborigin AS \"%s\"\n"
                              ", subcopyformat AS \"%s\"\n",
                              gettext_noop("Origin"),
                              gettext_noop("Copy Format"));
        }

2.
src/bin/psql/tab-complete.c
@@ -1926,7 +1926,7 @@ psql_completion(const char *text, int start, int end)
     /* ALTER SUBSCRIPTION <name> SET ( */
     else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
         COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
-                      "streaming", "synchronous_commit");
+                      "streaming", "synchronous_commit", "copy_format");
     /* ALTER SUBSCRIPTION <name> SKIP ( */
     else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
         COMPLETE_WITH("lsn");
@@ -3269,7 +3269,8 @@ psql_completion(const char *text, int start, int end)
     else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
         COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
                       "disable_on_error", "enabled", "origin", "slot_name",
-                      "streaming", "synchronous_commit", "two_phase");
+                      "streaming", "synchronous_commit", "two_phase",
+                      "copy_format");


The options should be listed in alphabetical order. See commit d547f7cf5ef.

Regards,
Shi Yu

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 23 Şub 2023 Per, 12:29 tarihinde şunu yazdı:
On Thu, Feb 23, 2023 12:40 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
>
> > > I'm not sure the combination of "copy_format = binary" and "copy_data =
> false"
> > > should be accepted or not. How do you think?
> >
> > It seems quite useless indeed to specify the format of a copy that won't
> happen.
>
> I understood that the conbination of "copy_format = binary" and "copy_data =
> false"
> should be rejected in parse_subscription_options() and AlterSubscription(). Is it
> right?
> I'm expecting that is done in next version.
>

The copy_data option only takes effect once in CREATE SUBSCIPTION or ALTER
SUBSCIPTION REFRESH PUBLICATION command, but the copy_format option can take
affect multiple times if the subscription is refreshed multiple times. Even if
the subscription is created with copy_date=false, copy_format can take affect
when executing ALTER SUBSCIPTION REFRESH PUBLICATION. So, I am not sure we want
to reject this usage.

I believe the places copy_data and copy_format are needed are pretty much the same. I couldn't think of a case where copy_format is needed but copy_data isn't. Please let me know if I'm missing something.
CREATE SUBSCRIPTION, ALTER SUBSCRIPTION SET/ADD/REFRESH PUBLICATION are all the places where initial sync can happen. For all these commands, copy_data needs to be given as a parameter or it will be set to the default value which is true. Even if copy_data=false when the sub was created, REFRESH PUBLICATION (without an explicit copy_data parameter) will copy some tables if needed regardless of what copy_data was in CREATE SUBSCRIPTION. This is because copy_data is not something stored in pg_subscription or another catalog. But this is not an issue for copy_fornat since its value will be stored in the catalog. This can allow users to set the format even if copy_data=false and no initial sync is needed at that moment. So that future initial syncs which can be triggered by ALTER SUBSCRIPTION will be performed in the correct format.
So, I also think we should allow setting copy_format even if copy_data=false.

Another way to deal with this issue could be expecting the user to specify format every time copy_format is needed, similar to the case for copy_data, and moving on with text (default) format if it's not specified for the current CREATE/ALTER SUBSCRIPTION execution. But I don't think this would make things easier.
 
Best,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Jelte Fennema
Date:
> This is because copy_data is not something stored in pg_subscription
> or another catalog. But this is not an issue for copy_fornat since its
> value will be stored in the catalog. This can allow users to set the
> format even if copy_data=false and no initial sync is needed at that
> moment.

One other approach that might make sense is to expand the values that
copy_data accepts to include the value "binary" (and probably "text"
for clarity). That way people could easily choose for each sync if
they want to use binary copy, text copy or no copy. Based on your
message, this would mean that copy_format=binary would not be stored
in catalogs anymore, does that have any bad side-effects for the
implementation?



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here is my summary of this thread so far, plus some other thoughts.

(I wrote this mostly for my own internal notes/understanding, but
maybe it is a helpful summary for others so I am posting it here too).

~~

1. Initial Goal
------------------

Currently, the CREATE SUBSCRIPTION ... WITH(binary=true) mode does
data replication in binary mode, but tablesync COPY phase is still
only in text mode. IIUC Melih just wanted to unify those phases so
binary=true would mean also do the COPY phase in binary [1].

Actually, this was my very first expectation too.

2. Objections to unification
-----------------------------------

Bharath posted suggesting tying the COPY/replication parts is not a
good idea [2]. But if these are not to be unified then it requires a
new subscription option to be introduced, and currently, the thread
refers to this new option as copy_format=binary.

3. A problem: binary replication is not always binary!
----------------------------------------------------------------------

Shi-san reported [3] that under special circumstances (e.g. if the
datatype has no binary output function) the current HEAD binary=true
mode for replication has the ability to fall back to text replication.
Since the binary COPY doesn't do this, it means binary COPY could fail
in some cases where the binary=true replication will be OK.

AFAIK, this means that even if we *wanted* to unify everything with
just 'binary=true' it can't be done like that.

4. New option is needed
---------------------------------

OK, so let's assume these options must be separated (because of the
problem of #3).

~

4a. New string option called copy_format?

Currently, the patch/thread describes a new 'copy_format' string
option with values 'text' (default) and 'binary'.

Why? If there are only 2 values then IMO it would be better to have a
*boolean* option called something like binary_copy=true/false.

~

4b. Or, we could extend copy_data

Jelte suggested [4] we could extend copy_data = 'text' (aka on/true)
OR 'binary' OR 'none' (aka off/false).

That was interesting, although
- my first impression was to worry about backward compatibility issues
for existing application code. I don't know if those are easily
solved.
- AFAIK such "hybrid" boolean/enum options are kind of frowned upon so
I am not sure if a committer would be in favour of introducing another
one.


5. Combining options
------------------------------

Because options are separated, it means combinations become possible...

~~

5a. Combining option: "copy_format = binary" and "copy_data = false"

Kuroda [5] wrote such a combination should not be allowed.

I kind of disagree with that. IMO everything should be flexible as
possible. The patch might end up accidentally stopping something that
has a valid use case. Anyway, such restrictions are easy to add later.

~~

5b. Combining options: binary=true/copy_format=binary, and
binary=false/copy_format=binary become possible.

AFAIK currently the patch disallows some combinations:

+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "binary = false", "copy_format = binary")));


I kind of disagree with introducing such rules/restrictions. IMO all
this patch needs to do is clearly document all necessary precautions
etc. But if the user still wants to do something, we should just let
them do it. If they manage to shoot themselves in the foot, well it
was their choice after reading the docs, and it's their foot.


6. pub/sub version checking
----------------------------

There were some discussions about doing some server checking to reject
some PG combinations from being allowed to use the copy_format=binary.

Jelte suggested [5] that it is the "responsibility of the operator to
evaluate the risk".

I agree. Yes, the patch certainly needs to *document* all precautions,
but having too many restrictions might end up accidentally stopping
something useful. Anyway, this seems like #5a. I prefer KISS
Principle. More restrictions can be added later if found necessary.


7. More doubts & a thought bubble
---------------------------------

7a. What is user action for this scenario?

I am unsure about this scenario - imagine that everything is working
properly and the copy_format=binary/copy_data=true is all working
nicely for weeks for a certain pub/sub...

But if the publication was defined as FOR ALL TABLES, or as ALL TABLES
IN SCHEMA, then I think the tablesync can still crash if a user
creates a new table that suffers the same COPY binary trouble Shi-san
described earlier [3].

What is the user supposed to do when that tablesync fails?

They had no way to predict it could happen, And it will be too painful
to have to DROP and re-create the entire SUBSCRIPTION again now that
it has happened.

~~

7a. A thought bubble

I wondered (perhaps this is naive) would it be it possible to enhance
the COPY command to enhance the "binary" mode so it can be made to
fall back to text mode if it needs to in the same way that binary
replication does.

If such an enhanced COPY format mode worked, then most of the patch is redundant
- there is no need for any new option
- tablesync COPY could then *always* use this "binary-with-falback" mode


------
[1] Melih initially wanted a unified binary mode -
https://www.postgresql.org/message-id/CAGPVpCQYi9AYQSS%3DRmGgVNjz5ZEnLB8mACwd9aioVhLmbgiAMA%40mail.gmail.com

[2] Barath doesn't like the binary/copy_format inter-dependency -
https://www.postgresql.org/message-id/CALj2ACVPt-BaLMm3Ezy1-rfUzH9qStxePcyGrHPamPESEZSBFA%40mail.gmail.com

[3] Shi-san found binary mode has the ability to fall back to text
sometimes -
https://www.postgresql.org/message-id/OSZPR01MB6310B58F069FF8E148B247FDFDA49%40OSZPR01MB6310.jpnprd01.prod.outlook.com

[4] Jelte idea to enhance the copy_data option -
https://www.postgresql.org/message-id/CAGECzQS393xiME%2BEySLU7ceO4xOB81kPjPqNBjaeW3sLgfLhNw%40mail.gmail.com

[5] Kuroda-san etc expecting copy_data=false/copy_format=binary
combination is not allowed -
https://www.postgresql.org/message-id/TYAPR01MB5866DDF02B3CEE59DA024CC3F5AB9%40TYAPR01MB5866.jpnprd01.prod.outlook.com

[6] Jelte says it is the operator's responsibility to use the correct
options - https://www.postgresql.org/message-id/CAGECzQSStdb%2Bx1BxzCktZd1uSjvd6eYq1wcHV3vPCykrGqxYKQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
On Monday, February 20, 2023 8:47 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Thanks for letting me know.
> Attached the fixed version of the patch.
Hi, Melih


Thanks for updating the patch. Minor comments on v9.

(1) commit message

"The patch introduces a new parameter, copy_format, to CREATE SUBSCRIPTION to allow to choose
the format used in initial table synchronization."

This patch introduces the new parameter not only to CREATE SUBSCRIPTION and ALTER SUBSCRIPTION, then this description
shouldbe more general, something like below. 

"The patch introduces a new parameter, copy_format, as subscription option to
allow user to choose the format of initial table synchronization."

(2) copy_table

We don't need to check the publisher's version below.

+
+       /* If the publisher is v16 or later, specify the format to copy data. */
+       if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000)
+       {
+               char *format = MySubscription->copyformat == LOGICALREP_COPY_AS_BINARY ? "binary" : "text";
+               appendStringInfo(&cmd, "  WITH (FORMAT %s)", format);
+               options = lappend(options, makeDefElem("format", (Node *) makeString(format), -1));
+       }
+

We don't have this kind of check for "binary" option and it seems this is user's responsibility to avoid any errors
duringreplication. If we want to add this kind of check, then we can add checks for both "binary" and "copy_format"
optiontogether as an independent patch. 

(3) subscription.sql/out

The other existing other subscription options check the invalid input for newly created option (e.g. "foo" for
disable_on_error, streaming mode). So, I think we can add this type of test for this feature. 



Best Regards,
    Takamichi Osumi




Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Mon, Feb 20, 2023 at 3:37 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Thu, Feb 16, 2023 8:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > So, doesn't this mean that there is no separate failure mode during
> > the initial copy? I am clarifying this to see if the patch really
> > needs a separate copy_format option for initial sync?
> >
>
> In the case that the data type doesn't have binary output function, for apply
> phase, the column will be sent in text format (see logicalrep_write_tuple()) and
> it works fine. But with copy_format = binary, the walsender exits with an
> error.
>
...
...
>
> Then I got the following error in the publisher log.
>
> walsender ERROR:  no binary output function available for type public.myvarchar
> walsender STATEMENT:  COPY public.tbl1 (a) TO STDOUT  WITH (FORMAT binary)
>

Thanks for sharing the example. I think to address this user can
create a SUBSCRIPTION with 'binary = false' and then after the initial
copy enables it with ALTER SUBSCRIPTION.  Personally, I feel it is not
required to have a separate option to allow copy in binary mode. Note,
where there is some use for it but having more options for similar
work is also confusing as users need to pay attention to different
options and their values. It won't be difficult to add such an option
in the future if we see such cases and or users specifically require
something like this.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Fri, Feb 24, 2023 at 8:32 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here is my summary of this thread so far, plus some other thoughts.

Thanks.

> 1. Initial Goal
> ------------------
>
> Currently, the CREATE SUBSCRIPTION ... WITH(binary=true) mode does
> data replication in binary mode, but tablesync COPY phase is still
> only in text mode. IIUC Melih just wanted to unify those phases so
> binary=true would mean also do the COPY phase in binary [1].
>
> Actually, this was my very first expectation too.
>
> 2. Objections to unification
> -----------------------------------
>
> Bharath posted suggesting tying the COPY/replication parts is not a
> good idea [2]. But if these are not to be unified then it requires a
> new subscription option to be introduced, and currently, the thread
> refers to this new option as copy_format=binary.

Looking closely at the existing binary=true option and COPY's binary
protocol, essentially they depend on the data type's binary send and
receive functions.

> 3. A problem: binary replication is not always binary!
> ----------------------------------------------------------------------
>
> Shi-san reported [3] that under special circumstances (e.g. if the
> datatype has no binary output function) the current HEAD binary=true
> mode for replication has the ability to fall back to text replication.
> Since the binary COPY doesn't do this, it means binary COPY could fail
> in some cases where the binary=true replication will be OK.

Right. Apply workers can fallback to text mode transparently, whereas
with binary COPY it's a bit difficult to do so.

> AFAIK, this means that even if we *wanted* to unify everything with
> just 'binary=true' it can't be done like that.

Hm, it looks like that.

> 4. New option is needed
> ---------------------------------
>
> OK, so let's assume these options must be separated (because of the
> problem of #3).
>
> ~
>
> 4a. New string option called copy_format?
>
> Currently, the patch/thread describes a new 'copy_format' string
> option with values 'text' (default) and 'binary'.
>
> Why? If there are only 2 values then IMO it would be better to have a
> *boolean* option called something like binary_copy=true/false.
>
> ~
>
> 4b. Or, we could extend copy_data
>
> Jelte suggested [4] we could extend copy_data = 'text' (aka on/true)
> OR 'binary' OR 'none' (aka off/false).

How about copy_binary = {true/false}? So, the options for the user are:

copy_binary - defaults to false and when true, the subscriber requests
publisher to send pre-existing table's data in binary format (as
opposed to text) during data synchronization/table copy phase. It is
recommended to enable this option only when 1) the column data types
have appropriate binary send/receive functions, 2) not replicating
between different major versions or different platforms, 3) both
publisher and subscriber tables have the exact same column types (not
when replicating from smallint to int or numeric to int8 and so on), 4)
both publisher and subscriber supports COPY with binary option,
otherwise the table copy can fail.

binary - defaults to false and when true, the subscriber requests
publisher to send table's data in binary format (as opposed to text)
during normal replication phase. <<existing description from the docs
continues>>

Note that with this we made users responsible to choose copy_binary
rather than we being smart.

> AFAIK currently the patch disallows some combinations:
>
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("%s and %s are mutually exclusive options",
> + "binary = false", "copy_format = binary")));
>
> 6. pub/sub version checking
> ----------------------------
>
> There were some discussions about doing some server checking to reject
> some PG combinations from being allowed to use the copy_format=binary.

IMHO, these restrictions make the feature more complicated to use and
be removed and the options be made simple to use (usability and
simplicity clearly wins the race). If there's any kind of feedback
from the users/developers, we can always come back and improve.

> But if the publication was defined as FOR ALL TABLES, or as ALL TABLES
> IN SCHEMA, then I think the tablesync can still crash if a user
> creates a new table that suffers the same COPY binary trouble Shi-san
> described earlier [3].
>
> What is the user supposed to do when that tablesync fails?
>
> They had no way to predict it could happen, And it will be too painful
> to have to DROP and re-create the entire SUBSCRIPTION again now that
> it has happened.

Can't ALTER SUBSCRIPTION .... SET copy_format = 'text'; and ALTER
SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_data = true); work
here instead of drop and recreate subscription?

> 7a. A thought bubble
>
> I wondered (perhaps this is naive) would it be it possible to enhance
> the COPY command to enhance the "binary" mode so it can be made to
> fall back to text mode if it needs to in the same way that binary
> replication does.
>
> If such an enhanced COPY format mode worked, then most of the patch is redundant
> - there is no need for any new option
> - tablesync COPY could then *always* use this "binary-with-falback" mode

I don't think that's a great idea. COPY is a user-facing SQL command
and any errors (because of data type not having binary send/receive
functions) better be reported to the users. Again, such an option
might complicate both COPY code and usability.


--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Tue, Feb 21, 2023 at 7:18 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Feb 20, 2023 at 5:17 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> > Thanks for letting me know.
> > Attached the fixed version of the patch.
>
> Thanks. I have few comments on v9 patch:
>
> 1.
> +                    /* Do not allow binary = false with copy_format = binary */
> +                    if (!opts.binary &&
> +                        sub->copyformat == LOGICALREP_COPY_AS_BINARY &&
> +                        !IsSet(opts.specified_opts, SUBOPT_COPY_FORMAT))
> +                        ereport(ERROR,
> +
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                                 errmsg("cannot set %s for a
> subscription with %s",
> +                                        "binary = false",
> "copy_format = binary")));
>
> I don't understand why we'd need to tie an option (binary) that deals
> with data types at column-level with another option (copy_format) that
> requests the entire table data to be in binary. This'd essentially
> make one to set binary = true to use copy_format = binary, no? IMHO,
> this inter-dependency is not good for better usability.
>
> 2. Why can't the tests that this patch adds be simple? Why would it
> need to change the existing tests at all? I'm thinking to create a new
> 00X_binary_copy_format.pl or such and setting up logical replication
> with copy_format = binary and letting table sync worker request
> publisher in binary format - you can verify this via publisher server
> logs - look for COPY with BINARY option. If required, have the table
> with different data types. This greatly reduces the patch's footprint.

I've done performance testing with the v9 patch.

I can constantly observe 1.34X improvement or 25% improvement in table
sync/copy performance with the patch:
HEAD binary = false
Time: 214398.637 ms (03:34.399)

PATCHED binary = true, copy_format = binary:
Time: 160509.262 ms (02:40.509)

There's a clear reduction (5.68% to 4.49%) in the CPU cycles spent in
copy_table with the patch:
perf report HEAD:
-    5.68%     0.00%  postgres  postgres  [.] copy_table
   - copy_table
      - 5.67% CopyFrom
         - 4.26% NextCopyFrom
            - 2.16% NextCopyFromRawFields
               - 1.55% CopyReadLine
                  - 1.52% CopyReadLineText
                     - 0.76% CopyLoadInputBuf
                          0.50% CopyConvertBuf
                 0.60% CopyReadAttributesText
            - 1.93% InputFunctionCall
                 0.69% timestamptz_in
                 0.53% byteain
         - 0.73% CopyMultiInsertInfoFlush
            - 0.73% CopyMultiInsertBufferFlush
               - 0.66% table_multi_insert
                    0.65% heap_multi_insert

perf report PATCHED:
-    4.49%     0.00%  postgres  postgres  [.] copy_table
   - copy_table
      - 4.48% CopyFrom
         - 2.36% NextCopyFrom
            - 1.81% CopyReadBinaryAttribute
                 1.47% ReceiveFunctionCall
         - 1.21% CopyMultiInsertInfoFlush
            - 1.21% CopyMultiInsertBufferFlush
               - 1.11% table_multi_insert
                  - 1.09% heap_multi_insert
                     - 0.71% RelationGetBufferForTuple
                        - 0.63% ReadBufferBI
                           - 0.62% ReadBufferExtended
                                0.62% ReadBuffer_common

I've tried to choose the table columns such that the binary send/recv
vs non-binary/plain/text copy has some benefit. The table has 100mn
rows, and is of 11GB size. I've measured the benefit using Melih's
helper function wait_for_rep(). Note that I had to compile source code
with -ggdb3 -O0 for perf report, otherwise with -O3 for performance
numbers:

wal_level = 'logical'
shared_buffers = '8GB'
wal_buffers = '1GB'
max_wal_size = '32GB'

create table foo(i int, n numeric, v varchar, b bytea, t timestamp
with time zone default current_timestamp);
insert into foo select i, i+1, md5(i::text), md5(i::text)::bytea from
generate_series(1, 100000000) i;

CREATE OR REPLACE PROCEDURE wait_for_rep()
LANGUAGE plpgsql
AS $$
BEGIN
    WHILE (SELECT count(*) != 0 FROM pg_subscription_rel WHERE
srsubstate <> 'r') LOOP COMMIT;
    END LOOP;
END;
$$;

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Thanks for all of your reviews!

So, I made some changes in the v10 according to your comments.

1- copy_format is changed to a boolean parameter copy_binary.
It sounds easier to use a boolean to enable/disable binary copy. Default value is false, so nothing changes in the current behaviour if copy_binary is not specified.
It's still persisted into the catalog. This is needed since its value will be needed by tablesync workers later. And tablesync workers fetch subscription configurations from the catalog.
In the copy_data case, it is not directly stored anywhere but it affects the state of the table which is stored in the catalog. So, I guess this is the convenient way if we decide to go with a new parameter.

2- copy_binary is not tied to another parameter
The patch does not disallow any combination of copy_binary with binary and copy_data options. I tried to explain what binary copy can and cannot do in the docs. Rest is up to the user now.

3- Removed version check for copy_binary
HEAD already does not have any check for binary option. Making binary copy work only if publisher and subscriber are the same major version can be too restricted.  

4- Added separate test file
Although I believe 002_types.pl and 014_binary.pl can be improved more for binary enabled subscription cases, this patch might not be the best place to make those changes.
032_binary_copy.pl now has the tests for binary copy option. There are also some regression tests in subscription.sql.

Finally, some other small fixes are done according to the reviews.

 Also, thanks Bharath for performance testing [1]. It can be seen that the patch has some benefits.

I appreciate further thoughts/reviews.




Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
vignesh C
Date:
On Tue, 28 Feb 2023 at 01:22, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Thanks for all of your reviews!
>
> So, I made some changes in the v10 according to your comments.
>
> 1- copy_format is changed to a boolean parameter copy_binary.
> It sounds easier to use a boolean to enable/disable binary copy. Default value is false, so nothing changes in the
currentbehaviour if copy_binary is not specified.
 
> It's still persisted into the catalog. This is needed since its value will be needed by tablesync workers later. And
tablesyncworkers fetch subscription configurations from the catalog.
 
> In the copy_data case, it is not directly stored anywhere but it affects the state of the table which is stored in
thecatalog. So, I guess this is the convenient way if we decide to go with a new parameter.
 
>
> 2- copy_binary is not tied to another parameter
> The patch does not disallow any combination of copy_binary with binary and copy_data options. I tried to explain what
binarycopy can and cannot do in the docs. Rest is up to the user now.
 
>
> 3- Removed version check for copy_binary
> HEAD already does not have any check for binary option. Making binary copy work only if publisher and subscriber are
thesame major version can be too restricted.
 
>
> 4- Added separate test file
> Although I believe 002_types.pl and 014_binary.pl can be improved more for binary enabled subscription cases, this
patchmight not be the best place to make those changes.
 
> 032_binary_copy.pl now has the tests for binary copy option. There are also some regression tests in
subscription.sql.
>
> Finally, some other small fixes are done according to the reviews.
>
>  Also, thanks Bharath for performance testing [1]. It can be seen that the patch has some benefits.
>
> I appreciate further thoughts/reviews.

Thanks for the patch, Few comments:
1) Are primary key required for the tables, if not required we could
remove the primary key which will speed up the test by not creating
the indexes and inserting in the indexes. Even if required just create
for one of the tables:
+# Create tables on both sides of the replication
+my $ddl = qq(
+       CREATE TABLE public.test_numerical (
+               a INTEGER PRIMARY KEY,
+               b NUMERIC,
+               c FLOAT,
+               d BIGINT
+       );
+       CREATE TABLE public.test_arrays (
+               a INTEGER[] PRIMARY KEY,
+               b NUMERIC[],
+               c TEXT[]
+       );
+    CREATE TABLE public.test_range_array (
+               a INTEGER PRIMARY KEY,
+               b TSTZRANGE,
+               c int8range[]
+       );
+    CREATE TYPE public.test_comp_basic_t AS (a FLOAT, b TEXT, c INTEGER);
+    CREATE TABLE public.test_one_comp (
+               a INTEGER PRIMARY KEY,
+               b public.test_comp_basic_t
+       ););
+

2) We can have the Insert/Select of tables in the same order so that
it is easy to verify. test_range_array/test_one_comp
insertion/selection order was different.
+# Insert some content and before creating a subscription
+$node_publisher->safe_psql(
+       'postgres', qq(
+    INSERT INTO public.test_numerical (a, b, c, d) VALUES
+               (1, 1.2, 1.3, 10),
+        (2, 2.2, 2.3, 20);
+       INSERT INTO public.test_arrays (a, b, c) VALUES
+               ('{1,2,3}', '{1.1, 1.2, 1.3}', '{"one", "two", "three"}'),
+        ('{3,1,2}', '{1.3, 1.1, 1.2}', '{"three", "one", "two"}');
+    INSERT INTO test_range_array (a, b, c) VALUES
+               (1, tstzrange('Mon Aug 04 00:00:00 2014
CEST'::timestamptz, 'infinity'), '{"[1,2]", "[10,20]"}'),
+               (2, tstzrange('Sat Aug 02 00:00:00 2014
CEST'::timestamptz, 'Mon Aug 04 00:00:00 2014 CEST'::timestamptz),
'{"[2,3]", "[20,30]"}');
+       INSERT INTO test_one_comp (a, b) VALUES
+               (1, ROW(1.0, 'a', 1)),
+               (2, ROW(2.0, 'b', 2));
+       ));
+
+# Create the subscription with copy_binary = true
+my $publisher_connstring = $node_publisher->connstr . ' dbname=postgres';
+$node_subscriber->safe_psql('postgres',
+           "CREATE SUBSCRIPTION tsub CONNECTION '$publisher_connstring' "
+         . "PUBLICATION tpub WITH (slot_name = tpub_slot, copy_binary
= true)");
+
+# Ensure nodes are in sync with each other
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
+
+my $sync_check =  qq(
+       SET timezone = '+2';
+       SELECT a, b, c, d FROM test_numerical ORDER BY a;
+       SELECT a, b, c FROM test_arrays ORDER BY a;
+       SELECT a, b FROM test_one_comp ORDER BY a;
+       SELECT a, b, c FROM test_range_array ORDER BY a;
+);

3) Should we check the results for test_myvarchar table only, since
there is no change in other tables, we need not check other tables
again:
+# Now tablesync should succeed
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
+
+$sync_check =  qq(
+       SET timezone = '+2';
+       SELECT a, b, c, d FROM test_numerical ORDER BY a;
+       SELECT a, b, c FROM test_arrays ORDER BY a;
+       SELECT a, b FROM test_one_comp ORDER BY a;
+       SELECT a, b, c FROM test_range_array ORDER BY a;
+    SELECT a FROM test_myvarchar;
+);
+
+# Check the synced data on subscriber
+$result = $node_subscriber->safe_psql('postgres', $sync_check);

4) Similarly check only for test_mismatching_types in this case:
+# Cannot sync due to type mismatch
+$node_subscriber->wait_for_log(qr/ERROR: ( [A-Z0-9]+:)? incorrect
binary data format/);
+
+# Setting copy_binary to false should allow syncing
+$node_subscriber->safe_psql(
+    'postgres', qq(
+    ALTER SUBSCRIPTION tsub SET (copy_binary = false);));
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
+
+$sync_check =  qq(
+       SET timezone = '+2';
+       SELECT a, b, c, d FROM test_numerical ORDER BY a;
+       SELECT a, b, c FROM test_arrays ORDER BY a;
+       SELECT a, b FROM test_one_comp ORDER BY a;
+       SELECT a, b, c FROM test_range_array ORDER BY a;
+    SELECT a FROM test_myvarchar;
+    SELECT a FROM test_mismatching_types ORDER BY a;
+);
+
+# Check the synced data on subscribers
+$result = $node_subscriber->safe_psql('postgres', $sync_check);
+
+is( $result, '1|1.2|1.3|10
+2|2.2|2.3|20
+{1,2,3}|{1.1,1.2,1.3}|{one,two,three}
+{3,1,2}|{1.3,1.1,1.2}|{three,one,two}
+1|(1,a,1)
+2|(2,b,2)
+1|["2014-08-04 00:00:00+02",infinity)|{"[1,3)","[10,21)"}
+2|["2014-08-02 00:00:00+02","2014-08-04 00:00:00+02")|{"[2,4)","[20,31)"}
+a
+1
+2', 'check synced data on subscriber with copy_binary = false');

5) Should we change "Basic logical replication test" to "Test logical
replication of copy table in binary"
diff --git a/src/test/subscription/t/032_binary_copy.pl
b/src/test/subscription/t/032_binary_copy.pl
new file mode 100644
index 0000000000..bcad66e5ea
--- /dev/null
+++ b/src/test/subscription/t/032_binary_copy.pl
@@ -0,0 +1,223 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Basic logical replication test
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+

6) We can change "Initialize publisher node" to "Create publisher
node" to maintain consistency.
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;

7) Should "Insert some content and before creating a subscription." be
changed to:
"Insert some content before creating a subscription."

+# Publish all tables
+$node_publisher->safe_psql('postgres',
+       "CREATE PUBLICATION tpub FOR ALL TABLES");
+
+# Insert some content and before creating a subscription
+$node_publisher->safe_psql(
+       'postgres', qq(
+    INSERT INTO public.test_numerical (a, b, c, d) VALUES
+               (1, 1.2, 1.3, 10),

Regards,
Vignesh



Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Tue, Feb 28, 2023 at 1:22 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Thanks for all of your reviews!
>
> So, I made some changes in the v10 according to your comments.

Thanks. Some quick comments on v10:

1.
+      <para>
+       If true, initial data synchronization will be performed in binary format
+      </para></entry>
It's not just the initial table sync right? The table sync can happen
at any other point of time when ALTER SUBSCRIPTION ... REFRESH
PUBLICATION WITH (copy = true) is run.
How about - "If true, the subscriber requests publication for
pre-existing data in binary format"?

2.
+          Specifies whether pre-existing data on the publisher will be copied
+          to the subscriber in binary format. The default is
<literal>false</literal>.
+          Binary format is very data type specific, it will not allow copying
+          between different column types as opposed to text format. Note that
+          if this option is enabled, all data types which will be copied during
+          the initial synchronization should have binary send and
receive functions.
+          If this option is disabled, data format for the initial
synchronization
+          will be text.
Perhaps, this should cover the recommended cases for enabling this new
option - something like below (may not need to have exact wording, but
the recommended cases?):

"It is recommended to enable this option only when 1) the column data
types have appropriate binary send/receive functions, 2) not
replicating between different major versions or different platforms,
3) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on), 4) both publisher and subscriber supports COPY with binary
option, otherwise the table copy can fail."

3. I think the newly added tests must verify if the binary COPY is
picked up when enabled. Perhaps, looking at the publisher's server log
for 'COPY ... WITH BINARY format'? Maybe it's an overkill, otherwise,
we have no way of testing that the option took effect.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Allow logical replication to copy tables in binary format

From
Jelte Fennema
Date:
> 3. I think the newly added tests must verify if the binary COPY is
> picked up when enabled. Perhaps, looking at the publisher's server log
> for 'COPY ... WITH BINARY format'? Maybe it's an overkill, otherwise,
> we have no way of testing that the option took effect.

Another way to test that BINARY is enabled could be to trigger one
of the failure cases.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Attached v11.

vignesh C <vignesh21@gmail.com>, 28 Şub 2023 Sal, 13:59 tarihinde şunu yazdı:
Thanks for the patch, Few comments:
1) Are primary key required for the tables, if not required we could
remove the primary key which will speed up the test by not creating
the indexes and inserting in the indexes. Even if required just create
for one of the tables:

I think that having a primary key in tables for logical replication tests is good for checking if log. rep. duplicates any row. Other tests also have primary keys in almost all tables.

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 28 Şub 2023 Sal, 15:27 tarihinde şunu yazdı:
1.
+      <para>
+       If true, initial data synchronization will be performed in binary format
+      </para></entry>
It's not just the initial table sync right? The table sync can happen
at any other point of time when ALTER SUBSCRIPTION ... REFRESH
PUBLICATION WITH (copy = true) is run.
How about - "If true, the subscriber requests publication for
pre-existing data in binary format"?

I changed it as you suggested.
I sometimes feel like the phrase "initial sync" is used for initial sync of a table, not a subscription. Or table syncs triggered by ALTER SUBSCRIPTION are ignored in some places where "initial sync" is used.

2.
Perhaps, this should cover the recommended cases for enabling this new
option - something like below (may not need to have exact wording, but
the recommended cases?):
"It is recommended to enable this option only when 1) the column data
types have appropriate binary send/receive functions, 2) not
replicating between different major versions or different platforms,
3) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on), 4) both publisher and subscriber supports COPY with binary
option, otherwise the table copy can fail."

I added a line stating that binary format is less portable across machine architectures and versions as stated in COPY [1].
I don't think we should add line saying "recommended", but state the restrictions clearly instead. It's also similar in COPY docs as well.
I think the explanation now covers all your points, right?
 
Jelte Fennema <postgres@jeltef.nl>, 28 Şub 2023 Sal, 16:25 tarihinde şunu yazdı:
> 3. I think the newly added tests must verify if the binary COPY is
> picked up when enabled. Perhaps, looking at the publisher's server log
> for 'COPY ... WITH BINARY format'? Maybe it's an overkill, otherwise,
> we have no way of testing that the option took effect.

Another way to test that BINARY is enabled could be to trigger one
of the failure cases.

Yes, there is already a failure case for binary copy which resolves with swithcing binary_copy to false.
But I also added checks for publisher logs now too.  



Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Dilip Kumar
Date:
On Mon, Feb 27, 2023 at 2:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Feb 20, 2023 at 3:37 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > On Thu, Feb 16, 2023 8:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > So, doesn't this mean that there is no separate failure mode during
> > > the initial copy? I am clarifying this to see if the patch really
> > > needs a separate copy_format option for initial sync?
> > >
> >
> > In the case that the data type doesn't have binary output function, for apply
> > phase, the column will be sent in text format (see logicalrep_write_tuple()) and
> > it works fine. But with copy_format = binary, the walsender exits with an
> > error.
> >
> ...
> ...
> >
> > Then I got the following error in the publisher log.
> >
> > walsender ERROR:  no binary output function available for type public.myvarchar
> > walsender STATEMENT:  COPY public.tbl1 (a) TO STDOUT  WITH (FORMAT binary)
> >
>
> Thanks for sharing the example. I think to address this user can
> create a SUBSCRIPTION with 'binary = false' and then after the initial
> copy enables it with ALTER SUBSCRIPTION.  Personally, I feel it is not
> required to have a separate option to allow copy in binary mode. Note,
> where there is some use for it but having more options for similar
> work is also confusing as users need to pay attention to different
> options and their values. It won't be difficult to add such an option
> in the future if we see such cases and or users specifically require
> something like this.

I agree with this thought, basically adding an extra option will
always complicate things for the user.  And logically it doesn't make
much sense to copy data in text mode and then stream in binary mode
(except in some exception cases and for that, we can always alter the
subscription).  So IMHO it makes more sense that if the binary option
is selected then ideally it should choose to do the initial sync also
in the binary mode.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Allow logical replication to copy tables in binary format

From
Bharath Rupireddy
Date:
On Wed, Mar 1, 2023 at 4:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > > walsender ERROR:  no binary output function available for type public.myvarchar
> > > walsender STATEMENT:  COPY public.tbl1 (a) TO STDOUT  WITH (FORMAT binary)
> > >
> >
> > Thanks for sharing the example. I think to address this user can
> > create a SUBSCRIPTION with 'binary = false' and then after the initial
> > copy enables it with ALTER SUBSCRIPTION.  Personally, I feel it is not
> > required to have a separate option to allow copy in binary mode. Note,
> > where there is some use for it but having more options for similar
> > work is also confusing as users need to pay attention to different
> > options and their values. It won't be difficult to add such an option
> > in the future if we see such cases and or users specifically require
> > something like this.
>
> I agree with this thought, basically adding an extra option will
> always complicate things for the user.  And logically it doesn't make
> much sense to copy data in text mode and then stream in binary mode
> (except in some exception cases and for that, we can always alter the
> subscription).  So IMHO it makes more sense that if the binary option
> is selected then ideally it should choose to do the initial sync also
> in the binary mode.

I think I was suggesting earlier to use a separate option for binary
table sync copy based on my initial knowledge of binary COPY. Now that
I have a bit more understanding of binary COPY and subscription's
existing binary option, +1 for using the same option for table sync
too.

If used the existing subscription binary option for the table sync,
there can be following possibilities for the users:
1. users might want to enable the binary option for table sync and
disable it for subsequent replication
2. users might want to enable the binary option for both table sync
and for subsequent replication
3. users might want to disable the binary option for table sync and
enable it for subsequent replication
4. users might want to disable binary option for both table sync and
for subsequent replication

Binary copy use-cases are a bit narrower compared to the existing
subscription binary option, it works only if:
a) the column data types have appropriate binary send/receive functions
b) not replicating between different major versions or different platforms
c) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on)
d) both publisher and subscriber supports COPY with binary option

Now if one enabled the binary option for table sync, that means, they
must have ensured all (a), (b), (c), and (d) are met. The point is if
one decides to use binary copy for table sync, it means that the
subsequent binary replication works too without any problem. If
required, one can disable it for normal replication i.e. post-table
sync.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 1 Mar 2023 Çar, 15:02 tarihinde şunu yazdı:
On Wed, Mar 1, 2023 at 4:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I agree with this thought, basically adding an extra option will
> always complicate things for the user.  And logically it doesn't make
> much sense to copy data in text mode and then stream in binary mode
> (except in some exception cases and for that, we can always alter the
> subscription).  So IMHO it makes more sense that if the binary option
> is selected then ideally it should choose to do the initial sync also
> in the binary mode.

I agree that copying in text then streaming in binary does not have a good use-case.

I think I was suggesting earlier to use a separate option for binary
table sync copy based on my initial knowledge of binary COPY. Now that
I have a bit more understanding of binary COPY and subscription's
existing binary option, +1 for using the same option for table sync
too.

If used the existing subscription binary option for the table sync,
there can be following possibilities for the users:
1. users might want to enable the binary option for table sync and
disable it for subsequent replication
2. users might want to enable the binary option for both table sync
and for subsequent replication
3. users might want to disable the binary option for table sync and
enable it for subsequent replication
4. users might want to disable binary option for both table sync and
for subsequent replication

Binary copy use-cases are a bit narrower compared to the existing
subscription binary option, it works only if:
a) the column data types have appropriate binary send/receive functions
b) not replicating between different major versions or different platforms
c) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on)
d) both publisher and subscriber supports COPY with binary option

Now if one enabled the binary option for table sync, that means, they
must have ensured all (a), (b), (c), and (d) are met. The point is if
one decides to use binary copy for table sync, it means that the
subsequent binary replication works too without any problem. If
required, one can disable it for normal replication i.e. post-table
sync.

That was my intention in the beginning with this patch. Then the new option also made some sense at some point, and I added copy_binary option according to reviews.
The earlier versions of the patch didn't have that. Without the new option, this patch would also be smaller.

But before changing back to the point where these are all tied to binary option without a new option, I think we should decide if that's really the ideal way to do it.
I believe that the patch is all good now with the binary_copy option which is not tied to anything, explanations in the doc and separate tests etc.
But I also agree that binary=true should make everything in binary and binary=false should do them in text format. It makes more sense.

Best,
--
Melih Mutlu
Microsoft

RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Melih,

If we do not have to treat the case Shi pointed out[1] as code-level, I agreed to
same option binary because it is simpler. I read the use-cases addressed by Bharath[2]
and I cannot find advantage for case 1 and 3 expect the case that binary functions
are not implemented.
Previously I said that the combination of "copy_format = binary" and "copy_data = false"
seemed strange[3], but this approach could solve it and other related ones
automatically.

I think we should add description to doc that it is more likely happen to fail
the initial copy user should enable binary format after synchronization if
tables have original datatype.

[1]:
https://www.postgresql.org/message-id/OSZPR01MB6310B58F069FF8E148B247FDFDA49%40OSZPR01MB6310.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/message-id/CALj2ACXiUsJoXt%3DfMpa4yYseB5h3un_syVh-J3RxL4-6r9Dx2A%40mail.gmail.com
[3]:
https://www.postgresql.org/message-id/TYAPR01MB5866968CF42FBAB73E8EEDF8F5AA9%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 1 Mar 2023 Çar, 18:40 tarihinde şunu yazdı:
Dear Melih,

If we do not have to treat the case Shi pointed out[1] as code-level, I agreed to
same option binary because it is simpler.

How is this an issue if we let the binary option do binary copy and not an issue if we have a separate copy_binary option?
You can easily have the similar errors when you set copy_binary=true if a type is missing binary send/receive functions.
And also, as Amit mentioned, the same issue can easily be avoided if binary=false until the initial sync is done. It can be set to true later.
 
I read the use-cases addressed by Bharath[2]
and I cannot find advantage for case 1 and 3 expect the case that binary functions
are not implemented.

Note that case 3 is already how it works on HEAD. Its advantages, as you already mentioned, is when some types are missing the binary functions.
I think that's why case 3 should be allowed even if a new option is added or not.

Previously I said that the combination of "copy_format = binary" and "copy_data = false"
seemed strange[3], but this approach could solve it and other related ones
automatically.

I think it is quite similar to the case where binary=true and enabled=false. In that case, the format is set but the subscription does not replicate anything. And this is allowed.
copy_binary=true and copy_data=false combination also sets the copy format but does not copy anything. Even if any table will not be copied at that moment, tables which might be added later might need to be copied (by ALTER SUBSCRIPTION). And setting the copy format beforehand can be useful in such cases. 
 
I think we should add description to doc that it is more likely happen to fail
the initial copy user should enable binary format after synchronization if
tables have original datatype.

I tried to explain when binary copy can cause failures in the doc. What exactly do you think is missing? 
 
Best,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Thu, Mar 2, 2023 at 5:10 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 1 Mar 2023 Çar, 18:40 tarihinde şunu yazdı:
>>
>> Dear Melih,
>>
>> If we do not have to treat the case Shi pointed out[1] as code-level, I agreed to
>> same option binary because it is simpler.
>
>
> How is this an issue if we let the binary option do binary copy and not an issue if we have a separate copy_binary
option?
> You can easily have the similar errors when you set copy_binary=true if a type is missing binary send/receive
functions.
> And also, as Amit mentioned, the same issue can easily be avoided if binary=false until the initial sync is done. It
canbe set to true later. 
>
>>

IIUC most people seem to be coming down in favour of there being a
single unified option (the existing 'binary==true/false) which would
apply to both the COPY and the data replication parts.

I also agree
- Yes, it is simpler.
- Yes, there are various workarounds in case the COPY part failed

But, AFAICT the main question remains unanswered -- Are we happy to
break existing applications already using binary=true. E.g. I think
there might be cases where applications are working *only* because
their binary=true is internally (and probably unbeknownst to the user)
reverting to text. So if we unified everything under one 'binary'
option then binary=true will force COPY binary so now some previously
working applications will get COPY errors requiring workarounds. Is
that acceptable?

TBH I am not sure anymore if the complications justify the patch.

It seems we have to choose from 2 bad choices:
- separate options = this works but would be more confusing for the user
- unified option = this would be simpler and faster, but risks
breaking existing applications currently using 'binary=true'

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Thu, Mar 2, 2023 at 7:27 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 5:10 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> > Hi,
> >
> > Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 1 Mar 2023 Çar, 18:40 tarihinde şunu yazdı:
> >>
> >> Dear Melih,
> >>
> >> If we do not have to treat the case Shi pointed out[1] as code-level, I agreed to
> >> same option binary because it is simpler.
> >
> >
> > How is this an issue if we let the binary option do binary copy and not an issue if we have a separate copy_binary
option?
> > You can easily have the similar errors when you set copy_binary=true if a type is missing binary send/receive
functions.
> > And also, as Amit mentioned, the same issue can easily be avoided if binary=false until the initial sync is done.
Itcan be set to true later. 
> >
> >>
>
> IIUC most people seem to be coming down in favour of there being a
> single unified option (the existing 'binary==true/false) which would
> apply to both the COPY and the data replication parts.
>
> I also agree
> - Yes, it is simpler.
> - Yes, there are various workarounds in case the COPY part failed
>
> But, AFAICT the main question remains unanswered -- Are we happy to
> break existing applications already using binary=true. E.g. I think
> there might be cases where applications are working *only* because
> their binary=true is internally (and probably unbeknownst to the user)
> reverting to text. So if we unified everything under one 'binary'
> option then binary=true will force COPY binary so now some previously
> working applications will get COPY errors requiring workarounds. Is
> that acceptable?
>

I think one can look at this from another angle also where users would
be expecting that when binary = true and copy_data = true, all the
data transferred between publisher and subscriber should be in binary
format. Users have a workaround to set binary=true only after the
initial sync. Also, if at all, the behaviour change would be after
major version upgrade which shouldn't be a problem.

> TBH I am not sure anymore if the complications justify the patch.
>
> It seems we have to choose from 2 bad choices:
> - separate options = this works but would be more confusing for the user
> - unified option = this would be simpler and faster, but risks
> breaking existing applications currently using 'binary=true'
>

I would prefer a unified option as apart from other things you and
others mentioned that will be less of a maintenance burden in the
future.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Kuntal Ghosh
Date:
On Thu, Mar 2, 2023 at 10:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > TBH I am not sure anymore if the complications justify the patch.
> >
> > It seems we have to choose from 2 bad choices:
> > - separate options = this works but would be more confusing for the user
> > - unified option = this would be simpler and faster, but risks
> > breaking existing applications currently using 'binary=true'
> >
>
> I would prefer a unified option as apart from other things you and
> others mentioned that will be less of a maintenance burden in the
> future.
>
+1
When someone sets the binary=true while creating a subscription, the
expectation would be that the data transfer will happen in binary mode
if binary in/out functions are available. As per current
implementation, that's not happening in the table-sync phase. So, it
makes sense to fix that behaviour in a major version release.
For the existing applications that are using (or unknowingly misusing)
the feature, as Amit mentioned, they have a workaround.


--
Thanks & Regards,
Kuntal Ghosh



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Thu, Mar 2, 2023 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 7:27 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
...
> > IIUC most people seem to be coming down in favour of there being a
> > single unified option (the existing 'binary==true/false) which would
> > apply to both the COPY and the data replication parts.
> >
> > I also agree
> > - Yes, it is simpler.
> > - Yes, there are various workarounds in case the COPY part failed
> >
> > But, AFAICT the main question remains unanswered -- Are we happy to
> > break existing applications already using binary=true. E.g. I think
> > there might be cases where applications are working *only* because
> > their binary=true is internally (and probably unbeknownst to the user)
> > reverting to text. So if we unified everything under one 'binary'
> > option then binary=true will force COPY binary so now some previously
> > working applications will get COPY errors requiring workarounds. Is
> > that acceptable?
> >
>
> I think one can look at this from another angle also where users would
> be expecting that when binary = true and copy_data = true, all the
> data transferred between publisher and subscriber should be in binary
> format. Users have a workaround to set binary=true only after the
> initial sync. Also, if at all, the behaviour change would be after
> major version upgrade which shouldn't be a problem.
>
> > TBH I am not sure anymore if the complications justify the patch.
> >
> > It seems we have to choose from 2 bad choices:
> > - separate options = this works but would be more confusing for the user
> > - unified option = this would be simpler and faster, but risks
> > breaking existing applications currently using 'binary=true'
> >
>
> I would prefer a unified option as apart from other things you and
> others mentioned that will be less of a maintenance burden in the
> future.

My concern was mostly just about the potential to break the behaviour
of existing binary=true applications in some edge cases.

If you are happy that doing so shouldn't be a problem, then I am also
+1 to use the unified option.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 1, 2023 at 7:58 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 1 Mar 2023 Çar, 15:02 tarihinde şunu yazdı:
>>
>
> That was my intention in the beginning with this patch. Then the new option also made some sense at some point, and I
addedcopy_binary option according to reviews. 
> The earlier versions of the patch didn't have that. Without the new option, this patch would also be smaller.
>
> But before changing back to the point where these are all tied to binary option without a new option, I think we
shoulddecide if that's really the ideal way to do it. 
>

As per what I could read in this thread, most people prefer to use the
existing binary option rather than inventing a new way (option) to
binary copy in the initial sync phase. Do you agree? If so, it is
better to update the patch accordingly as this is the last CF for this
release and we have a limited time left.

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Melih,

>> I think we should add description to doc that it is more likely happen to fail
>> the initial copy user should enable binary format after synchronization if
>> tables have original datatype.
>
> I tried to explain when binary copy can cause failures in the doc. What exactly
> do you think is missing?

I assumed here that "copy_format" and "binary" were combined into one option.
Currently the binary option has descriptions :

```
Even when this option is enabled, only data types having binary send and receive functions will be transferred in
binary
```

But this is not suitable for initial data sync, as we knew. I meant to say that
it must be updated if options are combined.

Note that following is not necessary for PG16, just an improvement for newer version.

Is it possible to automatically switch the binary option from 'true' to 'false'
when data transfer fails? As we found that while synchronizing the initial data
with binary format may lead another error, and it can be solved if the options
is changed. When DBAs check a log after synchronization and find the output like
"binary option was changed and worker will restart..." or something, they can turn
"binary" on again.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

On 7 Mar 2023 Tue at 04:10 Amit Kapila <amit.kapila16@gmail.com> wrote:
As per what I could read in this thread, most people prefer to use the
existing binary option rather than inventing a new way (option) to
binary copy in the initial sync phase. Do you agree?

I agree.
What do you think about the version checks? I removed any kind of check since it’s currently a different option. Should we check publisher version before doing binary copy to ensure that the publisher node supports binary option of COPY command?

Thanks,
-- 
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 8, 2023 at 6:17 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> On 7 Mar 2023 Tue at 04:10 Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> As per what I could read in this thread, most people prefer to use the
>> existing binary option rather than inventing a new way (option) to
>> binary copy in the initial sync phase. Do you agree?
>
>
> I agree.
> What do you think about the version checks? I removed any kind of check since it’s currently a different option.
Shouldwe check publisher version before doing binary copy to ensure that the publisher node supports binary option of
COPYcommand? 
>

It is not clear to me which version check you wanted to add because we
seem to have a binary option in COPY from the time prior to logical
replication. I feel we need a publisher version 14 check as that is
where we start to support binary mode transfer in logical replication.
See the check in function libpqrcv_startstreaming().

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Attached v12 with a unified option.

Setting binary = true now allows the initial sync to happen in binary format.

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here are some review comments for patch v12-0001

======
General

1.
There is no new test code. Are we sure that there are already
sufficient TAP tests doing binary testing with/without copy_data and
covering all the necessary combinations?

======
Commit Message

2.
Without this patch, table are copied in text format even if the
subscription is created with binary option enabled. This patch allows
logical replication
to perform in binary format starting from initial sync. When binary
format is beneficial
to use, allowing the subscription to copy tables in binary in table
sync phase may
reduce the time spent on copy depending on column types.

~

a. "table" -> "tables"

b. I don't think need to keep referring to the initial table
synchronization many times.

SUGGESTION
Without this patch, table synchronization COPY uses text format even
if the subscription is created with the binary option enabled. Copying
tables in binary format may reduce the time spent depending on column
types.

======
doc/src/sgml/logical-replication.sgml

3.
@@ -241,10 +241,11 @@
    types of the columns do not need to match, as long as the text
    representation of the data can be converted to the target type.  For
    example, you can replicate from a column of type <type>integer</type> to a
-   column of type <type>bigint</type>.  The target table can also have
-   additional columns not provided by the published table.  Any such columns
-   will be filled with the default value as specified in the definition of the
-   target table.
+   column of type <type>bigint</type>.  However, replication in
binary format is
+   type specific and does not allow to replicate data between different types
+   according to its restrictions.  The target table can also have additional
+   columns not provided by the published table.  Any such columns
will be filled
+   with the default value as specified in the definition of the target table.
   </para>

I am not sure if there is enough information here about the binary restrictions.
- e.g. does the column order matter for tablesync COPY binary?
- e.g. no mention of the send/receive function requirements of tablesync COPY.

But maybe here is not the place to write all such details anyway;
Instead of duplicating information IMO here should give a link to the
CREATE SUBSCRIPTION notes -- something like:

SUGGESTION
Note that replication in binary format is more restrictive. See CREATE
SUBSCRIPTION binary subscription parameter for details.

======
doc/src/sgml/ref/create_subscription.sgml

4.
@@ -189,11 +189,17 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
         <term><literal>binary</literal> (<type>boolean</type>)</term>
         <listitem>
          <para>
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
-          The default is <literal>false</literal>.
-          Even when this option is enabled, only data types having
-          binary send and receive functions will be transferred in binary.
+          Specifies whether the subscription will both copy the initial data to
+          synchronize relations and request the publisher to send the data in
+          binary format (as opposed to text). The default is
<literal>false</literal>.
+          Binary format can be faster than the text format, but it is
less portable
+          across machine architectures and PostgreSQL versions.
Binary format is
+          also very data type specific, it will not allow copying
between different
+          column types as opposed to text format. Even when this
option is enabled,
+          only data types having binary send and receive functions will be
+          transferred in binary. Note that the initial synchronization requires
+          all data types to have binary send and receive functions, otherwise
+          the synchronization will fail.
          </para>

There seems to be a small ambiguity because this wording comes more
from our code-level understanding, rather than what the user sees.
e.g. I think "will be transferred" could mean also the COPY phase as
far as the user is concerned. Maybe some slight rewording can help.

There is also some use of "copy" (e.g. "will not allow copying") which
can be confused with the initial tablesync phase which is not what was
intended.

SUGGESTION
Specifies whether the subscription will request the publisher to send
the data in binary format (as opposed to text). The default is
<literal>false</literal>. Any initial table synchronization copy [link
to copy_data] also uses the same format. Using binary format can be
faster than the text format, but it is less portable across machine
architectures and PostgreSQL versions. Binary format is also data type
specific, it will not allow transfer between different column types as
opposed to text format. Even when the binary option is enabled, only
data types having binary send/receive functions can be transferred in
binary format. If these functions don't exist then the publisher send
will revert to sending text format. Note that the binary initial table
synchronization copy requires all data types to have binary
send/receive functions, otherwise it will fail.

======
src/backend/replication/logical/tablesync.c

5.
+
+ /*
+ * If the publisher is v14 or later, copy data in the required data format.
+ * If the publisher version is earlier, it doesn't support COPY with binary
+ * option.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
+ MySubscription->binary)
+ {
+ appendStringInfo(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options, makeDefElem("format", (Node *)
makeString("binary"), -1));
+ }
+

5a.
I didn't think you need to say "copy data in the required data format".

Can’t you just say like:

SUGGESTION
If the publisher version is earlier than v14, it COPY command doesn't
support the binary option.

~

5b.
Does this also need to be mentioned as a note on the CREATE
SUBSCRIPTION docs page? e.g. COPY binary from server versions < v14
will work because it will just be skipping anyway and use text.

======
doc/src/sgml/ref/alter_subscription.sgml

6.
The v12 patch does not update the ALTER SUBSCRIPTION DOCS, but I
thought perhaps there should be some reference from the ALTER
copy_data back to the main CREATE SUBSCRIPTION page because if the
user leaves the default (copy_data=true) then the ALTER might cause
some unexpected errors is the subscription was already using binary
format.


------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Tue, Mar 14, 2023 at 11:06 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v12-0001
>
> ======
> General
>
> 1.
> There is no new test code. Are we sure that there are already
> sufficient TAP tests doing binary testing with/without copy_data and
> covering all the necessary combinations?
>

Oops. Please ignore this comment. Somehow I missed seeing those
032_binary_copy.pl tests earlier.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here are some review comments for patch v12-0001 (test code only)

======
src/test/subscription/t/014_binary.pl

# Check the synced data on subscribers

~

There are a couple of comments like the above that say: "on
subscribers" instead of "on subscriber".

~~~

I wondered if it might be useful to also include another test case
that demonstrates you can still use COPY with binary format even when
the table column orders are different, so long as the same names have
the same data types. In other words, it shows apparently, the binary
row COPY processes per column; not one single binary data copy
spanning all the replicated columns.

For example,

# --------------------------------
# Test syncing tables with different column order
$node_publisher->safe_psql(
    'postgres', qq(
    CREATE TABLE public.test_col_order (
        a bigint, b int
    );
    INSERT INTO public.test_col_order (a,b)
        VALUES (1,2),(3,4);
    ));

$node_subscriber->safe_psql(
    'postgres', qq(
    CREATE TABLE public.test_col_order (
        b int, a bigint
    );
    ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
    ));

# Ensure nodes are in sync with each other
$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');

# Check the synced data on subscribers
$result = $node_subscriber->safe_psql('postgres', 'SELECT a,b FROM
public.test_col_order;');

is( $result, '1|2
3|4', 'check synced data on subscriber for different column order and
binary = true');
# --------------------------------

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Tue, Mar 14, 2023 at 6:18 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, Mar 14, 2023 at 11:06 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Here are some review comments for patch v12-0001
> >
> > ======
> > General
> >
> > 1.
> > There is no new test code. Are we sure that there are already
> > sufficient TAP tests doing binary testing with/without copy_data and
> > covering all the necessary combinations?
> >
>
> Oops. Please ignore this comment. Somehow I missed seeing those
> 032_binary_copy.pl tests earlier.
>

I think it would better to write the tests for this feature in the
existing test file 014_binary as that would save some time for node
setup/shutdown and also that would be a more appropriate place for
these tests.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Attached v13.

Peter Smith <smithpb2250@gmail.com>, 14 Mar 2023 Sal, 03:07 tarihinde şunu yazdı:
Here are some review comments for patch v12-0001

Thanks for reviewing. I tried to make explanations in docs better according to your comments.
What do you think?

 Amit Kapila <amit.kapila16@gmail.com>, 14 Mar 2023 Sal, 06:17 tarihinde şunu yazdı:
I think it would better to write the tests for this feature in the
existing test file 014_binary as that would save some time for node
setup/shutdown and also that would be a more appropriate place for
these tests.

I removed 032_binary_copy.pl and added those tests into 014_binary.pl
Also added the case with different column order as Peter suggested.

Best,
--
Melih Mutlu
Microsoft
Attachment

RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
On Tuesday, March 14, 2023 8:02 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> Attached v13.
Hi,


Thanks for sharing v13. Few minor review comments.
(1) create_subscription.sgml

+          column types as opposed to text format. Even when this option is enabled,
+          only data types having binary send and receive functions will be
+          transferred in binary. Note that the initial synchronization requires

(1-1)

I think it's helpful to add a reference for the description about send and receive functions (e.g. to the page of
CREATETYPE). 

(1-2)

Also, it would be better to have a cross reference from there to this doc as one paragraph probably in "Notes". I
suggestedthis, because send and receive functions are described as "optional" there and missing them leads to error in
thecontext of binary table synchronization. 

(3) copy_table()

+       /*
+        * If the publisher version is earlier than v14, it COPY command doesn't
+        * support the binary option.
+        */

This sentence doesn't look correct grammatically. We can replace "it COPY command" with "subscription" for example.
Kindlyplease fix it. 


Best Regards,
    Takamichi Osumi




Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Tue, Mar 14, 2023 at 8:50 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, March 14, 2023 8:02 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> (3) copy_table()
>
> +       /*
> +        * If the publisher version is earlier than v14, it COPY command doesn't
> +        * support the binary option.
> +        */
>
> This sentence doesn't look correct grammatically. We can replace "it COPY command" with "subscription" for example.
Kindlyplease fix it. 
>

How about something like: "The binary option for replication is
supported since v14."?

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here are some review comments for v13-0001

======
doc/src/sgml/logical-replication.sgml

1.
@@ -241,10 +241,13 @@
    types of the columns do not need to match, as long as the text
    representation of the data can be converted to the target type.  For
    example, you can replicate from a column of type <type>integer</type> to a
-   column of type <type>bigint</type>.  The target table can also have
-   additional columns not provided by the published table.  Any such columns
-   will be filled with the default value as specified in the definition of the
-   target table.
+   column of type <type>bigint</type>.  However, replication in
binary format is
+   type specific and does not allow to replicate data between different types
+   according to its restrictions (See <literal>binary</literal> option of
+   <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>
+   for details).  The target table can also have additional columns
not provided
+   by the published table.  Any such columns will be filled with the default
+   value as specified in the definition of the target table.
   </para>

I don’t really think we should mention details of what the binary
problems are here, because then:
i) it is just duplicating information already on the CREATE SUBSCRIPTION page
ii) you will miss some restrictions. (e.g. here you said something
about "type specific" but didn't mention send/receive functions would
be mandatory for the copy_data option)

That's why in the previous v12 review [1] (comment #3) I suggested
that this page should just say something quite generic like "However,
replication in binary format is more restrictive", and link back to
the other page which has all the gory details.

======
doc/src/sgml/ref/alter_subscription.sgml

2.
My previous v12 review [1] (comment #6) suggested maybe updating this
page. But it was not done in v13. Did you accidentally miss the review
comment, or chose not to do it?

======
doc/src/sgml/ref/create_subscription.sgml

3.
          <para>
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
-          The default is <literal>false</literal>.
-          Even when this option is enabled, only data types having
-          binary send and receive functions will be transferred in binary.
+          Specifies whether the subscription will request the publisher to send
+          the data in binary format (as opposed to text). The default is
+          <literal>false</literal>. Any initial table synchronization copy
+          (see <literal>copy_data</literal>) also uses the same format. Binary
+          format can be faster than the text format, but it is less portable
+          across machine architectures and PostgreSQL versions.
Binary format is
+          also very data type specific, it will not allow copying
between different
+          column types as opposed to text format. Even when this
option is enabled,
+          only data types having binary send and receive functions will be
+          transferred in binary. Note that the initial synchronization requires
+          all data types to have binary send and receive functions, otherwise
+          the synchronization will fail.
          </para>


BEFORE
Binary format is also very data type specific, it will not allow
copying between different column types as opposed to text format.

SUGGESTION (worded more similar to what is already on the COPY page [2])
Binary format is very data type specific; for example, it will not
allow copying from a smallint column to an integer column, even though
that would work fine in text format.


~~~

4.

+         <para>
+          If the publisher is a <productname>PostgreSQL</productname> version
+          before 14, then any initial table synchronization will use
text format
+          even if this option is enabled.
+         </para>

IMO it will be clearer to explicitly say the option instead of 'this option'.

SUGGESTION
If the publisher is a <productname>PostgreSQL</productname> version
before 14, then any initial table synchronization will use text format
even if <literal>binary = true</literal>.


======
src/backend/replication/logical/tablesync.c

5.
+
+ /*
+ * If the publisher version is earlier than v14, it COPY command doesn't
+ * support the binary option.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
+ MySubscription->binary)
+ {
+ appendStringInfo(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options, makeDefElem("format", (Node *)
makeString("binary"), -1));
+ }

Sorry, I gave a poor review comment for this previously. Now I have
revisited all the thread discussions about version checking. I feel
that some explanation should be given in the code comment so that
future readers of this code can understand why you decided to use v14
checking.

Something like this:

SUGGESTION
If the publisher version is earlier than v14, we use text format COPY.
Note - In fact COPY syntax "WITH (FORMAT binary)" has existed since
v9, but since the logical replication binary mode transfer was not
introduced until v14 it was decided to check using the later version.

------
[1] PS v12 review -
https://www.postgresql.org/message-id/CAHut%2BPsAS8HpjdbDv%2BRM-YUJaLO0UC3f5be%2BqN296%2BGrewsGXg%40mail.gmail.com
[2] pg docs COPY - https://www.postgresql.org/docs/current/sql-copy.html
[3] pg docs COPY v9.0 - https://www.postgresql.org/docs/9.0/sql-copy.html

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 15, 2023 at 11:52 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> ======
> src/backend/replication/logical/tablesync.c
>
> 5.
> +
> + /*
> + * If the publisher version is earlier than v14, it COPY command doesn't
> + * support the binary option.
> + */
> + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
> + MySubscription->binary)
> + {
> + appendStringInfo(&cmd, " WITH (FORMAT binary)");
> + options = lappend(options, makeDefElem("format", (Node *)
> makeString("binary"), -1));
> + }
>
> Sorry, I gave a poor review comment for this previously. Now I have
> revisited all the thread discussions about version checking. I feel
> that some explanation should be given in the code comment so that
> future readers of this code can understand why you decided to use v14
> checking.
>
> Something like this:
>
> SUGGESTION
> If the publisher version is earlier than v14, we use text format COPY.
>

I think this isn't explicit that we supported the binary format since
v14. So, I would prefer my version of the comment as suggested in the
previous email.

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Takamichi Osumi (Fujitsu)"
Date:
Hi,


On Wednesday, March 15, 2023 2:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Mar 14, 2023 at 8:50 PM Takamichi Osumi (Fujitsu)
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Tuesday, March 14, 2023 8:02 PM Melih Mutlu
> <m.melihmutlu@gmail.com> wrote:
> > (3) copy_table()
> >
> > +       /*
> > +        * If the publisher version is earlier than v14, it COPY command
> doesn't
> > +        * support the binary option.
> > +        */
> >
> > This sentence doesn't look correct grammatically. We can replace "it COPY
> command" with "subscription" for example. Kindly please fix it.
> >
> 
> How about something like: "The binary option for replication is supported since
> v14."?
Yes, this looks best to me. I agree with this suggestion.


Best Regards,
    Takamichi Osumi


Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Attached v13.
>

I have a question related to the below test in the patch:

+# Setting binary to false should allow syncing
+$node_subscriber->safe_psql(
+    'postgres', qq(
+    ALTER SUBSCRIPTION tsub SET (binary = false);));
+
+# Ensure the COPY command is executed in text format on the publisher
+$node_publisher->wait_for_log(qr/LOG: ( [a-z0-9]+:)? COPY (.+)? TO STDOUT\n/);
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
+
+# Check the synced data on the subscriber
+$result = $node_subscriber->safe_psql('postgres', 'SELECT a FROM
test_mismatching_types ORDER BY a;');
+
+is( $result, '1
+2', 'check synced data on subscriber with binary = false');
+
+# Test syncing tables with different column order
+$node_publisher->safe_psql(
+    'postgres', qq(
+    CREATE TABLE public.test_col_order (
+        a bigint, b int
+    );
+    INSERT INTO public.test_col_order (a,b)
+        VALUES (1,2),(3,4);
+    ));

What purpose does this test serve w.r.t this patch? Before checking
the sync for different column orders, the patch has already changed
binary to false, so it doesn't seem to test the functionality of this
patch. Am, I missing something?

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Please see the attached patch.

Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com>, 14 Mar 2023 Sal, 18:20 tarihinde şunu yazdı:
(1) create_subscription.sgml

+          column types as opposed to text format. Even when this option is enabled,
+          only data types having binary send and receive functions will be
+          transferred in binary. Note that the initial synchronization requires

(1-1)

I think it's helpful to add a reference for the description about send and receive functions (e.g. to the page of CREATE TYPE).

Done.
 

(1-2)

Also, it would be better to have a cross reference from there to this doc as one paragraph probably in "Notes". I suggested this, because send and receive functions are described as "optional" there and missing them leads to error in the context of binary table synchronization.

I'm not sure whether this is necessary. In case of missing send/receive functions, error logs are already clear about what's wrong and logical replication docs also explain what could go wrong with binary.
 
(3) copy_table()

+       /*
+        * If the publisher version is earlier than v14, it COPY command doesn't
+        * support the binary option.
+        */

This sentence doesn't look correct grammatically. We can replace "it COPY command" with "subscription" for example. Kindly please fix it.

Changed this with Amit's suggestion [1].



Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Please see v14 [1].

Peter Smith <smithpb2250@gmail.com>, 15 Mar 2023 Çar, 09:22 tarihinde şunu yazdı:
Here are some review comments for v13-0001

======
doc/src/sgml/logical-replication.sgml

1.
That's why in the previous v12 review [1] (comment #3) I suggested
that this page should just say something quite generic like "However,
replication in binary format is more restrictive", and link back to
the other page which has all the gory details.

You're right. Changed it with what you suggested.
 
2.
My previous v12 review [1] (comment #6) suggested maybe updating this
page. But it was not done in v13. Did you accidentally miss the review
comment, or chose not to do it?

Sorry, I missed this. Added a line leading to CREATE SUBSCRIPTION doc. 
 
======
doc/src/sgml/ref/create_subscription.sgml

3.
BEFORE
Binary format is also very data type specific, it will not allow
copying between different column types as opposed to text format.

SUGGESTION (worded more similar to what is already on the COPY page [2])
Binary format is very data type specific; for example, it will not
allow copying from a smallint column to an integer column, even though
that would work fine in text format.

Done.
 
4.
SUGGESTION
If the publisher is a <productname>PostgreSQL</productname> version
before 14, then any initial table synchronization will use text format
even if <literal>binary = true</literal>.

Done.
 
SUGGESTION
If the publisher version is earlier than v14, we use text format COPY.
Note - In fact COPY syntax "WITH (FORMAT binary)" has existed since
v9, but since the logical replication binary mode transfer was not
introduced until v14 it was decided to check using the later version.

Changed it as suggested here [2].


Thanks,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:

Amit Kapila <amit.kapila16@gmail.com>, 15 Mar 2023 Çar, 12:31 tarihinde şunu yazdı:
On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
 
What purpose does this test serve w.r.t this patch? Before checking
the sync for different column orders, the patch has already changed
binary to false, so it doesn't seem to test the functionality of this
patch. Am, I missing something?

I missed that binary has changed to false before testing column orders. I moved that test case up before changing binary to false.
Please see v14 [1].


Thanks,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
vignesh C
Date:
On Wed, 15 Mar 2023 at 15:30, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Please see the attached patch.

One comment:
1) There might be a chance the result order of select may vary as
"ORDER BY" is not specified,  Should we add "ORDER BY" as the table
has multiple rows:
+# Check the synced data on the subscriber
+$result = $node_subscriber->safe_psql('postgres', 'SELECT a,b FROM
+public.test_col_order;');
+
+is( $result, '1|2
+3|4', 'check synced data on subscriber for different column order');

Regards,
Vignesh



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

vignesh C <vignesh21@gmail.com>, 15 Mar 2023 Çar, 18:12 tarihinde şunu yazdı:
One comment:
1) There might be a chance the result order of select may vary as
"ORDER BY" is not specified,  Should we add "ORDER BY" as the table
has multiple rows:
+# Check the synced data on the subscriber
+$result = $node_subscriber->safe_psql('postgres', 'SELECT a,b FROM
+public.test_col_order;');
+
+is( $result, '1|2
+3|4', 'check synced data on subscriber for different column order');

Right, it needs to be ordered. Fixed.

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here are some review comments for v15-0001

======
doc/src/sgml/logical-replication.sgml

1.
+   target table. However, logical replication in binary format is
more restrictive,
+   see <literal>binary</literal> option of
+   <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>
+   for more details.

IMO  (and Chat-GPT agrees) the new text should be 2 sentences.

Also, I changed "more details" --> "details" because none are provided here,.

SUGGESTION
However, logical replication in <literal>binary</literal> format is
more restrictive. See the binary option of <link
linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link> for details.

======
doc/src/sgml/ref/alter_subscription.sgml

2.
+         <para>
+          See <literal>binary</literal> option of <xref
linkend="sql-createsubscription"/>
+          for details of copying pre-existing data in binary format.
+         </para>

Should the link should be defined more like you did above using the
<command> markup to get the better font?

SUGGESTION (also minor rewording)
See the <literal>binary</literal> option of <link
linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link> for details about copying pre-existing
data in binary format.

======
doc/src/sgml/ref/create_subscription.sgml

3.
          <para>
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
-          The default is <literal>false</literal>.
-          Even when this option is enabled, only data types having
-          binary send and receive functions will be transferred in binary.
+          Specifies whether the subscription will request the publisher to send
+          the data in binary format (as opposed to text). The default is
+          <literal>false</literal>. Any initial table synchronization copy
+          (see <literal>copy_data</literal>) also uses the same format. Binary
+          format can be faster than the text format, but it is less portable
+          across machine architectures and PostgreSQL versions. Binary format
+          is very data type specific; for example, it will not allow copying
+          from a smallint column to an integer column, even though that would
+          work fine in text format. Even when this option is enabled, only data
+          types having binary send and receive functions will be transferred in
+          binary. Note that the initial synchronization requires all data types
+          to have binary send and receive functions, otherwise the
synchronization
+          will fail (see <xref linkend="sql-createtype"/> for more about
+          send/receive functions).
          </para>

IMO that part saying "from a smallint column to an integer column"
should have <type></type> markups for "smallint" and "integer".

======
src/backend/replication/logical/tablesync.c

4.
+ /*
+ * The binary option for replication is supported since v14
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
+ MySubscription->binary)
+ {
+ appendStringInfo(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options, makeDefElem("format", (Node *)
makeString("binary"), -1));
+ }

Should this now be a single-line comment instead of spanning 3 lines?

======
src/test/subscription/t/014_binary.pl

5.
Everything looked OK to me, but the TAP file has only small comments
for each test step, which forces you to read everything from
top-to-bottom to understand what is going on. I felt it might be
easier to understand the tests if you add a few "bigger" comments just
to break the tests into the categories being tested. For example,
something like:


# ------------------------------------------------------
# Ensure binary mode also executes COPY in binary format
# ------------------------------------------------------

~

# --------------------------------------
# Ensure normal binary replication works
# --------------------------------------

~

# ------------------------------------------------------------------------------
# Use ALTER SUBSCRIPTION to change to text format and then back to binary format
# ------------------------------------------------------------------------------

~

# ---------------------------------------------------------------
# Test binary replication without and with send/receive functions
# ---------------------------------------------------------------

~

# ----------------------------------------------
# Test different column orders on pub/sub tables
# ----------------------------------------------

~

# -----------------------------------------------------
# Test mismatched column types with/without binary mode
# -----------------------------------------------------

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Wed, Mar 15, 2023 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 15, 2023 at 11:52 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > ======
> > src/backend/replication/logical/tablesync.c
> >
> > 5.
> > +
> > + /*
> > + * If the publisher version is earlier than v14, it COPY command doesn't
> > + * support the binary option.
> > + */
> > + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
> > + MySubscription->binary)
> > + {
> > + appendStringInfo(&cmd, " WITH (FORMAT binary)");
> > + options = lappend(options, makeDefElem("format", (Node *)
> > makeString("binary"), -1));
> > + }
> >
> > Sorry, I gave a poor review comment for this previously. Now I have
> > revisited all the thread discussions about version checking. I feel
> > that some explanation should be given in the code comment so that
> > future readers of this code can understand why you decided to use v14
> > checking.
> >
> > Something like this:
> >
> > SUGGESTION
> > If the publisher version is earlier than v14, we use text format COPY.
> >
>
> I think this isn't explicit that we supported the binary format since
> v14. So, I would prefer my version of the comment as suggested in the
> previous email.
>

Hmm, but my *full* suggestion was bigger than what is misquoted above,
and it certainly did say " logical replication binary mode transfer
was not introduced until v14".

SUGGESTION
If the publisher version is earlier than v14, we use text format COPY.
Note - In fact COPY syntax "WITH (FORMAT binary)" has existed since
v9, but since the logical replication binary mode transfer was not
introduced until v14 it was decided to check using the later version.

~~

Anyway, the shortened comment as in the latest v15 patch is fine by me too.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Thu, Mar 16, 2023 2:26 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> 
> Right, it needs to be ordered. Fixed.
> 

Hi,

Thanks for updating the patch. I tested some cases like toast data, combination
of row filter and column lists, and it works well.

Here is a comment:

+# Ensure the COPY command is executed in binary format on the publisher
+$node_publisher->wait_for_log(qr/LOG: ( [a-z0-9]+:)? COPY (.+)? TO STDOUT WITH \(FORMAT binary\)/);

The test failed with `log_error_verbosity = verbose` because it couldn't match
the following log:
2023-03-16 09:45:50.096 CST [2499415] pg_16398_sync_16391_7210954376230900539 LOG:  00000: statement: COPY
public.test_arrays(a, b, c) TO STDOUT WITH (FORMAT binary)
 

I think we should make it pass, see commit 19408aae7f.
Should it be changed to:

$node_publisher->wait_for_log(qr/LOG: ( [A-Z0-9]+:)? statement: COPY (.+)? TO STDOUT WITH \(FORMAT binary\)/);

Besides, for the same reason, this line also needs to be modified.
+$node_publisher->wait_for_log(qr/LOG: ( [a-z0-9]+:)? COPY (.+)? TO STDOUT\n/);

Regards,
Shi Yu

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Thu, Mar 16, 2023 at 5:59 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, Mar 15, 2023 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Mar 15, 2023 at 11:52 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > ======
> > > src/backend/replication/logical/tablesync.c
> > >
> > > 5.
> > > +
> > > + /*
> > > + * If the publisher version is earlier than v14, it COPY command doesn't
> > > + * support the binary option.
> > > + */
> > > + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
> > > + MySubscription->binary)
> > > + {
> > > + appendStringInfo(&cmd, " WITH (FORMAT binary)");
> > > + options = lappend(options, makeDefElem("format", (Node *)
> > > makeString("binary"), -1));
> > > + }
> > >
> > > Sorry, I gave a poor review comment for this previously. Now I have
> > > revisited all the thread discussions about version checking. I feel
> > > that some explanation should be given in the code comment so that
> > > future readers of this code can understand why you decided to use v14
> > > checking.
> > >
> > > Something like this:
> > >
> > > SUGGESTION
> > > If the publisher version is earlier than v14, we use text format COPY.
> > >
> >
> > I think this isn't explicit that we supported the binary format since
> > v14. So, I would prefer my version of the comment as suggested in the
> > previous email.
> >
>
> Hmm, but my *full* suggestion was bigger than what is misquoted above,
> and it certainly did say " logical replication binary mode transfer
> was not introduced until v14".
>
> SUGGESTION
> If the publisher version is earlier than v14, we use text format COPY.
> Note - In fact COPY syntax "WITH (FORMAT binary)" has existed since
> v9, but since the logical replication binary mode transfer was not
> introduced until v14 it was decided to check using the later version.
>

I find this needlessly verbose.

> ~~
>
> Anyway, the shortened comment as in the latest v15 patch is fine by me too.
>

Okay, then let's go with that.

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"houzj.fnst@fujitsu.com"
Date:
Hi,

Thanks for updating the patch, I think it is a useful feature.

I looked at the v15 patch and the patch looks mostly good to me.
Here are few comments:

1.
+    {
+        appendStringInfo(&cmd, " WITH (FORMAT binary)");

We could use appendStringInfoString here.


2.
+# It should fail
+$node_subscriber->wait_for_log(qr/ERROR: ( [A-Z0-9]+:)? no binary input function available for type/);
...
+# Cannot sync due to type mismatch
+$node_subscriber->wait_for_log(qr/ERROR: ( [A-Z0-9]+:)? incorrect binary data format/);
...
+# Ensure the COPY command is executed in text format on the publisher
+$node_publisher->wait_for_log(qr/LOG: ( [a-z0-9]+:)? COPY (.+)? TO STDOUT\n/);

I think it would be better to pass the log offset when using wait_for_log,
because otherwise it will check the whole log file to find the target message,
This might not be a big problem, but it has a risk of getting unexpected log message
which was generated by previous commands.

Best Regards,
Hou zj

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 15, 2023 at 3:33 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com>, 15 Mar 2023 Çar, 12:31 tarihinde şunu yazdı:
>>
>> On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
>
>>
>> What purpose does this test serve w.r.t this patch? Before checking
>> the sync for different column orders, the patch has already changed
>> binary to false, so it doesn't seem to test the functionality of this
>> patch. Am, I missing something?
>
>
> I missed that binary has changed to false before testing column orders. I moved that test case up before changing
binaryto false. 
> Please see v14 [1].
>

After thinking some more about this test, I don't think we need this
test as this doesn't add any value to this patch. This tests the
column orders which is well-established functionality of the apply
worker.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
"Euler Taveira"
Date:
On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
On Wed, Mar 8, 2023 at 6:17 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> On 7 Mar 2023 Tue at 04:10 Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> As per what I could read in this thread, most people prefer to use the
>> existing binary option rather than inventing a new way (option) to
>> binary copy in the initial sync phase. Do you agree?
>
>
> I agree.
> What do you think about the version checks? I removed any kind of check since it’s currently a different option. Should we check publisher version before doing binary copy to ensure that the publisher node supports binary option of COPY command?
>

It is not clear to me which version check you wanted to add because we
seem to have a binary option in COPY from the time prior to logical
replication. I feel we need a publisher version 14 check as that is
where we start to support binary mode transfer in logical replication.
See the check in function libpqrcv_startstreaming().
... then you are breaking existent cases. Even if you have a convincing
argument, you are introducing a behavior change in prior versions (commit
messages should always indicate that you are breaking backward compatibility).

+
+   /*
+    * The binary option for replication is supported since v14
+    */
+   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
+       MySubscription->binary)
+   {
+       appendStringInfo(&cmd, " WITH (FORMAT binary)");
+       options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
+   }
+

What are the arguments to support since v14 instead of the to-be-released
version? I read the thread but it is not clear. It was said about the
restrictive nature of this feature and it will be frustrating to see that the
same setup (with the same commands) works with v14 and v15 but it doesn't with
v16. IMO it should be >= 16 and documentation should explain that v14/v15 uses
text format during initial table synchronization even if binary = true.

Should there be a fallback mode (text) if initial table synchronization failed
because of the binary option? Maybe a different setting (auto) to support such
behavior.


--
Euler Taveira

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Thu, Mar 16, 2023 at 8:27 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
>
> It is not clear to me which version check you wanted to add because we
> seem to have a binary option in COPY from the time prior to logical
> replication. I feel we need a publisher version 14 check as that is
> where we start to support binary mode transfer in logical replication.
> See the check in function libpqrcv_startstreaming().
>
> ... then you are breaking existent cases. Even if you have a convincing
> argument, you are introducing a behavior change in prior versions (commit
> messages should always indicate that you are breaking backward compatibility).
>
> +
> +   /*
> +    * The binary option for replication is supported since v14
> +    */
> +   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
> +       MySubscription->binary)
> +   {
> +       appendStringInfo(&cmd, " WITH (FORMAT binary)");
> +       options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
> +   }
> +
>
> What are the arguments to support since v14 instead of the to-be-released
> version? I read the thread but it is not clear. It was said about the
> restrictive nature of this feature and it will be frustrating to see that the
> same setup (with the same commands) works with v14 and v15 but it doesn't with
> v16.
>

If the failure has to happen it will anyway happen later when the
publisher will be upgraded to v16. The reason for the version checks
as v14 was to allow the initial sync from the same version where the
binary mode for replication was introduced. However, if we expect
failures in the existing setup, I am fine with supporting this for >=
v16.

> IMO it should be >= 16 and documentation should explain that v14/v15 uses
> text format during initial table synchronization even if binary = true.
>

Yeah, if we change the version then the corresponding text in the
patch should also be changed.

> Should there be a fallback mode (text) if initial table synchronization failed
> because of the binary option? Maybe a different setting (auto) to support such
> behavior.
>

I think the workaround is that the user disables binary mode for the
time of initial sync. I think if we want to extend and add a fallback
(text) mode then it is better to keep it as default behavior rather
than introducing a new setting like 'auto'. Personally, I feel it can
be added later after doing some more study.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Please see the attached v16.

Peter Smith <smithpb2250@gmail.com>, 16 Mar 2023 Per, 03:03 tarihinde şunu yazdı:
Here are some review comments for v15-0001

I applied your comments in the updated patch.

shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 16 Mar 2023 Per, 05:35 tarihinde şunu yazdı:
On Thu, Mar 16, 2023 2:26 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Right, it needs to be ordered. Fixed.
>

Hi,

Thanks for updating the patch. I tested some cases like toast data, combination
of row filter and column lists, and it works well.

Thanks for testing. I changed wait_for_log lines as you suggested.
 
houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com>, 16 Mar 2023 Per, 05:55 tarihinde şunu yazdı:
1.
+       {
+               appendStringInfo(&cmd, " WITH (FORMAT binary)");

We could use appendStringInfoString here.

Done.
 
2.
I think it would be better to pass the log offset when using wait_for_log,
because otherwise it will check the whole log file to find the target message,
This might not be a big problem, but it has a risk of getting unexpected log message
which was generated by previous commands.

You're right. I added offsets for wait_for_log's .

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Amit Kapila <amit.kapila16@gmail.com>, 16 Mar 2023 Per, 06:25 tarihinde şunu yazdı:
On Thu, Mar 16, 2023 at 8:27 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
>
> It is not clear to me which version check you wanted to add because we
> seem to have a binary option in COPY from the time prior to logical
> replication. I feel we need a publisher version 14 check as that is
> where we start to support binary mode transfer in logical replication.
> See the check in function libpqrcv_startstreaming().
>
> ... then you are breaking existent cases. Even if you have a convincing
> argument, you are introducing a behavior change in prior versions (commit
> messages should always indicate that you are breaking backward compatibility).
>
> +
> +   /*
> +    * The binary option for replication is supported since v14
> +    */
> +   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
> +       MySubscription->binary)
> +   {
> +       appendStringInfo(&cmd, " WITH (FORMAT binary)");
> +       options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
> +   }
> +
>
> What are the arguments to support since v14 instead of the to-be-released
> version? I read the thread but it is not clear. It was said about the
> restrictive nature of this feature and it will be frustrating to see that the
> same setup (with the same commands) works with v14 and v15 but it doesn't with
> v16.
>

If the failure has to happen it will anyway happen later when the
publisher will be upgraded to v16. The reason for the version checks
as v14 was to allow the initial sync from the same version where the
binary mode for replication was introduced. However, if we expect
failures in the existing setup, I am fine with supporting this for >=
v16.

Upgrading the subscriber to v16 and keeping the subscriber in v14 could break existing subscriptions. I don't know how likely such a case is.

I don't have a strong preference on this. What do you think? Should we change it >=v16 or keep it as it is?

Best,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Thu, Mar 16, 2023 at 6:59 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com>, 16 Mar 2023 Per, 06:25 tarihinde şunu yazdı:
>>
>> On Thu, Mar 16, 2023 at 8:27 AM Euler Taveira <euler@eulerto.com> wrote:
>> >
>> > On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
>> >
>> > It is not clear to me which version check you wanted to add because we
>> > seem to have a binary option in COPY from the time prior to logical
>> > replication. I feel we need a publisher version 14 check as that is
>> > where we start to support binary mode transfer in logical replication.
>> > See the check in function libpqrcv_startstreaming().
>> >
>> > ... then you are breaking existent cases. Even if you have a convincing
>> > argument, you are introducing a behavior change in prior versions (commit
>> > messages should always indicate that you are breaking backward compatibility).
>> >
>> > +
>> > +   /*
>> > +    * The binary option for replication is supported since v14
>> > +    */
>> > +   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
>> > +       MySubscription->binary)
>> > +   {
>> > +       appendStringInfo(&cmd, " WITH (FORMAT binary)");
>> > +       options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
>> > +   }
>> > +
>> >
>> > What are the arguments to support since v14 instead of the to-be-released
>> > version? I read the thread but it is not clear. It was said about the
>> > restrictive nature of this feature and it will be frustrating to see that the
>> > same setup (with the same commands) works with v14 and v15 but it doesn't with
>> > v16.
>> >
>>
>> If the failure has to happen it will anyway happen later when the
>> publisher will be upgraded to v16. The reason for the version checks
>> as v14 was to allow the initial sync from the same version where the
>> binary mode for replication was introduced. However, if we expect
>> failures in the existing setup, I am fine with supporting this for >=
>> v16.
>
>
> Upgrading the subscriber to v16 and keeping the subscriber in v14 could break existing subscriptions. I don't know
howlikely such a case is. 
>
> I don't have a strong preference on this. What do you think? Should we change it >=v16 or keep it as it is?
>

I think to reduce the risk of breakage, let's change the check to
>=v16. Also, accordingly, update the doc and commit message.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Thu, Mar 16, 2023 at 1:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 15, 2023 at 3:33 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com>, 15 Mar 2023 Çar, 12:31 tarihinde şunu yazdı:
> >>
> >> On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> >
> >>
> >> What purpose does this test serve w.r.t this patch? Before checking
> >> the sync for different column orders, the patch has already changed
> >> binary to false, so it doesn't seem to test the functionality of this
> >> patch. Am, I missing something?
> >
> >
> > I missed that binary has changed to false before testing column orders. I moved that test case up before changing
binaryto false. 
> > Please see v14 [1].
> >
>
> After thinking some more about this test, I don't think we need this
> test as this doesn't add any value to this patch. This tests the
> column orders which is well-established functionality of the apply
> worker.
>

I agree that different column order is a "well-established
functionality of the apply worker".

But when I searched the TAP tests I could not find any existing tests
that check the combination of
- different column orders
- CREATE SUBSCRIPTION with parameters binary=true and copy_data=true

So there seemed to be a gap in the test coverage, which is why I suggested it.

I guess that test was not strictly tied to this patch. Should I post
this new test suggestion as a separate thread or do you think there is
no point because it will not get any support?

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Fri, Mar 17, 2023 at 12:20 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Please see the attached v16.
>
> Peter Smith <smithpb2250@gmail.com>, 16 Mar 2023 Per, 03:03 tarihinde şunu yazdı:
>>
>> Here are some review comments for v15-0001
>
>
> I applied your comments in the updated patch.

Thanks.

I checked patchv16-0001 and have only one minor comment.

======
doc/src/sgml/logical-replication.sgml

1.
diff --git a/doc/src/sgml/logical-replication.sgml
b/doc/src/sgml/logical-replication.sgml
index 6b0e300adc..bad25e54cd 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -251,7 +251,10 @@
    column of type <type>bigint</type>.  The target table can also have
    additional columns not provided by the published table.  Any such columns
    will be filled with the default value as specified in the definition of the
-   target table.
+   target table. However, logical replication in <literal>binary</literal>
+   format is more restrictive. See the <literal>binary</literal> option of
+   <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>
+   for details.
   </para>

IMO the sentence "However, logical replication in binary format is
more restrictive." should just be plain text.

There should not be the <literal>binary</literal> markup in that 1st sentence.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Thu, Mar 16, 2023 9:20 PM Melih Mutlu <m.melihmutlu@gmail.com>  wrote:
> 
> Hi,
> 
> Please see the attached v16.
> 

Thanks for updating the patch.

+# Cannot sync due to type mismatch
+$node_subscriber->wait_for_log(qr/ERROR: ( [A-Z0-9]+:)? incorrect binary data format/);

+# Ensure the COPY command is executed in text format on the publisher
+$node_publisher->wait_for_log(qr/LOG: ( [A-Z0-9]+:)? statement: COPY (.+)? TO STDOUT\n/);

It looks that you forgot to pass `offset` into wait_for_log().

Regards,
Shi Yu

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Fri, Mar 17, 2023 at 6:42 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Mar 16, 2023 at 1:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Mar 15, 2023 at 3:33 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com>, 15 Mar 2023 Çar, 12:31 tarihinde şunu yazdı:
> > >>
> > >> On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> > >
> > >
> > >>
> > >> What purpose does this test serve w.r.t this patch? Before checking
> > >> the sync for different column orders, the patch has already changed
> > >> binary to false, so it doesn't seem to test the functionality of this
> > >> patch. Am, I missing something?
> > >
> > >
> > > I missed that binary has changed to false before testing column orders. I moved that test case up before changing
binaryto false. 
> > > Please see v14 [1].
> > >
> >
> > After thinking some more about this test, I don't think we need this
> > test as this doesn't add any value to this patch. This tests the
> > column orders which is well-established functionality of the apply
> > worker.
> >
>
> I agree that different column order is a "well-established
> functionality of the apply worker".
>
> But when I searched the TAP tests I could not find any existing tests
> that check the combination of
> - different column orders
> - CREATE SUBSCRIPTION with parameters binary=true and copy_data=true
>
> So there seemed to be a gap in the test coverage, which is why I suggested it.
>
> I guess that test was not strictly tied to this patch. Should I post
> this new test suggestion as a separate thread or do you think there is
> no point because it will not get any support?
>

Personally, I don't think we need to test every possible combination
unless it is really achieving something meaningful. In this particular
case, I don't see the need or maybe I am missing something.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Sharing v17.

Amit Kapila <amit.kapila16@gmail.com>, 17 Mar 2023 Cum, 03:02 tarihinde şunu yazdı:
I think to reduce the risk of breakage, let's change the check to
>=v16. Also, accordingly, update the doc and commit message.

Done.

Peter Smith <smithpb2250@gmail.com>, 17 Mar 2023 Cum, 04:58 tarihinde şunu yazdı:
IMO the sentence "However, logical replication in binary format is
more restrictive." should just be plain text.

Done.

 shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 17 Mar 2023 Cum, 05:26 tarihinde şunu yazdı:
It looks that you forgot to pass `offset` into wait_for_log().

Yes, I somehow didn't include those lines into the patch. Thanks for noticing. Fixed them now.


Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
vignesh C
Date:
On Fri, 17 Mar 2023 at 17:55, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Sharing v17.
>
> Amit Kapila <amit.kapila16@gmail.com>, 17 Mar 2023 Cum, 03:02 tarihinde şunu yazdı:
>>
>> I think to reduce the risk of breakage, let's change the check to
>> >=v16. Also, accordingly, update the doc and commit message.
>
>
> Done.
>
> Peter Smith <smithpb2250@gmail.com>, 17 Mar 2023 Cum, 04:58 tarihinde şunu yazdı:
>>
>> IMO the sentence "However, logical replication in binary format is
>> more restrictive." should just be plain text.
>
>
> Done.
>
>  shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>, 17 Mar 2023 Cum, 05:26 tarihinde şunu yazdı:
>>
>> It looks that you forgot to pass `offset` into wait_for_log().
>
>
> Yes, I somehow didn't include those lines into the patch. Thanks for noticing. Fixed them now.

Thanks for the updated patch, few comments:
1) Currently we refer the link to the beginning of create subscription
page, this can be changed to refer to binary option contents in create
subscription:
+         <para>
+          See the <literal>binary</literal> option of
+          <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>
+          for details about copying pre-existing data in binary format.
+         </para>

2) Running pgperltidy shows the test script 014_binary.pl could be
slightly improved as in the attachment.

Regards,
Vignesh

Attachment

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
On Fri, Mar 17, 2023 at 11:25 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> Sharing v17.
>
> Amit Kapila <amit.kapila16@gmail.com>, 17 Mar 2023 Cum, 03:02 tarihinde şunu yazdı:
>>
>> I think to reduce the risk of breakage, let's change the check to
>> >=v16. Also, accordingly, update the doc and commit message.
>
>
> Done.
>

Here are my review comments for v17-0001


======
Commit message

1.
Binary copy is supported for v16 or later.

~

As written that's very general and not quite correct. E.g. COPY ...
WITH (FORMAT binary) has been available for a long time. IMO that
commit message sentence ought to be more specific.

SUGGESTION
Binary copy for logical replication table synchronization is supported
only when both publisher and subscriber are v16 or later.

======
src/backend/replication/logical/tablesync.c

2.
@@ -1168,6 +1170,15 @@ copy_table(Relation rel)

  appendStringInfoString(&cmd, ") TO STDOUT");
  }
+
+ /* The binary option for replication is supported since v16 */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000 &&
+ MySubscription->binary)
+ {
+ appendStringInfoString(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options, makeDefElem("format", (Node *)
makeString("binary"), -1));
+ }


Logical replication binary mode was introduced in v14, so the old
comment ("The binary option for replication is supported since v14")
was correct. Unfortunately, after changing the code check to 16000, I
think the new comment ("The binary option for replication is supported
since v16") became incorrect, and so it needs some rewording. Maybe it
should say something like below:

SUGGESTION
If the publisher is v16 or later, then any initial table
synchronization will use the same format as specified by the
subscription binary mode. If the publisher is before v16, then any
initial table synchronization will use text format regardless of the
subscription binary mode.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Sat, Mar 18, 2023 at 3:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Mar 17, 2023 at 11:25 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> ======
> src/backend/replication/logical/tablesync.c
>
> 2.
> @@ -1168,6 +1170,15 @@ copy_table(Relation rel)
>
>   appendStringInfoString(&cmd, ") TO STDOUT");
>   }
> +
> + /* The binary option for replication is supported since v16 */
> + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000 &&
> + MySubscription->binary)
> + {
> + appendStringInfoString(&cmd, " WITH (FORMAT binary)");
> + options = lappend(options, makeDefElem("format", (Node *)
> makeString("binary"), -1));
> + }
>
>
> Logical replication binary mode was introduced in v14, so the old
> comment ("The binary option for replication is supported since v14")
> was correct. Unfortunately, after changing the code check to 16000, I
> think the new comment ("The binary option for replication is supported
> since v16") became incorrect, and so it needs some rewording. Maybe it
> should say something like below:
>
> SUGGESTION
> If the publisher is v16 or later, then any initial table
> synchronization will use the same format as specified by the
> subscription binary mode. If the publisher is before v16, then any
> initial table synchronization will use text format regardless of the
> subscription binary mode.
>

I agree that the previous comment should be updated but I would prefer
something along the lines: "Prior to v16, initial table
synchronization will use text format even if the binary option is
enabled for a subscription."

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
There are a couple of TAP tests where the copy binary is expected to
fail. And when it fails, you do binary=false (changing the format back
to 'text') so the test is then expected to be able to proceed.

I don't know if this happens in practice, but IIUC in theory, if the
timing is extremely bad, the tablesync could relaunch in binary mode
multiple times (any fail multiple times?) before your binary=false
change takes effect.

So, I was wondering if it could help to use the subscription
'disable_on_error=true' parameter for those cases so that the
tablesync won't needlessly attempt to relaunch until you have set
binary=false and then re-enabled the subscription.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Mon, Mar 20, 2023 at 3:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> There are a couple of TAP tests where the copy binary is expected to
> fail. And when it fails, you do binary=false (changing the format back
> to 'text') so the test is then expected to be able to proceed.
>
> I don't know if this happens in practice, but IIUC in theory, if the
> timing is extremely bad, the tablesync could relaunch in binary mode
> multiple times (any fail multiple times?) before your binary=false
> change takes effect.
>
> So, I was wondering if it could help to use the subscription
> 'disable_on_error=true' parameter for those cases so that the
> tablesync won't needlessly attempt to relaunch until you have set
> binary=false and then re-enabled the subscription.
>

+1. That would make tests more reliable.

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Melih,

Thank you for updating the patch.
I checked your added description about initial data sync and I think it's OK.

Few minor comments:

01. copy_table

```
+    List        *options = NIL;
```

I found a unnecessary blank just after "List". You can remove it and align definition.

02. copy_table

```
+        options = lappend(options, makeDefElem("format", (Node *) makeString("binary"), -1));
```

The line seems to exceed 80 characters. How do you think to change like following?

```
        options = lappend(options,
                          makeDefElem("format",
                                      (Node *) makeString("binary"), -1));
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Please see the attached patch.

vignesh C <vignesh21@gmail.com>, 18 Mar 2023 Cmt, 12:03 tarihinde şunu yazdı:
On Fri, 17 Mar 2023 at 17:55, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
1) Currently we refer the link to the beginning of create subscription
page, this can be changed to refer to binary option contents in create
subscription:

Done.
 
2) Running pgperltidy shows the test script 014_binary.pl could be
slightly improved as in the attachment.

Couldn't apply the patch you attached but I ran pgperltidy myself and I guess the result should be similar. 


Peter Smith <smithpb2250@gmail.com>, 18 Mar 2023 Cmt, 12:41 tarihinde şunu yazdı:
Commit message

1.
Binary copy is supported for v16 or later.

Done as you suggested.

Amit Kapila <amit.kapila16@gmail.com>, 18 Mar 2023 Cmt, 13:11 tarihinde şunu yazdı:
On Sat, Mar 18, 2023 at 3:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> SUGGESTION
> If the publisher is v16 or later, then any initial table
> synchronization will use the same format as specified by the
> subscription binary mode. If the publisher is before v16, then any
> initial table synchronization will use text format regardless of the
> subscription binary mode.
>

I agree that the previous comment should be updated but I would prefer
something along the lines: "Prior to v16, initial table
synchronization will use text format even if the binary option is
enabled for a subscription."

Done. 

Amit Kapila <amit.kapila16@gmail.com>, 20 Mar 2023 Pzt, 05:13 tarihinde şunu yazdı:
On Mon, Mar 20, 2023 at 3:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> There are a couple of TAP tests where the copy binary is expected to
> fail. And when it fails, you do binary=false (changing the format back
> to 'text') so the test is then expected to be able to proceed.
>
> I don't know if this happens in practice, but IIUC in theory, if the
> timing is extremely bad, the tablesync could relaunch in binary mode
> multiple times (any fail multiple times?) before your binary=false
> change takes effect.
>
> So, I was wondering if it could help to use the subscription
> 'disable_on_error=true' parameter for those cases so that the
> tablesync won't needlessly attempt to relaunch until you have set
> binary=false and then re-enabled the subscription.
>

+1. That would make tests more reliable.

Done.

Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 20 Mar 2023 Pzt, 07:13 tarihinde şunu yazdı:
Dear Melih,

Thank you for updating the patch.
I checked your added description about initial data sync and I think it's OK.

Few minor comments:

Fixed both comments.

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Here are my review comments for v18-0001

======
doc/src/sgml/logical-replication.sgml

1.
+   target table. However, logical replication in binary format is more
+   restrictive. See the <literal>binary</literal> option of
+   <link linkend="sql-createsubscription-binary"><command>CREATE
SUBSCRIPTION</command></link>
+   for details.
   </para>

Because you've changed the linkend to be the binary option, IMO now
the <link> part also needs to be modified. Otherwise, this page has
multiple "CREATE SUBSCRIPTION" links which jump to different places,
which just seems wrong to me.

SUGGESTION (for the "See the" sentence)

See the <link linkend="sql-createsubscription-binary"><literal>binary</literal>
option</link> of <command>CREATE SUBSCRIPTION</command> for details.

======
doc/src/sgml/ref/alter_subscription.sgml

2.
+         <para>
+          See the <literal>binary</literal> option of
+          <link
linkend="sql-createsubscription-binary"><command>CREATE
SUBSCRIPTION</command></link>
+          for details about copying pre-existing data in binary format.
+         </para>

(Same as review comment #1 above)

SUGGESTION
See the <link linkend="sql-createsubscription-binary"><literal>binary</literal>
option</link> of <command>CREATE SUBSCRIPTION</command> for details
about copying pre-existing data in binary format.


======
src/backend/replication/logical/tablesync.c

3.
+ /*
+ * Prior to v16, initial table synchronization will use text format even
+ * if the binary option is enabled for a subscription.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000 &&
+ MySubscription->binary)
+ {
+ appendStringInfoString(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options,
+   makeDefElem("format",
+   (Node *) makeString("binary"), -1));
+ }

I think there can only be 0 or 1 list element in 'options'.

So, why does the code here use lappend(options,...) instead of just
using list_make1(...)?

======
src/test/subscription/t/014_binary.pl

4.
# -----------------------------------------------------
# Test mismatched column types with/without binary mode
# -----------------------------------------------------

# Test syncing tables with mismatching column types
$node_publisher->safe_psql(
'postgres', qq(
    CREATE TABLE public.test_mismatching_types (
        a bigint PRIMARY KEY
    );
    INSERT INTO public.test_mismatching_types (a)
        VALUES (1), (2);
    ));

# Check the subscriber log from now on.
$offset = -s $node_subscriber->logfile;

# Ensure the subscription is enabled. disable_on_error is still true,
# so the subscription can be disabled due to missing realtion until
# the test_mismatching_types table is created.
$node_subscriber->safe_psql(
'postgres', qq(
    CREATE TABLE public.test_mismatching_types (
        a int PRIMARY KEY
    );
ALTER SUBSCRIPTION tsub ENABLE;
    ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
    ));

~~

I found the "Ensure the subscription is enabled..." comment and the
necessity for enabling the subscription to be confusing.

Can't some complications all be eliminated just by creating the table
on the subscribe side first?

For example, I rearranged that test (above fragment) like below and it
still works OK for me:

# -----------------------------------------------------
# Test mismatched column types with/without binary mode
# -----------------------------------------------------

# Create the table on the subscriber side
$node_subscriber->safe_psql(
    'postgres', qq(
    CREATE TABLE public.test_mismatching_types (
        a int PRIMARY KEY
    )));

# Check the subscriber log from now on.
$offset = -s $node_subscriber->logfile;

# Test syncing tables with mismatching column types
$node_publisher->safe_psql(
    'postgres', qq(
    CREATE TABLE public.test_mismatching_types (
        a bigint PRIMARY KEY
    );
    INSERT INTO public.test_mismatching_types (a)
        VALUES (1), (2);
    ));

# Refresh the publication to trigger the tablesync
$node_subscriber->safe_psql(
    'postgres', qq(
    ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
    ));

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Tue, Mar 21, 2023 at 7:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> ======
> src/test/subscription/t/014_binary.pl
>
> 4.
> # -----------------------------------------------------
> # Test mismatched column types with/without binary mode
> # -----------------------------------------------------
>
> # Test syncing tables with mismatching column types
> $node_publisher->safe_psql(
> 'postgres', qq(
>     CREATE TABLE public.test_mismatching_types (
>         a bigint PRIMARY KEY
>     );
>     INSERT INTO public.test_mismatching_types (a)
>         VALUES (1), (2);
>     ));
>
> # Check the subscriber log from now on.
> $offset = -s $node_subscriber->logfile;
>
> # Ensure the subscription is enabled. disable_on_error is still true,
> # so the subscription can be disabled due to missing realtion until
> # the test_mismatching_types table is created.
> $node_subscriber->safe_psql(
> 'postgres', qq(
>     CREATE TABLE public.test_mismatching_types (
>         a int PRIMARY KEY
>     );
> ALTER SUBSCRIPTION tsub ENABLE;
>     ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
>     ));
>
> ~~
>
> I found the "Ensure the subscription is enabled..." comment and the
> necessity for enabling the subscription to be confusing.
>
> Can't some complications all be eliminated just by creating the table
> on the subscribe side first?
>

Hmm, that would make this test inconsistent with other tests and
probably difficult to understand and extend. I don't like to say this
but I think introducing disable_on_error has introduced more
complexities in the patch due to the requirement of enabling
subscription again and again. I feel it would be better without using
disable_on_error option in these tests.

One minor point:
+          format can be faster than the text format, but it is less portable
+          across machine architectures and PostgreSQL versions.

As per email [1], it would be better to use <productname> tag here
with PostgreSQL.

[1] - https://www.postgresql.org/message-id/932629.1679322674%40sss.pgh.pa.us

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:


Amit Kapila <amit.kapila16@gmail.com>, 21 Mar 2023 Sal, 09:03 tarihinde şunu yazdı:
On Tue, Mar 21, 2023 at 7:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> ======
> src/test/subscription/t/014_binary.pl
>
> 4.
> # -----------------------------------------------------
> # Test mismatched column types with/without binary mode
> # -----------------------------------------------------
>
> # Test syncing tables with mismatching column types
> $node_publisher->safe_psql(
> 'postgres', qq(
>     CREATE TABLE public.test_mismatching_types (
>         a bigint PRIMARY KEY
>     );
>     INSERT INTO public.test_mismatching_types (a)
>         VALUES (1), (2);
>     ));
>
> # Check the subscriber log from now on.
> $offset = -s $node_subscriber->logfile;
>
> # Ensure the subscription is enabled. disable_on_error is still true,
> # so the subscription can be disabled due to missing realtion until
> # the test_mismatching_types table is created.
> $node_subscriber->safe_psql(
> 'postgres', qq(
>     CREATE TABLE public.test_mismatching_types (
>         a int PRIMARY KEY
>     );
> ALTER SUBSCRIPTION tsub ENABLE;
>     ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
>     ));
>
> ~~
>
> I found the "Ensure the subscription is enabled..." comment and the
> necessity for enabling the subscription to be confusing.
>
> Can't some complications all be eliminated just by creating the table
> on the subscribe side first?
>

Hmm, that would make this test inconsistent with other tests and
probably difficult to understand and extend. I don't like to say this
but I think introducing disable_on_error has introduced more
complexities in the patch due to the requirement of enabling
subscription again and again. I feel it would be better without using
disable_on_error option in these tests.

While this change would make the test inconsistent, I think it also would make more confusing.
Explaining the issue explicitly with a comment seems better to me than the trick of changing order of table creation just for some test cases.
But I'm also ok with removing the use of disable_on_error if that's what you agree on.

Thanks,
--
Melih Mutlu
Microsoft

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Tue, Mar 21, 2023 at 2:16 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com>, 21 Mar 2023 Sal, 09:03 tarihinde şunu yazdı:
>>
>> On Tue, Mar 21, 2023 at 7:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
>> >
>> >
>> > ======
>> > src/test/subscription/t/014_binary.pl
>> >
>> > 4.
>> > # -----------------------------------------------------
>> > # Test mismatched column types with/without binary mode
>> > # -----------------------------------------------------
>> >
>> > # Test syncing tables with mismatching column types
>> > $node_publisher->safe_psql(
>> > 'postgres', qq(
>> >     CREATE TABLE public.test_mismatching_types (
>> >         a bigint PRIMARY KEY
>> >     );
>> >     INSERT INTO public.test_mismatching_types (a)
>> >         VALUES (1), (2);
>> >     ));
>> >
>> > # Check the subscriber log from now on.
>> > $offset = -s $node_subscriber->logfile;
>> >
>> > # Ensure the subscription is enabled. disable_on_error is still true,
>> > # so the subscription can be disabled due to missing realtion until
>> > # the test_mismatching_types table is created.
>> > $node_subscriber->safe_psql(
>> > 'postgres', qq(
>> >     CREATE TABLE public.test_mismatching_types (
>> >         a int PRIMARY KEY
>> >     );
>> > ALTER SUBSCRIPTION tsub ENABLE;
>> >     ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
>> >     ));
>> >
>> > ~~
>> >
>> > I found the "Ensure the subscription is enabled..." comment and the
>> > necessity for enabling the subscription to be confusing.
>> >
>> > Can't some complications all be eliminated just by creating the table
>> > on the subscribe side first?
>> >
>>
>> Hmm, that would make this test inconsistent with other tests and
>> probably difficult to understand and extend. I don't like to say this
>> but I think introducing disable_on_error has introduced more
>> complexities in the patch due to the requirement of enabling
>> subscription again and again. I feel it would be better without using
>> disable_on_error option in these tests.
>
>
> While this change would make the test inconsistent, I think it also would make more confusing.
>

I also think so.

> Explaining the issue explicitly with a comment seems better to me than the trick of changing order of table creation
justfor some test cases. 
> But I'm also ok with removing the use of disable_on_error if that's what you agree on.
>

Let's do that way for now.

--
With Regards,
Amit Kapila.



Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi,

Peter Smith <smithpb2250@gmail.com>, 21 Mar 2023 Sal, 04:33 tarihinde şunu yazdı:
Here are my review comments for v18-0001

======
doc/src/sgml/logical-replication.sgml

1.
+   target table. However, logical replication in binary format is more
+   restrictive. See the <literal>binary</literal> option of
+   <link linkend="sql-createsubscription-binary"><command>CREATE
SUBSCRIPTION</command></link>
+   for details.
   </para>

Because you've changed the linkend to be the binary option, IMO now
the <link> part also needs to be modified. Otherwise, this page has
multiple "CREATE SUBSCRIPTION" links which jump to different places,
which just seems wrong to me.

Makes sense. I changed it as you suggested.
 
3.
I think there can only be 0 or 1 list element in 'options'.

So, why does the code here use lappend(options,...) instead of just
using list_make1(...)?

Changed it to list_make1.

Amit Kapila <amit.kapila16@gmail.com>, 21 Mar 2023 Sal, 12:27 tarihinde şunu yazdı:
> Explaining the issue explicitly with a comment seems better to me than the trick of changing order of table creation just for some test cases.
> But I'm also ok with removing the use of disable_on_error if that's what you agree on.
>

Let's do that way for now.

Done.

Thanks, 
--
Melih Mutlu
Microsoft
Attachment

Re: Allow logical replication to copy tables in binary format

From
Peter Smith
Date:
Thanks for all the patch updates. Patch v19 LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Allow logical replication to copy tables in binary format

From
"shiy.fnst@fujitsu.com"
Date:
On Wed Mar 22, 2023 7:29 AM Peter Smith <smithpb2250@gmail.com> wrote:
> 
> Thanks for all the patch updates. Patch v19 LGTM.
> 

+1

Regards,
Shi Yu

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 22, 2023 at 9:00 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Wed Mar 22, 2023 7:29 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Thanks for all the patch updates. Patch v19 LGTM.
> >
>
> +1
>

The patch looks mostly good to me. However, I have one
question/comment as follows:

-       <varlistentry>
+       <varlistentry id="sql-createsubscription-binary" xreflabel="binary">
         <term><literal>binary</literal> (<type>boolean</type>)</term>
         <listitem>

To allow references to the binary option, we add the varlistentry id
here. It looks slightly odd to me to add id for just one entry, see
commit 78ee60ed84bb3a1cf0b6bd9a715dcbcf252a90f5 where we have
purposefully added ids to allow future references. Shall we add id to
other options as well on this page?

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, hackers,

> The patch looks mostly good to me. However, I have one
> question/comment as follows:
> 
> -       <varlistentry>
> +       <varlistentry id="sql-createsubscription-binary" xreflabel="binary">
>          <term><literal>binary</literal> (<type>boolean</type>)</term>
>          <listitem>
> 
> To allow references to the binary option, we add the varlistentry id
> here. It looks slightly odd to me to add id for just one entry, see
> commit 78ee60ed84bb3a1cf0b6bd9a715dcbcf252a90f5 where we have
> purposefully added ids to allow future references. Shall we add id to
> other options as well on this page?

I have analyzed same points and made patch that could be applied atop v19-0001.
Please check 0002 patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

Re: Allow logical replication to copy tables in binary format

From
Amit Kapila
Date:
On Wed, Mar 22, 2023 at 4:06 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > The patch looks mostly good to me. However, I have one
> > question/comment as follows:
> >
> > -       <varlistentry>
> > +       <varlistentry id="sql-createsubscription-binary" xreflabel="binary">
> >          <term><literal>binary</literal> (<type>boolean</type>)</term>
> >          <listitem>
> >
> > To allow references to the binary option, we add the varlistentry id
> > here. It looks slightly odd to me to add id for just one entry, see
> > commit 78ee60ed84bb3a1cf0b6bd9a715dcbcf252a90f5 where we have
> > purposefully added ids to allow future references. Shall we add id to
> > other options as well on this page?
>
> I have analyzed same points and made patch that could be applied atop v19-0001.
> Please check 0002 patch.
>

Pushed the 0001. It may be better to start a separate thread for 0002.

--
With Regards,
Amit Kapila.



RE: Allow logical replication to copy tables in binary format

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> Pushed the 0001. It may be better to start a separate thread for 0002.

Good job! I have started new thread [1] for 0002.

[1]:
https://www.postgresql.org/message-id/TYAPR01MB58667AE04D291924671E2051F5879%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow logical replication to copy tables in binary format

From
Melih Mutlu
Date:
Hi, 

Amit Kapila <amit.kapila16@gmail.com>, 23 Mar 2023 Per, 08:48 tarihinde şunu yazdı:
Pushed the 0001. It may be better to start a separate thread for 0002.

Great! Thanks.

Best, 
--
Melih Mutlu
Microsoft