Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Allow logical replication to copy tables in binary format
Date
Msg-id CAGPVpCR_g8c1xXfR4kkSXiSt314C0it0seRzfqgTd+oF7N_5jw@mail.gmail.com
Whole thread Raw
In response to RE: Allow logical replication to copy tables in binary format  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Allow logical replication to copy tables in binary format  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
Hi Takamichi,

Thanks for your reviews. 

I addressed your reviews, please find the attached patch. 

osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com>, 16 Eyl 2022 Cum, 16:51 tarihinde şunu yazdı:
(1) whitespace issues

Fixed 

(2) Suggestion to update another general description about the subscription

Kindly have a look at doc/src/sgml/logical-replication.sgml.

"The data types of the columns do not need to match,
as long as the text representation of the data can be converted to the target type.
For example, you can replicate from a column of type integer to a column of type bigint."

With the patch, I think we have an impact about those descriptions
since enabling the binary option for a subscription and executing the
initial synchronization requires the same data types for binary format.

I suggest that we update those descriptions as well.

You're right, this needs to be stated in the docs. Modified descriptions accordingly.
 
(3) shouldn't test that we fail expectedly with binary copy for different types ?

How about having a test that we correctly fail with different data types
between the publisher and the subscriber, for instance ?

Modified 002_types.pl test such that it now tests the replication between different data types. 
It's expected to fail if the binary is enabled, and succeed if not.
 
(4) Error message of the binary format copy

I've gotten below message from data type contradiction (between integer and bigint).
Probably, this is unclear for the users to understand the direct cause
and needs to be adjusted ?
This might be a similar comment Euler mentioned in [1].

2022-09-16 11:54:54.835 UTC [4570] ERROR:  insufficient data left in message
2022-09-16 11:54:54.835 UTC [4570] CONTEXT:  COPY tab, line 1, column id

It's already unclear for users to understand what's the issue if they're copying data between different column types via the COPY command.
This issue comes from COPY, and logical replication just utilizes COPY.
I don't think it would make sense to adjust an error message from a functionality which logical replication only uses and has no direct impact on.
It might be better to do this in a separate patch. What do you think?
 
(5) Minor adjustment of the test comment in 002_types.pl.

+is( $result, $sync_result, 'check initial sync on subscriber');
+is( $result_binary, $sync_result, 'check initial sync on subscriber in binary');

 # Insert initial test data

There are two same comments which say "Insert initial test data" in this file.
We need to update them, one for the initial table sync and
the other for the application of changes.

Fixed. 

I agree with the direction to support binary copy for v16 and later.

IIUC, the binary format replication with different data types fails even during apply phase on HEAD.
I thought that means, the upgrade concern only applies to a scenario that the user executes
only initial table synchronizations between the publisher and subscriber
and doesn't replicate any data at apply phase after that. I would say
this isn't a valid scenario and your proposal makes sense.
 
No, logical replication in binary does not fail on apply phase if data types are different.  
The concern with upgrade (if data types are not the same) would be not being able to create a new subscription with binary enabled or replicate new tables added into publication.
Replication of tables from existing subscriptions would not be affected by this change since they will already be in the apply phase, not tablesync.
Do you think this would still be an issue?


Thanks,
Melih
  

Attachment

pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Small miscellaneous fixes
Next
From: Dave Page
Date:
Subject: Re: Tracking last scan time