Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Allow logical replication to copy tables in binary format
Date
Msg-id CAGPVpCRaxRA-SqWYsWvqWPB_CfVyasvCwC2oqg9oQgZfP0QBtQ@mail.gmail.com
In response to RE: Allow logical replication to copy tables in binary format  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Allow logical replication to copy tables in binary format
List pgsql-hackers
Hello,

osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote on Wed, 12 Oct 2022 at 04:36:
> I agree with the direction to support binary copy for v16 and later.
>
> IIUC, binary-format replication between different data types fails even
> during the apply phase on HEAD.
> I thought that means the upgrade concern only applies to a scenario in
> which the user executes only initial table synchronizations between the
> publisher and subscriber and doesn't replicate any data in the apply
> phase after that. I would say this isn't a valid scenario and your
> proposal makes sense.
>
> No, logical replication in binary does not fail on apply phase if data types are
> different.
> With HEAD, I observe that in some cases we fail in the apply phase because of
> different data types, such as integer vs. bigint, as in the scenario written in [1].
> In short, I think we can slightly adjust your documentation and make it more
> general so that the description applies to both the table sync phase and the
> apply phase.

Yes, you're right. I somehow had the impression that HEAD supports replication between different types in binary.
But as the scenario you mentioned shows, it does not work.
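
To make the integer vs. bigint case concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: the function names are made up for illustration, and the only assumption taken from PostgreSQL is that the binary form of integer is 4 bytes and that of bigint is 8 bytes, both in network byte order. It shows why text format tolerates the mismatch while binary format does not.

```python
import struct

def send_binary_int4(value):
    # Hypothetical publisher side: an integer column is sent as
    # 4 bytes in network (big-endian) byte order.
    return struct.pack("!i", value)

def recv_binary_int8(payload):
    # Hypothetical subscriber side: a bigint column expects exactly 8 bytes.
    (value,) = struct.unpack("!q", payload)
    return value

def recv_text_int8(payload):
    # Text format: the subscriber parses the digits itself, so the
    # width of the publisher's column does not matter.
    return int(payload.decode("ascii"))

def try_binary(value):
    # Send as integer, receive as bigint, and report the failure if any.
    try:
        return recv_binary_int8(send_binary_int4(value))
    except struct.error as exc:
        return f"binary receive failed: {exc}"

text_result = recv_text_int8(str(42).encode("ascii"))
binary_result = try_binary(42)

print(text_result)
print(binary_result)
```

In this model the text path yields 42, while the binary path fails because the receiver finds 4 bytes where it needs 8. The same reasoning applies regardless of whether the bytes arrive during table sync or during apply.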

> I'll suggest the change below for your sentence in logical-replication.sgml.
> FROM:
> In binary case, it is not allowed to replicate data between different types due to restrictions inherited from COPY.
> TO:
> Binary format is type specific and does not allow to replicate data between different types according to its
> restrictions.

In this case, this change makes sense, since this patch does not actually introduce this issue. It already exists in HEAD too.
 
> If my idea above is correct, then I feel we can remove all the fixes for create_subscription.sgml.
> I'm not sure if I should pursue this aspect of the documentation improvement
> any further after this email, since it isn't essentially caused by this patch.
 
I'm only keeping the following change in create_subscription.sgml, to indicate that the binary option now also copies the initial data in binary format.
-          Specifies whether the subscription will request the publisher to
-          send the data in binary format (as opposed to text).
+          Specifies whether the subscription will copy the initial data to
+          synchronize relations in binary format and also request the publisher
+          to send the data in binary format too (as opposed to text).

 
> The concern with upgrade (if data types are not the same) would be not being
> able to create a new subscription with binary enabled or replicate new tables
> added into publication.
> Replication of tables from existing subscriptions would not be affected by this
> change since they will already be in the apply phase, not tablesync.
> Do you think this would still be an issue?
> Okay, thanks for explaining this. I understand that
> the upgrade concern applies to the table sync that is executed
> between text format (before the patch) and binary format (after the patch).

I was thinking apply would work with different types in binary format.
Since apply would not work either, the scenario that I tried to explain earlier is no longer a concern.


Attached is the updated version of this patch.

Thanks,
Melih
 
