Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Allow logical replication to copy tables in binary format
Date
Msg-id CAGPVpCSegnqzEJiCEPhXY3WFKfC_3=3h+hSCfc4=6qRAfmK+rQ@mail.gmail.com
Whole thread Raw
In response to Re: Allow logical replication to copy tables in binary format  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
Hi,

Attached v11.

vignesh C <vignesh21@gmail.com>, 28 Şub 2023 Sal, 13:59 tarihinde şunu yazdı:
Thanks for the patch, Few comments:
1) Are primary key required for the tables, if not required we could
remove the primary key which will speed up the test by not creating
the indexes and inserting in the indexes. Even if required just create
for one of the tables:

I think that having a primary key in tables for logical replication tests is good for checking if log. rep. duplicates any row. Other tests also have primary keys in almost all tables.

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 28 Şub 2023 Sal, 15:27 tarihinde şunu yazdı:
1.
+      <para>
+       If true, initial data synchronization will be performed in binary format
+      </para></entry>
It's not just the initial table sync right? The table sync can happen
at any other point of time when ALTER SUBSCRIPTION ... REFRESH
PUBLICATION WITH (copy = true) is run.
How about - "If true, the subscriber requests publication for
pre-existing data in binary format"?

I changed it as you suggested.
I sometimes feel like the phrase "initial sync" is used for initial sync of a table, not a subscription. Or table syncs triggered by ALTER SUBSCRIPTION are ignored in some places where "initial sync" is used.

2.
Perhaps, this should cover the recommended cases for enabling this new
option - something like below (may not need to have exact wording, but
the recommended cases?):
"It is recommended to enable this option only when 1) the column data
types have appropriate binary send/receive functions, 2) not
replicating between different major versions or different platforms,
3) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on), 4) both publisher and subscriber supports COPY with binary
option, otherwise the table copy can fail."

I added a line stating that binary format is less portable across machine architectures and versions as stated in COPY [1].
I don't think we should add line saying "recommended", but state the restrictions clearly instead. It's also similar in COPY docs as well.
I think the explanation now covers all your points, right?
 
Jelte Fennema <postgres@jeltef.nl>, 28 Şub 2023 Sal, 16:25 tarihinde şunu yazdı:
> 3. I think the newly added tests must verify if the binary COPY is
> picked up when enabled. Perhaps, looking at the publisher's server log
> for 'COPY ... WITH BINARY format'? Maybe it's an overkill, otherwise,
> we have no way of testing that the option took effect.

Another way to test that BINARY is enabled could be to trigger one
of the failure cases.

Yes, there is already a failure case for binary copy which resolves with swithcing binary_copy to false.
But I also added checks for publisher logs now too.  



Thanks,
--
Melih Mutlu
Microsoft
Attachment

pgsql-hackers by date:

Previous
From: Maxim Orlov
Date:
Subject: Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
Next
From: Damir Belyalov
Date:
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)