Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Allow logical replication to copy tables in binary format
Date
Msg-id CAGPVpCQYi9AYQSS=RmGgVNjz5ZEnLB8mACwd9aioVhLmbgiAMA@mail.gmail.com
Whole thread Raw
In response to Re: Allow logical replication to copy tables in binary format  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses RE: Allow logical replication to copy tables in binary format
List pgsql-hackers
Hi,

Please see the attached patch for following changes. 

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>, 30 Oca 2023 Pzt, 15:34 tarihinde şunu yazdı:
On Mon, Jan 30, 2023 at 4:19 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote: 
It'd be better and clearer to have a separate TAP test file IMO since
the focus of the feature here isn't to just test for data types. With
separate tests, you can verify "ERROR:  incorrect binary data format
in logical replication column 1" cases.

Moved some tests from 002_types.pl to 014_binary.pl since this is where most binary features are tested. It covers now "incorrect data format" cases too.
Also added some regression tests for copy_format parameter.
 
With the above said, do you think checking for publisher versions is
needed? The user can decide to enable/disable binary COPY based on the
publisher's version no?
+    /* If the publisher is v16 or later, specify the format to copy data. */
+    if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000)
+    {

If the user decides to enable it, then it might be nice to not allow it for previous versions. 
But I'm not sure. I'm okay to remove it if you all agree.
 
Few more comments on v7:
1.
+          Specifies the format in which pre-existing data on the publisher will
+          copied to the subscriber. Supported formats are
+          <literal>text</literal> and <literal>binary</literal>. The default is
+          <literal>text</literal>.
It'll be good to call out the cases in the documentation as to where
copy_format can be enabled and needs to be disabled.

Modified that description a bit. Can you check if that's okay now?
 
2.
+             errmsg("%s value should be either \"text\" or \"binary\"",
How about "value must be either ...."?
 
Done
 
3.
Why should the subscription's binary option and copy_format option be
tied at all? Tying these two options hurts usability. Is there a
fundamental reason? I think they both are/must be independent. One
deals with data types and another deals with how initial table data is
copied.

My initial purpose with this patch was just making subscriptions with binary option enabled fully binary from initial copy to apply. Things have changed a bit when we decided to move binary copy behind a parameter.
I didn't actually think there would be any use case where a user wants the initial copy to be in binary format for a sub with binary = false. Do you think it would be useful to copy in binary even for a sub with binary disabled?

Thanks,
--
Melih Mutlu
Microsoft
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Next
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum