Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Allow logical replication to copy tables in binary format
Msg-id CAGPVpCQxgH1WQCxUE0kts+TajXdS3bp9ohhX=M8F_SZY4J7hNQ@mail.gmail.com
In response to Re: Allow logical replication to copy tables in binary format  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses RE: Allow logical replication to copy tables in binary format  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Re: Allow logical replication to copy tables in binary format  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On Wed, 1 Mar 2023 at 15:02, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 1, 2023 at 4:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I agree with this thought, basically adding an extra option will
> always complicate things for the user.  And logically it doesn't make
> much sense to copy data in text mode and then stream in binary mode
> (except in some exception cases and for that, we can always alter the
> subscription).  So IMHO it makes more sense that if the binary option
> is selected then ideally it should choose to do the initial sync also
> in the binary mode.

I agree that copying in text then streaming in binary does not have a good use-case.

I think I was suggesting earlier to use a separate option for binary
table sync copy based on my initial knowledge of binary COPY. Now that
I have a bit more understanding of binary COPY and subscription's
existing binary option, +1 for using the same option for table sync
too.

If the existing subscription binary option is used for table sync,
users have the following possibilities:
1. users might want to enable the binary option for table sync and
disable it for subsequent replication
2. users might want to enable the binary option for both table sync
and for subsequent replication
3. users might want to disable the binary option for table sync and
enable it for subsequent replication
4. users might want to disable binary option for both table sync and
for subsequent replication

Binary copy use-cases are a bit narrower than those of the existing
subscription binary option; binary copy works only if:
a) the column data types have appropriate binary send/receive functions
b) not replicating between different major versions or different platforms
c) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on)
d) both publisher and subscriber support COPY with the binary option
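
Condition (a) can be checked from the system catalogs. The query below is a minimal sketch, not part of the patch: it lists the columns of one table whose type lacks a binary send or receive function. The table name public.my_table is hypothetical; run it on the publisher for send functions and on the subscriber for receive functions.

```sql
-- Columns of a (hypothetical) table whose type has no binary
-- send or receive function registered in pg_type.
SELECT a.attname AS column_name,
       t.typname AS type_name
FROM pg_catalog.pg_attribute AS a
JOIN pg_catalog.pg_type AS t ON t.oid = a.atttypid
WHERE a.attrelid = 'public.my_table'::regclass
  AND a.attnum > 0              -- skip system columns
  AND NOT a.attisdropped        -- skip dropped columns
  AND (t.typsend::oid = 0 OR t.typreceive::oid = 0);
```

An empty result means every column type has both functions; it says nothing about conditions (b), (c), or (d).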

Now, if someone enables the binary option for table sync, they must
have ensured that (a), (b), (c), and (d) are all met. The point is
that if binary copy works for table sync, the subsequent binary
replication will work too, without any problem. If required, one can
disable it for normal replication, i.e. after table sync.
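
As a sketch of that workflow, assuming the approach discussed here in which the existing binary option also governs the initial table sync: the subscription is created with binary enabled and switched back to text once the sync has finished. The names my_sub and my_pub and the connection string are hypothetical.

```sql
-- Create the subscription with the binary option enabled.
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher.example dbname=appdb'
    PUBLICATION my_pub
    WITH (binary = true);

-- Later, once the initial table sync is done, fall back to text
-- format for ongoing replication if that is preferred.
ALTER SUBSCRIPTION my_sub SET (binary = false);
```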

That was my intention with this patch in the beginning. Later, a new option also seemed reasonable, so I added the copy_binary option in response to reviews.
The earlier versions of the patch didn't have it. Without the new option, the patch would also be smaller.

But before going back to tying everything to the binary option, with no new option, I think we should decide whether that's really the ideal way to do it.
I believe the patch is in good shape now with the binary_copy option, which is not tied to anything, along with the explanations in the docs and the separate tests.
But I also agree that binary=true should do everything in binary and binary=false should do everything in text format. That makes more sense.

Best,
--
Melih Mutlu
Microsoft
