Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Allow logical replication to copy tables in binary format
Date
Msg-id CALj2ACXiUsJoXt=fMpa4yYseB5h3un_syVh-J3RxL4-6r9Dx2A@mail.gmail.com
In response to Re: Allow logical replication to copy tables in binary format  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Allow logical replication to copy tables in binary format
List pgsql-hackers
On Wed, Mar 1, 2023 at 4:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > > walsender ERROR:  no binary output function available for type public.myvarchar
> > > walsender STATEMENT:  COPY public.tbl1 (a) TO STDOUT  WITH (FORMAT binary)
> > >
> >
> > Thanks for sharing the example. I think to address this user can
> > create a SUBSCRIPTION with 'binary = false' and then after the initial
> > copy enables it with ALTER SUBSCRIPTION.  Personally, I feel it is not
> > required to have a separate option to allow copy in binary mode. Note,
> > where there is some use for it but having more options for similar
> > work is also confusing as users need to pay attention to different
> > options and their values. It won't be difficult to add such an option
> > in the future if we see such cases and or users specifically require
> > something like this.
>
> I agree with this thought, basically adding an extra option will
> always complicate things for the user.  And logically it doesn't make
> much sense to copy data in text mode and then stream in binary mode
> (except in some exception cases and for that, we can always alter the
> subscription).  So IMHO it makes more sense that if the binary option
> is selected then ideally it should choose to do the initial sync also
> in the binary mode.

Earlier I suggested a separate option for binary table sync copy,
based on my initial understanding of binary COPY. Now that I
understand binary COPY and the subscription's existing binary option
a bit better, +1 for using the same option for table sync too.

If we use the existing subscription binary option for table sync as
well, users may want any of the following combinations:
1. users might want to enable the binary option for table sync and
disable it for subsequent replication
2. users might want to enable the binary option for both table sync
and for subsequent replication
3. users might want to disable the binary option for table sync and
enable it for subsequent replication
4. users might want to disable the binary option for both table sync
and for subsequent replication
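
To make the combinations concrete, here is a rough sketch of case 3
(text-format table sync, binary streaming afterwards) using only the
existing binary option. The names mysub and mypub and the connection
string are hypothetical placeholders, not taken from this thread.

```sql
-- Case 3: initial table sync in text format, binary streaming afterwards.
-- mysub, mypub and the connection string are hypothetical.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher.example dbname=postgres'
    PUBLICATION mypub
    WITH (binary = false);

-- Later, once the initial table sync has finished:
ALTER SUBSCRIPTION mysub SET (binary = true);
```

Cases 2 and 4 need no ALTER at all: the subscription is simply created
with binary = true or binary = false and left that way.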

The use cases for binary copy are a bit narrower than those for the
existing subscription binary option. It works only if:
a) the column data types have appropriate binary send/receive functions
b) replication is not between different major versions or different
platforms
c) both publisher and subscriber tables have the exact same column
types (not when replicating from smallint to int or numeric to int8
and so on)
d) both publisher and subscriber support COPY with the binary option
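
As a rough way to check condition (a) ahead of time, one could look
for columns whose type has no send or receive function in pg_type.
This is only a sketch: it uses public.tbl1 from the error quoted
above, it inspects just the column's own type (not array element or
domain base types), and it would need to be run on both publisher and
subscriber.

```sql
-- Columns of public.tbl1 whose data type lacks a binary send or
-- receive function (typsend/typreceive are zero when none exists).
SELECT a.attname, t.typname
FROM pg_attribute a
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attrelid = 'public.tbl1'::regclass
  AND a.attnum > 0            -- skip system columns
  AND NOT a.attisdropped      -- skip dropped columns
  AND (t.typsend::oid = 0 OR t.typreceive::oid = 0);
```

An empty result says nothing about conditions (b), (c) and (d); those
still have to be verified separately.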

Now, if one enables the binary option for table sync, they must have
ensured that (a), (b), (c), and (d) are all met. The point is that if
binary copy is usable for table sync, then the subsequent binary
replication will work without any problem too. If required, one can
still disable the option for normal replication, i.e. after table
sync.
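
For illustration, disabling the option after table sync could look
roughly like the following. It assumes the behavior proposed in this
thread (the binary option also governing the initial copy), and the
subscription name mysub is hypothetical.

```sql
-- Tables of subscription mysub that have not yet reached the
-- 'r' (ready) state, i.e. whose initial sync is still in progress.
SELECT sr.srrelid::regclass AS table_name, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE s.subname = 'mysub'
  AND sr.srsubstate <> 'r';

-- Once the query above returns no rows, switch streaming to text.
ALTER SUBSCRIPTION mysub SET (binary = false);
```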

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


