Re: Allow logical replication to copy tables in binary format - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: Allow logical replication to copy tables in binary format
Msg-id 2ebc7ea8-3c46-474e-aea7-5d73ff6165fb@www.fastmail.com
In response to Allow logical replication to copy tables in binary format  (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses Re: Allow logical replication to copy tables in binary format
List pgsql-hackers
On Wed, Aug 10, 2022, at 12:03 PM, Melih Mutlu wrote:
> I see that logical replication subscriptions have an option to enable binary [1].
> When it's enabled, subscription requests publisher to send data in binary format.
> But this is only the case for apply phase. In tablesync, tables are still copied as text.
This option could have been included in the commit 9de77b54531; it wasn't.
Maybe it wasn't considered because the initial table synchronization can be a
separate step in your logical replication setup; I don't know. I agree that the
binary option should be available for the initial table synchronization.

> To copy tables, COPY command is used and that command supports copying in binary. So it seemed to me possible to copy in binary for tablesync too.
> I'm not sure if there is a reason to always copy tables in text format. But I couldn't see why not to do it in binary if it's enabled.
The reason to use text format is that it is less error prone. There are
restrictions while using the binary format. For example, if the publisher and
the subscriber have different data types for a certain column, the copy will
fail. Even with such restrictions, I think it is worth adding it.
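
To illustrate the data type restriction, here is a minimal sketch in Python.
It is a toy model, not PostgreSQL code: the function names are hypothetical,
and the only assumption carried over from the server is that an int4 travels
as 4 bytes and an int8 as 8 bytes in binary form, both in network byte order.
The publisher column is int4 and the subscriber column is int8.

```python
import struct


def send_int4_text(value: int) -> str:
    """Toy model: text form of an int4 value (just its decimal digits)."""
    return str(value)


def recv_int8_text(payload: str) -> int:
    """Toy model: an int8 column parsing text input; any integer string fits."""
    return int(payload)


def send_int4_binary(value: int) -> bytes:
    """Toy model: binary form of an int4 value, 4 bytes in network byte order."""
    return struct.pack("!i", value)


def recv_int8_binary(payload: bytes) -> int:
    """Toy model: an int8 column reading binary input; needs exactly 8 bytes."""
    if len(payload) != 8:
        # Hypothetical message; the real server reports its own error text.
        raise ValueError("binary payload does not match the int8 layout")
    return struct.unpack("!q", payload)[0]


if __name__ == "__main__":
    # Text: the int4 value is re-parsed by the int8 column and succeeds.
    print(recv_int8_text(send_int4_text(42)))

    # Binary: the 4-byte payload does not match the 8-byte layout and fails.
    try:
        recv_int8_binary(send_int4_binary(42))
    except ValueError as exc:
        print("binary copy failed:", exc)
```

The text path works because the value is parsed again on the receiving side,
while the binary path depends on both sides agreeing on the exact layout.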

> You can find the small patch that only enables binary copy attached.
I have a few points about your implementation.

* Are we considering supporting prior Postgres versions too? These releases
  support binary mode, but an initial sync in binary mode could be unexpected
  behavior for a publisher using 14 or 15 and a subscriber using 16. IMO
  you should only allow it for a publisher on 16 or later.
* Docs should say that the binary option also applies to initial table
  synchronization and possibly emphasize some of the restrictions.
* Tests. Are the current tests (014_binary.pl) enough?
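
The version restriction from the first point could be sketched roughly as
below. This is Python for illustration only, not the patch: the function name
and parameters are hypothetical, and the threshold of 160000 assumes the usual
server_version_num encoding (for example 150004 for 15.4, 160000 for 16.0).

```python
def choose_copy_format(binary_enabled: bool, publisher_version_num: int,
                       min_binary_copy_version: int = 160000) -> str:
    """Pick the format for the initial table copy.

    Hypothetical helper: use binary only when the subscription's binary
    option is enabled AND the publisher is new enough; otherwise fall back
    to text, which is what older releases do for tablesync.
    """
    if binary_enabled and publisher_version_num >= min_binary_copy_version:
        return "binary"
    return "text"


if __name__ == "__main__":
    # Publisher on 16 with binary enabled.
    print(choose_copy_format(True, 160000))
    # Publisher on 15.4 with binary enabled: keep the old text behavior.
    print(choose_copy_format(True, 150004))
    # Binary option disabled: always text.
    print(choose_copy_format(False, 160000))
```

Falling back to text for an older publisher keeps the initial sync behaving
the same way it does today in a mixed-version setup.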


--
Euler Taveira
