Re: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id 3a4504df-3f8e-c21a-fff4-cb7a6d531c57@timescale.com
Whole thread Raw
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On 3/30/23 20:01, Peter Smith wrote:
> For example, Just imagine if logic could be made smarter to recognize
> that since there was already the 'part_def' being subscribed so it
> should NOT use the default 'copy_data=true' when the REFRESH launches
> the ancestor table 'part'...
> 
> Even if that logic was implemented, I have a feeling you could *still*
> run into problems if the 'part' table was made of multiple partitions.
> I think you might get to a situation where you DO want some partition
> data copied (because you did not have it yet but now you are
> subscribing to the root you want it) while at the same time, you DON'T
> want to get duplicated data from other partitions (because you already
> knew about those ones  -- like your example does).

Hm, okay. My interest here is mainly because my logical-roots proposal
generalizes the problem (and therefore makes it worse).

For what it's worth, that patchset introduces the ability for the
subscriber to sync multiple tables into one. I wonder if that could be
used somehow to help fix this problem too?

> At least, we need to check there are sufficient "BE CAREFUL" warnings
> in the documentation for scenarios like this.

Agreed. These are sharp edges.

Thanks,
--Jacob



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: running logical replication as the subscription owner
Next
From: Jacob Champion
Date:
Subject: Re: Data is copied twice when specifying both child and parent table in publication