Re: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id CAJcOf-fZTvpQ8X0ZtZbR4fCDAXmuXdSsFYvyRLmCY5tN1QDF8w@mail.gmail.com
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Data is copied twice when specifying both child and parent table in publication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > I have not debugged it yet to find out why, but with the patch
> > applied, the original double-publish problem that I reported
> > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > occurs.
> >
>
> Yeah, I think this is a variant of the problem being fixed by
> Hou-San's patch. I think one possible idea to investigate is that on
> the subscriber-side, after fetching tables, we check the already
> subscribed tables and if the child tables already exist then we ignore
> the parent table and vice versa. We might want to consider the case
> where a user has toggled the "publish_via_partition_root" parameter.
>
> It seems both these behaviours/problems exist since commit 17b9e7f9
> (Support adding partitioned tables to publication). Adding Amit L and
> Peter E (people involved in this work) to know their opinion?
>

Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published through the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But once the partitioned table is added to the publication and
the publication is refreshed on the subscriber, the data is
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
on the subscriber may well be a non-partitioned table (or have its
partitions arranged differently), so the data does need to be
replicated again.
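
For reference, the sequence described above can be sketched roughly as
follows. This is only an illustration, not the exact steps from my
earlier mail: the object names (parent, child1, pub, sub) and the
connection string are made up, and it assumes the subscriber has the
same partition layout as the publisher.

```sql
-- Publisher (all object names are hypothetical)
CREATE TABLE parent (id int) PARTITION BY RANGE (id);
CREATE TABLE child1 PARTITION OF parent FOR VALUES FROM (0) TO (100);

-- Only the partition is in the publication at first, so
-- publish_via_partition_root = true has nothing to act on yet.
CREATE PUBLICATION pub FOR TABLE child1
    WITH (publish_via_partition_root = true);
INSERT INTO parent VALUES (1);

-- Subscriber (same table layout assumed; connection string is a placeholder)
CREATE TABLE parent (id int) PARTITION BY RANGE (id);
CREATE TABLE child1 PARTITION OF parent FOR VALUES FROM (0) TO (100);
CREATE SUBSCRIPTION sub
    CONNECTION 'host=publisher_host dbname=postgres'
    PUBLICATION pub;
-- The row is copied and replicated as a change to child1.

-- Publisher: now add the partitioned table itself.
ALTER PUBLICATION pub ADD TABLE parent;

-- Subscriber: pick up the newly published table.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
-- "parent" is new to the subscription, so its data is copied again,
-- this time using the identity and schema of the partitioned table.
SELECT * FROM parent;
```

If the subscriber's "parent" were instead a plain table, or partitioned
differently, the second copy is the only way the existing rows would
reach it.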

Regards,
Greg Nancarrow
Fujitsu Australia


