Re: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id CAA4eK1KB8RkBUv_E2FiQc=y3S6ECKcj77HyuiMxYTH90C6J-Mw@mail.gmail.com
Whole thread Raw
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Oct 18, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > Looking at this command in Hou-san's email:
> >
> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);
> >
> > It's adding both the root partitioned table and the leaf partition
> > *explicitly*, and it's not clear to me if the latter's inclusion in
> > the publication should be assumed because the former is found to have
> > been added to the publication, that is, as far as the latter's
> > visibility to the subscriber is concerned.  It's not a stretch to
> > imagine that a user may write the command this way to account for a
> > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> >
> > I don't think we assume anything on the publisher side regarding the
> > state/configuration of tables on the subscriber side, at least with
> > publication commands where tables are added to a publication
> > explicitly, so it is up to the user to make sure that the tables are
> > not added duplicatively.  One may however argue that the way we've
> > decided to handle FOR ALL TABLES does assume something about
> > partitions where it skips advertising them to subscribers when
> > publish_via_partition_root flag is set to true, but that is exactly to
> > avoid the duplication of data that goes to a subscriber.
> >
>
> I think the same confusion will then apply to the new feature (For All
> Tables In Schema) being discussed [1] (that is a bit long thread so
> shared the email where the latest patch version is posted). There
> also, the partitioned table and partition can be in a different
> schema.
>

Sorry, I wanted to say that table partition and partitioned table can
be in the same schema. Now, if the user publishes all tables in a
schema, if we want to follow the "For All Tables" behavior then we
should skip the leaf table and publish only the parent table, OTOH, if
want to follow "For Table" behavior, we need to publish both
partitioned table and partition table. I feel it is better to be
consistent here in all three cases ("For Table", "For All Tables", and
"For All Tables In Schema") as it will be easier to explain and
document it.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "Bossart, Nathan"
Date:
Subject: Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable
Next
From: Amit Kapila
Date:
Subject: Re: Skipping logical replication transactions on subscriber side