Re: Added schema level support for publication. - Mailing list pgsql-hackers

From vignesh C
Subject Re: Added schema level support for publication.
Date
Msg-id CALDaNm17CMnqZ2L9p_t0hbhVEZG33MP5yh4UbV6+TAD7fbfS2w@mail.gmail.com
Whole thread Raw
In response to Re: Added schema level support for publication.  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Added schema level support for publication.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sat, Aug 28, 2021 at 3:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 6:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 4:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 27, 2021 at 11:43 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 25, 2021 at 3:07 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > I have implemented this in the 0003 patch, I have kept it separate to
> > > > reduce the testing effort and also it will be easier if someone
> > > > disagrees with the syntax. I will merge it to the main patch later
> > > > based on the feedback. Attached v22 patch has the changes for the
> > > > same.
> > > >
> > >
> > > Few comments on v22-0001-Added-schema-level-support-for-publication:
> > > ========================================================
> > > 1. Why in publication_add_schema(), you are registering invalidation
> > > for all schema relations? It seems this is to allow rebuilding the
> > > publication info for decoding sessions. But that is not required as
> > > you are registering rel_sync_cache_publication_cb for
> > > publication_schema relation. In rel_sync_cache_publication_cb, we are
> > > marking replicate_valid as false for each entry which will allow
> > > publication info to be rebuilt in get_rel_sync_entry.
> > >
> > > I see that it is done for a single relation in the current code in
> > > function publication_add_relation but I guess that is also not
> > > required. You can once test this. If you still think it is required,
> > > can you please explain me why and then we can accordingly add some
> > > comments in the patch.
> >
> > I felt this is required for handling the following concurrency scenario:
> > create schema sch1;
> > create table sch1.t1(c1 int);
> > insert into sch1.t1 values(10);
> > update sch1.t1 set c1 = 11;
> > # update will be successful and relation cache will update publication
> > actions based on the current state i.e no publication
> > create publication pub1 for all tables in schema sch1;
> > # After the above publication is created the relations present in this
> > schema should be invalidated so that the next update should fail.
> >
>
> What do you mean by update should fail? I think all the relations in
> RelationSyncCache via rel_sync_cache_publication_cb because you have
> updated pg_publication_schema and that should register syscache
> invalidation.

By update should fail, I meant the updates without setting replica
identity before creating the decoding context. The scenario is like
below (all in the same session, the subscription is not created):
create schema sch1;
create table sch1.t1(c1 int);
insert into sch1.t1 values(10);
# Before updating we will check CheckCmdReplicaIdentity, as there are
no publications on this table rd_pubactions will be set accordingly in
relcache entry.
update sch1.t1 set c1 = 11;
# Now we will create the publication after rd_pubactions has been set
in the cache. Now when we create this publication we should invalidate
the relations present in the schema, this is required so that when the
next update happens, we should check the publication actions again in
CheckCmdReplicaIdentity and fail the update which does not  set
replica identity after the publication is created.
create publication pub1 for all tables in schema sch1;
# After the above publication is created the relations present in this
schema will be invalidated. Now we will check the publication actions
again in CheckCmdReplicaIdentity and the update will fail.
update sch1.t1 set c1 = 11;
ERROR:  cannot update table "t1" because it does not have a replica
identity and publishes updates
HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

The rel_sync_cache_publication_cb invalidations are registered once
the decoding context is created. Since the decoding context is not
created at this point of time we should explicitly invalidate all the
relations present in this schema by calling InvalidatePublicationRels.

> > If
> > the relations are not invalidated the updates will be successful based
> > on previous publication actions.
> > update sch1.t1 set c1 = 11;
> >
>
> I think even without special relcache invalidations the relations
> should be invalidated because of syscache invalidation as mentioned in
> the previous point. Am I missing something here?

The rel_sync_cache_publication_cb invalidations are registered once
the decoding context is created. Since the decoding context is not
created at this point of time we should explicitly invalidate all the
relations present in this schema by calling InvalidatePublicationRels.
This is to prevent updates for which the replica identity is not set,
I have mentioned more details above.

> >
> > > 5. Don't we need to expose a view for publication schemas similar to
> > > pg_publication_tables?
> >
> > pg_publication_tables is a common view for both "FOR TABLE", "FOR ALL
> > TABLES" and "FOR ALL TABLES IN SCHEMA", this view will internally
> > access pg_publication_rel and pg_publication_schema to get the
> > corresponding tables. I felt we don't need a separate view for
> > publication schemas. Thoughts?
> >
>
> Don't you think some users might want to know all the schema names for
> a publication? I am not completely sure on this point but I think it
> is good to have information for users. It might be also useful to have
> pg_publication_objects where we can display object types (like table,
> schema, sequence, etc) and then object names. If you are not convinced
> then we can wait and see what others think about this.

Ok, I will add this view in the next version.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Next
From: Andres Freund
Date:
Subject: Re: archive status ".ready" files may be created too early