Re: [BUG] Unexpected action when publishing partition tables - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [BUG] Unexpected action when publishing partition tables
Date
Msg-id CAA4eK1JS7dcsrRvPVT36kBu3QtJh0WP_JjN7LfeTU5xJCNLd=A@mail.gmail.com
Whole thread Raw
In response to Re: [BUG] Unexpected action when publishing partition tables  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: [BUG] Unexpected action when publishing partition tables
List pgsql-hackers
On Wed, Oct 13, 2021 at 9:11 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> Hi Amit,
>
> On Fri, Oct 8, 2021 at 12:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > Sorry that I didn't comment on this earlier, but I think either
> > > GetPubPartitionOptionRelations() or InvalidatePublicationRels()
> > > introduced in the commit 4548c76738b should lock the partitions, just
> > > like to the parent partitioned table would be, before invalidating
> > > them.  There may be some hazards to invalidating a relation without
> > > locking it.
> >
> > I see your point but then on the same lines didn't the existing code
> > "for all tables" case (where we call CacheInvalidateRelcacheAll()
> > without locking all relations) have a similar problem.
>
> There might be.  I checked to see how other callers/modules use
> CacheInvalidateRelcacheAll(), though it seems that only the functions
> in publicationcmds.c use it or really was invented in 665d1fad99e for
> use by publication commands.
>
> Maybe I need to look harder than I've done for any examples of hazard.
>
> > Also, in your
> > patch, you are assuming that the callers of GetPublicationRelations()
> > will lock all the relations but what when it gets called from
> > AlterPublicationOptions()?
>
> Ah, my bad.  I hadn't noticed that one for some reason.
>
> Now that you mention it, I do find it somewhat concerning (on the
> similar grounds as what prompted my previous email) that
> AlterPublicationOptions() does away with any locking on the affected
> relations.
>
> Anyway, I'll think a bit more about the possible hazards of not doing
> the locking and will reply again if there's indeed a problem(s) that
> needs to be fixed.
>

I think you can try to reproduce the problem via the debugger. You can
stop before calling GetPubPartitionOptionRelations in
publication_add_relation()  in session-1 and then from another session
(say session-2) try to delete one of the partition table (without
replica identity).  Then stop in session-2 somewhere after acquiring
lock to the corresponding partition relation. Now, continue in
session-1 and invalidate the rels and let it complete the command. I
think session-2 will complete the update without processing the
invalidations.

If the above is true, then, this breaks the following behavior
specified in the documentation: "The tables added to a publication
that publishes UPDATE and/or DELETE operations must have REPLICA
IDENTITY defined. Otherwise, those operations will be disallowed on
those tables.". Also, I think such updates won't be replicated on
subscribers as there is no replica identity or primary key column.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: try_relation_open and relation_open behave different.
Next
From: Amit Kapila
Date:
Subject: Re: [BUG] Unexpected action when publishing partition tables