Re: pg_get_publication_tables() output duplicate relid - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: pg_get_publication_tables() output duplicate relid |
Date | |
Msg-id | CA+HiwqHYHsRZje7gfE49_ahMhVE=jyrBhU0uOWtu2cub_cFyvw@mail.gmail.com Whole thread Raw |
In response to | Re: pg_get_publication_tables() output duplicate relid (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: pg_get_publication_tables() output duplicate relid
|
List | pgsql-hackers |
On Sat, Nov 20, 2021 at 8:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Nov 19, 2021 at 10:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote <amitlangote09@gmail.com> wrote: > > > The problematic case is attaching the partition *after* the subscriber > > > has already marked the root parent as synced and/or ready for > > > replication. Refreshing the subscription doesn't help it discover the > > > newly attached partition, because a publish_via_partition_root only > > > ever tells about the root parent, which would be already synced, so > > > the subscriber would think there's nothing to copy. > > > > Okay, I see this could be a problem but I haven't tried to reproduce it. > > > > > > Anyway, if this is a problem > > > > we need to figure the solution for this separately. > > > > > > Sure, we might need to do that after all. Though it might be a good > > > idea to be sure that we won't need to reconsider the fix we push for > > > the issue(s) being discussed here and elsewhere, because I suspect > > > that the solution to the problem I mentioned is likely to involve > > > tweaking pg_publication_tables view output. > > I have thought about this problem and I see two possibilities for a > solution (a) We could provide a new option say 'truncate' (something > on lines proposed here [1]) which would truncate the table(s) and > change its status to 'i' in the pg_subscription_rel, this would allow > the newly added partition to be synced after refresh. This could lead > to a large copy in such a case. Maybe I am missing something about the proposal, though I'd think a more automatic solution would be better, something that doesn't need to rely on an unrelated feature. > (b) We could somehow get and store all > the partition info from the publisher-side on the subscriber-side > while initial sync (say in new system table > pg_subscription_rel_members). Now, after the refresh, if this list > changes, we can allow to just get the data of that particular > partition but I guess it would mean that we need to store oids of the > publisher which might or might not be safe considering oids can > wraparound before the refresh. > > Do you have any other ideas? I thought that the idea I had earlier mentioned at [1] may be useful, which I can see is similar to your idea (b). I also suspect that it can be implemented without needing a separate catalog and storing publication-side relation OIDs in the subscription-side catalog, though maybe I haven't thought hard enough. > One more thing you mentioned is that the initial sync won't work after > refresh but later changes will be replicated but I noticed that later > changes also don't get streamed till we restart the subscriber server. > I am not sure but we might not be invalidating apply workers cache due > to which it didn't notice the same. Oh, that sounds odd and, as you appear to say, a separate problem. I'll check. -- Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/CA%2BHiwqHnDHcT4OOcga9rDFyc7TvDrpN5xFH9J2pyHQo9ptvjmQ%40mail.gmail.com
pgsql-hackers by date: