Re: [PATCH] Add `truncate` option to subscription commands - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [PATCH] Add `truncate` option to subscription commands |
Date | |
Msg-id | CAA4eK1JfGACfcE8qC0FjFeQbTBKzHm4VpJ=rqTcOJN7REmBvTA@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Add `truncate` option to subscription commands (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
List | pgsql-hackers |
On Mon, May 24, 2021 at 2:10 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, May 24, 2021 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > 1) Whether a table the sync worker is trying to truncate is having any > > > referencing (foreign key) tables on the subscriber? If yes, whether > > > all the referencing tables are present in the list of subscription > > > tables (output of fetch_table_list)? In this case, the sync worker is > > > truncating the primary key/referenced table. > > > > > > One way to solve the above problem is by storing the table oids of the > > > subscription tables (output of fetch_table_list) in a new column in > > > the pg_subscription catalog (like subpublications text[] column). In > > > the sync worker, before truncation of a table, use > > > heap_truncate_find_FKs to get all the referencing tables of the given > > > table and get all the subscription tables from the new pg_subscription > > > column. If all the referencing tables exist in the subscription > > > tables, then truncate the table, otherwise don't, just skip it. > > > > > > > Here, silently skipping doesn't seem like a good idea when the user > > has asked to truncate the table. Shouldn't we allow it if the user has > > provided say cascade with a truncate option? > > We could do that. In that case, the truncate option just can't be a > boolean, but it has to be an enum accepting "restrict", "cascade", > maybe "restart identity" or "continue identity" too. I have a concern > here - what if the ExecuteTruncateGuts fails with the cascade option > for whatever reason? Should the table sync worker be trapped in that > error? > How is it any different from any other error we got during table sync (say PK violation, out of memory, or any other such error)? > > > > There > > > can be a problem here if there are many subscription tables, the size > > > of the new column in pg_susbcription can be huge. However, we can > > > choose to store the table ids in this new column only when the > > > truncate option is specified. > > > > > > Another way is to let each table sync worker scan the > > > pg_subscription_rel to get all the relations that belong to a > > > subscription. But I felt this was costly. > > > > > > > I feel it is better to use pg_subscription_rel especially because we > > will do so when the user has given the truncate option and note that > > we are already accessing it in sync worker for both reading and > > writing. See LogicalRepSyncTableStart. > > Note that in pg_subscription_rel, there can exist multiple rows for > each table for a given subscription. Say, t1 is a table that the sync > worker is trying to truncate and copy. Say, t1_dep1, t1_dep2, t1_dep3 > .... are the dependent tables (we can find these using > heap_truncate_find_FKs). Now, we need to see if all the t1_dep1, > t1_dep2, t1_dep3 .... tables are in the pg_suscription_rel with the > same subscription id, then only we can delete all of them with > EexecuteTruncateGuts() using cascade option. If any of the t1_depX is > either not in the pg_subscription_rel or it is subscribed in another > subscription, then is it okay if we scan pg_suscription_rel in a loop > with t1_depX relid's? > Why do you need to search in a loop? There is an index for relid, subid. > Isn't it costiler? Or since it is a cache > lookup, maybe that's okay? > > /* Try finding the mapping. */ > tup = SearchSysCache2(SUBSCRIPTIONRELMAP, > ObjectIdGetDatum(relid), > ObjectIdGetDatum(subid)); > > > One other problem discussed in this thread was what to do when the > > same table is part of multiple subscriptions and the user has provided > > a truncate option while operating on such a subscription. > > I think we can just skip truncating a table when it is part of > multiple subscriptions. We can tell this clearly in the documentation. > Okay, I don't have any better ideas at this stage. -- With Regards, Amit Kapila.
pgsql-hackers by date: