Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Column Filtering in Logical Replication |
Date | |
Msg-id | 7ce6a43a-830a-931f-d311-a5bfd444b3ae@enterprisedb.com Whole thread Raw |
In response to | Re: Column Filtering in Logical Replication (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On 3/9/22 11:03, Amit Kapila wrote: > ... >> Hmm, yeah. That seems like a genuine problem - it should not depend on >> the order of publications in the subscription, I guess. >> >> But is it an issue in the patch? Isn't that a pre-existing issue? AFAICS >> the problem is that we initialize publish_as_relid=relid before the loop >> over publications, and then just update it. So the first iteration >> starts with relid, but the second iteration ends with whatever value is >> set by the first iteration (e.g. the root). >> >> So with the example you posted, we start with >> >> publish_as_relid = relid = test_part1 >> >> but then if the first publication is pubviaroot=true, we update it to >> parent. And in the second iteration, we fail to find the column filter, >> because "parent" (publish_as_relid) is not part of the pub2. >> >> If we do it in the other order, we leave the publish_as_relid value as >> is (and find the filter), and then update it in the second iteration >> (and find the column filter too). >> >> Now, this can be resolved by re-calculating the publish_as_relid from >> scratch in each iteration (start with relid, then maybe update it). But >> that's just half the story - the issue is there even without column >> filters. Consider this example: >> >> create table t (a int, b int, c int) partition by range (a); >> >> create table t_1 partition of t for values from (1) to (10) >> partition by range (a); >> >> create table t_2 partition of t_1 for values from (1) to (10); >> >> create publication pub1 for table t(a) >> with (PUBLISH_VIA_PARTITION_ROOT); >> >> create publication pub2 for table t_1(a) >> with (PUBLISH_VIA_PARTITION_ROOT); >> >> >> Now, is you change subscribe to "pub1, pub2" and "pub2, pub1", we'll end >> up with different publish_as_relid values (t or t_1). Which seems like >> the same ambiguity issue. >> > > I think we should fix this existing problem by always using the > top-most table as publish_as_relid. Basically, we can check, if the > existing publish_as_relid is an ancestor of a new rel that is going to > replace it then we shouldn't replace it. Right, using the topmost relid from all publications seems like the correct solution. > However, I think even if we > fix the existing problem, we will still have the order problem for the > column filter patch, and to avoid that instead of fetching column > filters in the publication loop, we should use the final > publish_as_relid. I think it will have another problem as well if we > don't use final publish_as_relid which is that sometimes when we > should not use any filter (say when pubviaroot is true and that > publication has root partitioned table which has no filter) as per our > rule of filters for a partitioned table, it can still use some filter > from the non-root table. > Yeah, the current behavior is just a consequence of how we determine publish_as_relid now. If we rework that, we should first determine the relid and then fetch the filter only for that single rel. >> >>> * >>> Fetching column filter info in tablesync.c is quite expensive. It >>> seems to be using four round-trips to get the complete info whereas >>> for row-filter we use just one round trip. I think we should try to >>> get both row filter and column filter info in just one round trip. >>> >> >> Maybe, but I really don't think this is an issue. >> > > I am not sure but it might matter for small tables. Leaving aside the > performance issue, I think the current way will get the wrong column > list in many cases: (a) The ALL TABLES IN SCHEMA case handling won't > work for partitioned tables when the partitioned table is part of one > schema and partition table is part of another schema. (b) The handling > of partition tables in other cases will fetch incorrect lists as it > tries to fetch the column list of all the partitions in the hierarchy. > > One of my colleagues has even tested these cases both for column > filters and row filters and we find the behavior of row filter is okay > whereas for column filter it uses the wrong column list. We will share > the tests and results with you in a later email. We are trying to > unify the column filter queries with row filter to make their behavior > the same and will share the findings once it is done. I hope if we are > able to achieve this that we will reduce the chances of bugs in this > area. > OK, I'll take a look at that email. > Note: I think the first two patches for tests are not required after > commit ceb57afd3c. > Right. Will remove. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: