Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables |
Date | |
Msg-id | CAFjFpRf4wXhQeKhi4NhA9B3vo8BK99zkm-o8tEBq3qb8cwbLzQ@mail.gmail.com |
In response to | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Tue, Mar 14, 2017 at 5:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Mar 13, 2017 at 3:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Haven't looked at 0007 yet.
>
> +    if (rel->part_scheme)
> +    {
> +        int        cnt_parts;
> +
> +        for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
> +        {
> +            if (rel->part_oids[cnt_parts] == childRTE->relid)
> +            {
> +                Assert(!rel->part_rels[cnt_parts]);
> +                rel->part_rels[cnt_parts] = childrel;
> +            }
> +        }
> +    }
>
> It's not very appealing to use an O(n^2) algorithm here.  I wonder if
> we could arrange things so that inheritance expansion expands
> partitions in the right order, and then we could just match them up
> one-to-one.  This would probably require an alternate version of
> find_all_inheritors() that expand_inherited_rtentry() would call only
> for partitioned tables.

That seems a much better solution, but right now, when we expand a
multi-level partitioned table, we include indirect partitions as direct
children in the inheritance hierarchy. The part_rels array, OTOH, should
correspond to the partitioning scheme and should hold the RelOptInfos of
direct partitions. The 0013 patch fixes that by including only direct
partitions as direct children, preserving the partitioning hierarchy in
the inheritance hierarchy. That patch currently uses
find_inheritance_children() to get the OIDs of direct partitions, but it
could instead return rd_partdesc->oids in the form of a list, with the
OIDs ordered the same as in the array. Once we do that, we should expect
the appinfos to appear in the same order as rd_partdesc->oids, and hence
as RelOptInfo::part_oids. We just need to make sure that the order is
preserved and assign part_rels as they appear in that loop.

One could argue that we should preserve the order of OIDs only for
single-level partitioned tables, but in expand_inherited_rtentry(), if
we want to detect whether a relation is single-level or multi-level
partitioned, we need to look up its direct partitions to see if they are
further partitioned. That will look a bit ugly and will not be necessary
once we have 0013.

If we decide to defer the multi-level partitioned table changes to v11,
then, depending on the progress in [1], I will work on fixing the order
in which appinfos are created for single-level partitioned tables.

> Failing that, another idea would be to use
> qsort() or qsort_arg() to put the partitions in the right order.

I didn't get this. I could not find documentation for qsort_arg(). Can
you please elaborate? I guess if we fix expand_inherited_rtentry() we
don't need this. It looks like we will change expand_inherited_rtentry()
anyway.

>
> +    if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
> +        !inhparent ||
> +        !(rel->part_scheme = find_partition_scheme(root, relation)))
>
> Maybe just don't call this function in the first place in the
> !inhparent case, instead of passing down an argument that must always
> be true.

The function serves as a single place to set or reset partitioning
information. It sets the partitioning information if the above three
conditions are met; otherwise it nullifies that information. If we
decide not to call this function when !inhparent, we will need to
nullify the partitioning information outside of this function as well as
inside it, duplicating the code.

>
> +    /* Match the partition key types. */
> +    for (cnt_pks = 0; cnt_pks < partnatts; cnt_pks++)
> +    {
> +        /*
> +         * For types, it suffices to match the type id, mod and collation;
> +         * len, byval and align are dependent on the first two.
> +         */
> +        if (part_key->partopfamily[cnt_pks] != part_scheme->partopfamily[cnt_pks] ||
> +            part_key->partopcintype[cnt_pks] != part_scheme->partopcintype[cnt_pks] ||
> +            part_key->parttypid[cnt_pks] != part_scheme->key_types[cnt_pks] ||
> +            part_key->parttypmod[cnt_pks] != part_scheme->key_typmods[cnt_pks] ||
> +            part_key->parttypcoll[cnt_pks] != part_scheme->key_collations[cnt_pks])
> +            break;
> +    }
>
> I think memcmp() might be better than a for-loop.

Done.

PFA patches.

[1] https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e@lab.ntt.co.jp

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
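
As a rough illustration of the memcmp() suggestion quoted above: when
each per-column property is stored in its own array, the column-by-column
loop can be replaced by one memcmp() per array. The sketch below is not
the code from the attached patches. DemoPartKey, DemoOid, DemoInt32,
DEMO_MAX_KEYS and demo_keys_match() are hypothetical names made up for
this example, and only three of the five properties are modelled to keep
it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's Oid and int32 typedefs. */
typedef unsigned int DemoOid;
typedef int DemoInt32;

#define DEMO_MAX_KEYS 4

/* Hypothetical, simplified partition key: one array per property. */
typedef struct DemoPartKey
{
    int         partnatts;                      /* number of key columns */
    DemoOid     partopfamily[DEMO_MAX_KEYS];    /* operator family per column */
    DemoOid     parttypid[DEMO_MAX_KEYS];       /* type OID per column */
    DemoInt32   parttypmod[DEMO_MAX_KEYS];      /* type modifier per column */
} DemoPartKey;

/*
 * Return true when both keys have the same number of columns and every
 * per-column property matches.  Each array is compared with a single
 * memcmp() over its first partnatts elements.
 */
static bool
demo_keys_match(const DemoPartKey *a, const DemoPartKey *b)
{
    size_t      n;

    if (a->partnatts != b->partnatts)
        return false;

    assert(a->partnatts >= 0 && a->partnatts <= DEMO_MAX_KEYS);
    n = (size_t) a->partnatts;

    return memcmp(a->partopfamily, b->partopfamily, n * sizeof(DemoOid)) == 0 &&
        memcmp(a->parttypid, b->parttypid, n * sizeof(DemoOid)) == 0 &&
        memcmp(a->parttypmod, b->parttypmod, n * sizeof(DemoInt32)) == 0;
}
```

Calling demo_keys_match() on two keys with identical arrays returns
true; changing a single typmod or the column count makes it return
false. Byte-wise comparison is safe here only because the arrays hold
plain integer types without padding, so equal bytes mean equal elements.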