Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join |
Date | |
Msg-id | CAFjFpRdjYWqbhL8L36o4NxZccx_ok7SnKdBfr++Dn4Tmvu35NQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join (Dmitry Dolgov <9erthalion6@gmail.com>) |
Responses |
Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
|
List | pgsql-hackers |
Thanks for your review comments. They are helpful. On Thu, Mar 29, 2018 at 2:53 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote: > But actually you also mentioned > another topic that bothers me about this patch. Different parts of the > algorithm implementation (at least for functions that build maps of matching > partitions) are quite dependent on each other in terms of shared state. At > first glance in `partition_range_bounds_merge` we have about a dozen of > variables of different mutability level, that affect the control flow: > > outer_lb_index > inner_lb_index > merged > merged_index > overlap > merged_lb > merged_ub > finished_outer > finished_inner > ub_cmpval > lb_cmpval > inner_has_default > outer_has_default > jointype > > It looks a bit too much for me, and would require commentaries like "if you > changed the logic here, also take a look there". But I'm not saying that I have > any specific suggestions how to change it (although I'll definitely try to do > so, at least to get some better understanding of the underlying algorithm). I am working on commenting portions of the code to make it more clear and readable. Will update the patches with the comments soon, mostly this Monday. > > Just to make myself clear, I wanted to suggest not to change the commentary for > `IS_PARTITIONED_REL` significantly, but just add a sentence that you need to > check if given relation is fully set up. For that we have REL_HAS_ALL_PART_PROPS. IS_PARTITIONED_REL is really checking whether a relation is partitioned and following comment makes it clear why we have to check all those things. So, I still don't see why we need to change the comment there. * It's not enough to test whether rel->part_scheme is set, because it might * be that the basic partitioning properties of the input relations matched * but the partition bounds did not. > Also, few more random notes (mostly related to readability, since I found some > parts of the patch hard to read, but of course it's arguable). > > ``` > PartitionRangeBound outer_lb; > PartitionRangeBound outer_ub; > PartitionRangeBound inner_lb; > PartitionRangeBound inner_ub; > PartitionRangeBound *merged_lb = NULL; > PartitionRangeBound *merged_ub = NULL; > ``` > > Maybe it would be better to not repeat the type here? Let's say: > > ``` > PartitionRangeBound outer_lb, > outer_ub, PostgreSQL code uses both the styles, but I like declarations with the type is right there along side the variable. It also makes it easy to delete the variable or modify its type, without causing unnecessary diff on adjacent lines. > ``` > partition_range_bounds_merge(int partnatts, FmgrInfo *partsupfuncs, > Oid *partcollations, PartitionBoundInfo outer_bi, > int outer_nparts, PartitionBoundInfo inner_bi, > int inner_nparts, JoinType jointype, > List **outer_parts, List **inner_parts) > ``` > > From what I see in `partition.c` there are a lot functions that accept > `partnatts`, `partcollations` only to pass it down to, e.g. > `partition_rbound_cmp`. > What do you think about introducing a data structure to keep these arguments, > and pass only an instance of this structure instead? I have discussed this in an adjacent thread, but let me repeat it here. Those members come from PartitionKey or PartitionScheme. If we pack those in a structure and not include it in those two structures,we will just create a structure before calling the function and unpack those inside the comparison function e.g. partition_rbound_cmp(). We could argue that that will benefit intermediate functions like partition_range_bound_cmp(), which just pass those values down, but there is all the possibility that its future caller may not have that packing structure available readily. So, I am inclined not to add a new structure just for this. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
pgsql-hackers by date: