Re: [Proposal] Table partition + join pushdown - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [Proposal] Table partition + join pushdown |
Date | |
Msg-id | 20151008.190410.219381863.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [Proposal] Table partition + join pushdown (Taiki Kondo <tai-kondo@yk.jp.nec.com>) |
Responses |
Re: [Proposal] Table partition + join pushdown
|
List | pgsql-hackers |
Hello, thank you for the example. I could see this patch working with it. > > In make_restrictinfos_from_check_constr, the function returns > > modified constraint predicates correspond to vars under > > hashjoinable join clauses. I don't think expression_tree_mutator > > is suitable to do that since it could allow unwanted result when > > constraint predicates or join clauses are not simple OpExpr's. > > Do you have any example of this situation? As a rather simple case on the test environment made by the provided script, the following query, explain analyze select data_x, data_y, num from check_test_div join inner_t on check_test_div.id + 1 = inner_t.id; Makes the mutation fail then result in an assertion failure. | TRAP: FailedAssertion("!(list_length(check_constr) == list_length(result))", File: "joinpath.c", Line: 1608) This is because both 'check_test_div.id + 1' and inner_t.id don't match the var-side of the constraints. I don't see clearly what to do for the situation for now but this is the one of the most important function for this feature and should be cleanly designed. At Thu, 8 Oct 2015 08:28:04 +0000, Taiki Kondo <tai-kondo@yk.jp.nec.com> wrote in <12A9442FBAE80D4E8953883E0B84E0885F9913@BPXM01GP.gisp.nec.co.jp> > Hello, Horiguchi-san. > > Thank you for your comment. > > > I got some warning on compilation on unused variables and wrong > > arguemtn type. > > OK, I'll fix it. > > > I failed to have a query that this patch works on. Could you let > > me have some specific example for this patch? > > Please find attached. > And also make sure that setting of work_mem is '64kB' (not 64MB). > > If there is the large work_mem enough to create hash table for > relation after appending, its cost may be better than pushed-down > plan's cost, then planner will not choose pushed-down plan this patch makes. > So, to make this patch working fine, work_mem size must be smaller than > the hash table size for relation after appending. > > > This patch needs more comments. Please put comment about not only > > what it does but also the reason and other things for it. > > OK, I'll put more comments in the code. > But it will take a long time, maybe... Sure. > > -- about namings > > > > Names for functions and variables are needed to be more > > appropriate, in other words, needed to be what properly informs > > what they are. The followings are the examples of such names. > > Thank you for your suggestion. > > I also think these names are not good much. > I'll try to make names better , but it maybe take a long time... > Of course, I will use your suggestion as reference. Thanks. > > "added_restrictlist"'s widely distributed as many function > > arguemnts and JoinPathExtraData makes me feel > > dizzy.. > > "added_restrictinfo" will be deleted from almost functions > other than try_join_pushdown() in next (v2) patch because > the place of filtering using this info will be changed > from Join node to Scan node and not have to place it > into other than try_join_pushdown(). I'm looking forward to see it. > > In make_restrictinfos_from_check_constr, the function returns > > modified constraint predicates correspond to vars under > > hashjoinable join clauses. I don't think expression_tree_mutator > > is suitable to do that since it could allow unwanted result when > > constraint predicates or join clauses are not simple OpExpr's. > > Do you have any example of this situation? > I am trying to find unwanted results you mentioned, but I don't have > any results in this timing. I have a hunch that it will allow unwanted > results because I have thought only about very simple situation for > this function. As mentioned above. > > Otherwise could you give me clear explanation on what it does? > > This function transfers CHECK() constraint to filter expression by following > procedures. > (1) Get outer table's CHECK() constraint by using get_relation_constraints(). > (2) Walk through expression tree got in (1) by using expression_tree_mutator() > with check_constraint_mutator() and change only outer's Var node to > inner's one according to join clause. > > For example, when CHECK() constraint of table A is "num % 4 = 0" and > join clause between table A and B is "A.num = B.data", > then we can get "B.data % 4 = 0" for filtering purpose. > > This also accepts more complex join clause like "A.num = B.data * 2", > then we can get "(B.data * 2) % 4 = 0". > > In procedure (2), to decide whether to use each join clause for changing > Var node or not, I implement check_constraint_mutator() to judge whether > join clause is hash-joinable or not. Thanks for the explanation. I think that the function has been made considering only rather plain calses. We should put more thought to making the logic clearer so that we can define the desired/possible capability and limitations clearly. > Actually, I want to judge whether OpExpr as top expression tree of > join clause means "=" or not, but I can't find how to do it. > > If you know how to do it, please let me know. I don't see for now, too :p But we at least should put more consideration on the mechanism to obtain the expressions. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: