Re: [Proposal] Table partition + join pushdown - Mailing list pgsql-hackers

From Taiki Kondo
Subject Re: [Proposal] Table partition + join pushdown
Date
Msg-id 12A9442FBAE80D4E8953883E0B84E0885F9913@BPXM01GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: [Proposal] Table partition + join pushdown  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [Proposal] Table partition + join pushdown
Re: [Proposal] Table partition + join pushdown
List pgsql-hackers
Hello, Horiguchi-san.

Thank you for your comment.

> I got some warning on compilation on unused variables and wrong
> arguemtn type.

OK, I'll fix it.

> I failed to have a query that this patch works on. Could you let
> me have some specific example for this patch?

Please find attached.
And also make sure that setting of work_mem is '64kB' (not 64MB).

If there is the large work_mem enough to create hash table for
relation after appending, its cost may be better than pushed-down
plan's cost, then planner will not choose pushed-down plan this patch makes.
So, to make this patch working fine, work_mem size must be smaller than
the hash table size for relation after appending.

> This patch needs more comments. Please put comment about not only
> what it does but also the reason and other things for it.

OK, I'll put more comments in the code.
But it will take a long time, maybe...


> -- about namings
>
> Names for functions and variables are needed to be more
> appropriate, in other words, needed to be what properly informs
> what they are. The followings are the examples of such names.

Thank you for your suggestion.

I also think these names are not good much.
I'll try to make names better , but it maybe take a long time...
Of course, I will use your suggestion as reference.

> "added_restrictlist"'s widely distributed as many function
> arguemnts and JoinPathExtraData makes me feel
> dizzy..

"added_restrictinfo" will be deleted from almost functions
other than try_join_pushdown() in next (v2) patch because
the place of filtering using this info will be changed
from Join node to Scan node and not have to place it
into other than try_join_pushdown().


> In make_restrictinfos_from_check_constr, the function returns
> modified constraint predicates correspond to vars under
> hashjoinable join clauses. I don't think expression_tree_mutator
> is suitable to do that since it could allow unwanted result when
> constraint predicates or join clauses are not simple OpExpr's.

Do you have any example of this situation?
I am trying to find unwanted results you mentioned, but I don't have
any results in this timing. I have a hunch that it will allow unwanted
results because I have thought only about very simple situation for
this function.


> Otherwise could you give me clear explanation on what it does?

This function transfers CHECK() constraint to filter expression by following
procedures.
(1) Get outer table's CHECK() constraint by using get_relation_constraints().
(2) Walk through expression tree got in (1) by using expression_tree_mutator()
    with check_constraint_mutator() and change only outer's Var node to
    inner's one according to join clause.

For example, when CHECK() constraint of table A is "num % 4 = 0" and
join clause between table A and B is "A.num = B.data",
then we can get "B.data % 4 = 0" for filtering purpose.

This also accepts more complex join clause like "A.num = B.data * 2",
then we can get "(B.data * 2) % 4 = 0".

In procedure (2), to decide whether to use each join clause for changing
Var node or not, I implement check_constraint_mutator() to judge whether
join clause is hash-joinable or not.

Actually, I want to judge whether OpExpr as top expression tree of
join clause means "=" or not, but I can't find how to do it.

If you know how to do it, please let me know.



Best regards,
--
Taiki Kondo

NEC Solution Innovators, Ltd.






-----Original Message-----
From: Kyotaro HORIGUCHI [mailto:horiguchi.kyotaro@lab.ntt.co.jp]
Sent: Tuesday, October 06, 2015 8:35 PM
To: tai-kondo@yk.jp.nec.com
Cc: kaigai@ak.jp.nec.com; aki-iwaasa@vt.jp.nec.com; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Proposal] Table partition + join pushdown

Hello.

I tried to read this and had some random comments on this.

-- general

I got some warning on compilation on unused variables and wrong arguemtn type.

I failed to have a query that this patch works on. Could you let me have some specific example for this patch?

This patch needs more comments. Please put comment about not only what it does but also the reason and other things for
it.


-- about namings

Names for functions and variables are needed to be more appropriate, in other words, needed to be what properly informs
whatthey are. The followings are the examples of such names. 

"added_restrictlist"'s widely distributed as many function arguemnts and JoinPathExtraData makes me feel dizzy..
create_mergejoin_pathtakes it as "filtering_clauses", which looks far better. 

try_join_pushdown() is also the name with much wider meaning. This patch tries to move hashjoins on inheritance parent
tounder append paths. It could be generically called 'pushdown' 
but this would be better be called such like 'transform appended hashjoin' or 'hashjoin distribution'. The latter would
bebetter. 
(The function name would be try_distribute_hashjoin for the
case.)

The name make_restrictinfos_from_check_contr() also tells me wrong thing. For example,
extract_constraints_for_hashjoin_distribution() would inform me about what it returns.


-- about what make_restrictinfos_from_check_constr() does

In make_restrictinfos_from_check_constr, the function returns modified constraint predicates correspond to vars under
hashjoinablejoin clauses. I don't think expression_tree_mutator is suitable to do that since it could allow unwanted
resultwhen constraint predicates or join clauses are not simple OpExpr's. 

Could you try more simple and straight-forward way to do that?
Otherwise could you give me clear explanation on what it does?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center





Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Next
From: Amir Rohan
Date:
Subject: Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.