Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CA+TgmoYHQJzAJdSnk-r6povk5vfE1RX8J+4OGVr_5rf1NP4e6Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Tue, Mar 14, 2017 at 8:04 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> In 0001, you've removed a comment about how GEQO needs special
>> handling, but it doesn't look as if you've made any compensating
>> change elsewhere.  That seems unlikely to be correct.  If GEQO needs
>> some paths to survive longer than others, how can it be right for this
>> code to create them all in the same context?
>
> Thanks for pointing that out. I have replaced the code and the
> comments back. There was another issue that the temporary paths
> created by geqo will not be freed when geqo moves one genetic string
> to next genetic string (or what it calls as tour: an list of relation
> to be joined in a given order). To fix this, we need to set the path
> context to the temporary context of geqo inside geqo_eval() before
> calling gimme_tree() and reset it later. That way the temporary paths
> are also created in the temporary memory context of geqo. Fixed in the
> patch.

Yeah, that looks better.

>> Incidentally,
>> geqo_eval() seems to be an existing precedent for the idea of throwing
>> away paths and RelOptInfos, so we might want to use similar code for
>> partitionwise join.
>
> There are some differences in what geqo does and what partition-wise
> needs to do. geqo tries many joining orders each one in a separate
> temporary context. The way geqo slices the work, every slice produces
> a full plan. For partition-wise join I do not see a way to slice the
> work such that the whole path and corresponding RelOptInfos come from
> the same slice. So, we can't use the same method as GEQO.

What I was thinking about was the use of this technique for getting
rid of joinrels:
   root->join_rel_list = list_truncate(root->join_rel_list,                                       savelength);
root->join_rel_hash= savehash;
 

makePathNode() serves to segregate paths into a separate memory
context that can then be destroyed, but as you point out, the path
lists are still hanging around, and so are the RelOptInfo nodes.  It
seems to me we could do a lot better using this technique.  Suppose we
jigger things so that the List objects created by add_path go into
path_cxt, and so that RelOptInfo nodes also go into path_cxt.  Then
when we blow up path_cxt we won't have dangling pointers in the
RelOptInfo objects any more because the RelOptInfos themselves will be
gone.  The only problem is that the join_rel_list (and join_rel_hash
if it exists) will be corrupt, but we can fix that using the technique
demonstrated above.

Of course, that supposes that 0009 can manage to postpone creating
non-sampled child joinrels until create_partition_join_plan(), which
it currently doesn't.  In fact, unless I'm missing something, 0009
hasn't been even slightly adapted to take advantage of the
infrastructure in 0001; it doesn't seem to reset the path_cxt or
anything.  That seems like a fairly major omission.

Incidentally, I committed 0002, 0003, and 0005 as a single commit with
a few tweaks; I think you may need to do a bit of rebasing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Use asynchronous connect API inlibpqwalreceiver
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables