Re: Allowing join removals for more join types - Mailing list pgsql-hackers

From David Rowley
Subject Re: Allowing join removals for more join types
Date
Msg-id CAApHDvoTeUGH34PrWNHtQT_FABtU95FiNL-Sq6UXbsJXW09f0w@mail.gmail.com
In response to Re: Allowing join removals for more join types  (Dilip kumar <dilip.kumar@huawei.com>)
Responses Re: Allowing join removals for more join types
List pgsql-hackers
On Fri, May 23, 2014 at 8:28 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:

On 23 May 2014 12:43 David Rowley Wrote,

 

>I'm hitting a bit of a roadblock on point 1. Here's a snippet from my latest attempt:

 

>    if (bms_membership(innerrel->relids) == BMS_SINGLETON)
>    {
>        int subqueryrelid = bms_singleton_member(innerrel->relids);
>        RelOptInfo *subqueryrel = find_base_rel(innerrel->subroot, subqueryrelid);
>
>        if (relation_has_unique_index_for(root, subqueryrel, clause_list, NIL, NIL))
>            return true;
>    }

 

>But it seems that innerrel->subroot is still NULL at this stage of planning. From what I can tell, it does not exist anywhere else yet, and it is not generated until make_one_rel() is called from query_planner().

 

>Am I missing something major here, or does this sound about right?

 

It’s true that, up to this point, we haven’t prepared the base relation list for the subquery; that is done from make_one_rel while generating the SUBQUERY path list.

 

I can think of one solution but I think it will be messy…

 

We could get the base relation info directly from the subquery. Your patch currently gets the DISTINCT and GROUP BY clauses from the sub-Query (see the snippet below); similarly, we could get the base relation info from Query->jointree.

 

    if (innerrel->rtekind == RTE_SUBQUERY)
    {
        Query *query = root->simple_rte_array[innerrelid]->subquery;

        if (sortclause_is_unique_on_restrictinfo(query, clause_list, query->groupClause) ||
            sortclause_is_unique_on_restrictinfo(query, clause_list, query->distinctClause))
            return true;
    }
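
To spell out the idea behind that check: a GROUP BY or DISTINCT subquery returns at most one row per combination of its grouping columns, so if the join clauses pin down every one of those columns, the join can match at most one row. Below is a minimal sketch of that subset test in standalone C, not planner code. ColumnSet, column_set_add and unique_cols_covered_by_join are made-up names for illustration only, and the sketch ignores the operator-compatibility checks a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for planner data: a set of subquery output column
 * numbers (0..63), stored as a bitmask. Not a PostgreSQL type.
 */
typedef uint64_t ColumnSet;

/* Return 'set' with column 'colno' added. */
static ColumnSet
column_set_add(ColumnSet set, int colno)
{
    assert(colno >= 0 && colno < 64);
    return set | ((ColumnSet) 1 << colno);
}

/*
 * True when every column in unique_cols (e.g. the GROUP BY or DISTINCT
 * columns) also appears in join_cols (columns equated by the join clauses).
 * An empty unique_cols is conservatively treated as "no uniqueness proof".
 */
static bool
unique_cols_covered_by_join(ColumnSet unique_cols, ColumnSet join_cols)
{
    if (unique_cols == 0)
        return false;
    return (unique_cols & ~join_cols) == 0;
}
```

Extra join columns beyond the grouping columns do no harm; a grouping column missing from the join clauses is what breaks the proof.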


I'm getting the idea that this is just not the right place in planning to do this for subqueries.
You seem to be right about the messy part too.

Here's a copy and paste of the kludge I've ended up with while testing this out:

    if (list_length(subquery->jointree->fromlist) == 1)
    {
        RangeTblEntry *base_rte;
        RelOptInfo *subqueryrelid;
        RangeTblRef *rtr = (RangeTblRef *) linitial(subquery->jointree->fromlist);

        /* Only a plain range table reference qualifies, not a JoinExpr etc. */
        if (!IsA(rtr, RangeTblRef))
            return false;

        /* The referenced entry must be an ordinary relation. */
        base_rte = rt_fetch(rtr->rtindex, subquery->rtable);
        if (base_rte->rtekind != RTE_RELATION)
            return false;

        subqueryrelid = build_simple_rel(<would have to fake this>, rtr->rtindex, RELOPT_BASEREL);

I don't have a PlannerInfo to pass to build_simple_rel(), and it just seems like a horrid hack to create one that we're not going to keep.
Plus, it would be a real shame to have to call build_simple_rel() for the same relation again when we plan the subquery later.
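
For what it's worth, the shape test at the top of that kludge (exactly one FROM item, which is a plain reference to an ordinary relation) needs no PlannerInfo at all; only the build_simple_rel() step does. Here is a rough standalone sketch of just that test. Every Toy* type and toy_single_base_rel are simplified stand-ins invented for illustration and are not the real Query, RangeTblEntry or RangeTblRef structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for parse tree nodes. */
typedef enum { TOY_RTE_RELATION, TOY_RTE_SUBQUERY, TOY_RTE_JOIN } ToyRteKind;
typedef enum { TOY_NODE_RANGETBLREF, TOY_NODE_JOINEXPR } ToyNodeTag;

typedef struct { ToyNodeTag tag; int rtindex; } ToyFromItem;
typedef struct { ToyRteKind rtekind; } ToyRangeTblEntry;

typedef struct
{
    const ToyFromItem *fromlist;        /* items in the FROM list */
    int fromlist_len;
    const ToyRangeTblEntry *rtable;     /* range table, indexed from 1 */
    int rtable_len;
} ToyQuery;

/*
 * Return the 1-based range table index of the query's sole plain relation,
 * or 0 if the FROM list is not exactly one reference to such a relation.
 */
static int
toy_single_base_rel(const ToyQuery *q)
{
    const ToyFromItem *item;

    assert(q != NULL);
    if (q->fromlist_len != 1)
        return 0;

    item = &q->fromlist[0];
    if (item->tag != TOY_NODE_RANGETBLREF)
        return 0;
    if (item->rtindex < 1 || item->rtindex > q->rtable_len)
        return 0;
    if (q->rtable[item->rtindex - 1].rtekind != TOY_RTE_RELATION)
        return 0;

    return item->rtindex;
}
```

The expensive part, building a RelOptInfo to ask about unique indexes, is exactly what this sketch leaves out.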

I'm getting the idea that looking for unique indexes on the subquery is not worth the hassle for now. Don't get me wrong, they'd be nice to have, but I think it's a less common use case, and such subqueries are more likely to have been pulled up anyway.
 
Unless there's a better way, I think I'm going to spend the time looking into inner joins instead.

Regards

David Rowley

 

 

Regards,

Dilip

