Re: speeding up planning with partitions - Mailing list pgsql-hackers

From Amit Langote
Subject Re: speeding up planning with partitions
Msg-id af3da198-e4aa-0535-8b13-40b02a4a7c41@lab.ntt.co.jp
In response to Re: speeding up planning with partitions  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On 2019/03/05 19:25, Amit Langote wrote:
> On 2019/03/04 19:38, Amit Langote wrote:
>> 2. Defer inheritance expansion to add_other_rels_to_query().  ...
>>
>> Also, delaying adding children also affects adding junk columns to the
>> query's targetlist based on PlanRowMarks, because preprocess_targetlist
>> can no longer finalize which junk columns to add for a "parent"
>> PlanRowMark; that must be delayed until all child PlanRowMarks are added
>> and their allMarkTypes propagated to the parent PlanRowMark.
> 
> I thought more on this and started wondering why we can't call
> preprocess_targetlist() from query_planner() instead of from
> grouping_planner()?  We don't have to treat parent row marks specially if
> preprocess_targetlist() is called after adding other rels (and hence all
> child row marks).  This will change the order in which expressions are
> added to baserels targetlists and hence the order of expressions in their
> Path's targetlist, because the expressions contained in targetlist
> (including RETURNING) and other junk expressions will be added after
> expressions referenced in WHERE clauses, whereas the order is reverse
> today.  But if we do what we propose above, the order will be uniform for
> all cases, that is, not one for regular table baserels and another for
> inherited table baserels.

I wrestled with this idea a bit and concluded that we don't have to
postpone *all* of preprocess_targetlist() processing to query_planner(),
only the part that adds row mark "junk" Vars, because only those matter
for the problem being solved.  To recap, the problem is that delaying
adding inheritance children (and hence their row marks if any) means we
can't add "junk" columns needed to implement row marks, because which ones
to add is not clear until we've seen all the children.

I propose that we delay the addition of row mark "junk" Vars to
query_planner(), so that adding them no longer stands in the way of also
deferring inheritance expansion to query_planner().  That means the order
of reltarget expressions for row-marked rels will change, but I assume
that's OK.  At least it will be consistent for both non-inherited baserels
and inherited ones.
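
To make the ordering problem concrete, here is a small standalone sketch
(not code from the patches).  ToyRowMark, JunkCols and the toy_* functions
are hypothetical stand-ins for PlanRowMark and the junk-column logic in
preprocess_targetlist(); the enum only mirrors the names of the real
RowMarkType values.  It assumes, as a simplification, that a parent's
allMarkTypes is the OR of (1 << markType) over itself and its children,
and that the set of junk columns is derived from that bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the planner's RowMarkType enum. */
typedef enum
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

/* Hypothetical, trimmed-down PlanRowMark. */
typedef struct ToyRowMark
{
    RowMarkType markType;     /* this rel's own mark type */
    int         allMarkTypes; /* OR of (1 << markType) for self and children */
    bool        isParent;     /* true for an inheritance parent */
} ToyRowMark;

/* Which junk columns the row mark needs in the targetlist. */
typedef struct JunkCols
{
    bool ctid;     /* needed if any rel uses a mark type other than COPY */
    bool wholerow; /* needed if any rel uses ROW_MARK_COPY */
    bool tableoid; /* needed to tell children apart under a parent */
} JunkCols;

static inline void
toy_init_rowmark(ToyRowMark *rc, RowMarkType type, bool isParent)
{
    assert(rc != NULL);
    rc->markType = type;
    rc->allMarkTypes = 1 << type;
    rc->isParent = isParent;
}

/* Propagate a child's mark types up to its parent. */
static inline void
toy_add_child_rowmark(ToyRowMark *parent, const ToyRowMark *child)
{
    assert(parent != NULL && child != NULL);
    parent->allMarkTypes |= child->allMarkTypes;
}

/* Decide junk columns from whatever allMarkTypes holds right now. */
static inline JunkCols
toy_junk_cols(const ToyRowMark *rc)
{
    JunkCols cols;

    cols.ctid = (rc->allMarkTypes & ~(1 << ROW_MARK_COPY)) != 0;
    cols.wholerow = (rc->allMarkTypes & (1 << ROW_MARK_COPY)) != 0;
    cols.tableoid = rc->isParent;
    return cols;
}
```

If toy_junk_cols() is called on a parent before a ROW_MARK_COPY child has
been added, it does not ask for the whole-row column; called after
toy_add_child_rowmark(), it does.  That is the reason the junk Vars have
to be added after add_other_rels_to_query() once expansion is deferred.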

Attached is an updated version of the patches, with the above proposal
implemented by patch 0002.  To summarize, the patches are as follows:

0001: make building of "other rels" a separate step that runs after
deconstruct_jointree(), implemented by a new subroutine of query_planner
named add_other_rels_to_query()

0002: delay adding "junk" Vars until after add_other_rels_to_query()

0003: delay inheritance expansion to add_other_rels_to_query()

0004, 0005: adjust inheritance_planner() to account for the fact that
inheritance is now expanded by query_planner(), not subquery_planner()

0006: perform partition pruning while inheritance is being expanded,
instead of during set_append_rel_size()

0007: add a 'live_parts' field to RelOptInfo to store partition indexes
(not RT indexes) of unpruned partitions, which speeds up looping over
part_rels array of the partitioned parent

0008: avoid copying PartitionBoundInfo struct for planning
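
As a rough illustration of the idea behind 0007 (again, not code from the
patch), the sketch below keeps a set of partition indexes and visits only
those, instead of walking every slot of the parent's array.  ToyRel,
part_rows and the toy_* functions are hypothetical, and a plain 64-bit
mask is only an assumption standing in for whatever set representation
the patch actually uses, so this toy is limited to 64 partitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PARTS 64

/* Hypothetical, trimmed-down partitioned parent rel. */
typedef struct ToyRel
{
    int         nparts;     /* total number of partitions */
    const long *part_rows;  /* stand-in for part_rels: one value per partition */
    uint64_t    live_parts; /* bit i set => partition index i was not pruned */
} ToyRel;

/* Record that the partition at index partidx survived pruning. */
static inline void
toy_mark_live(ToyRel *rel, int partidx)
{
    assert(rel->nparts <= TOY_MAX_PARTS);
    assert(partidx >= 0 && partidx < rel->nparts);
    rel->live_parts |= UINT64_C(1) << partidx;
}

/* Index of the lowest set bit; the caller guarantees x != 0. */
static inline int
toy_lowest_bit_index(uint64_t x)
{
    int idx = 0;

    assert(x != 0);
    while ((x & 1) == 0)
    {
        x >>= 1;
        idx++;
    }
    return idx;
}

/*
 * Sum part_rows over unpruned partitions only.  The outer loop runs once
 * per live partition, not once per partition.  Returns the number of
 * partitions visited.
 */
static inline int
toy_sum_live(const ToyRel *rel, long *sum)
{
    uint64_t remaining = rel->live_parts;
    int      visited = 0;

    *sum = 0;
    while (remaining != 0)
    {
        int idx = toy_lowest_bit_index(remaining);

        *sum += rel->part_rows[idx];
        remaining &= remaining - 1; /* clear the lowest set bit */
        visited++;
    }
    return visited;
}
```

The set holds partition indexes (positions in the parent's array), not RT
indexes, which is what lets the loop index straight into the array.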

Thanks,
Amit
