RE: speeding up planning with partitions - Mailing list pgsql-hackers

From Imai, Yoshikazu
Subject RE: speeding up planning with partitions
Date
Msg-id 0F97FA9ABBDBE54F91744A9B37151A5129E604@g01jpexmbkw24
Whole thread Raw
In response to Re: speeding up planning with partitions  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses RE: speeding up planning with partitions
Re: speeding up planning with partitions
List pgsql-hackers
Amit-san,

On Thu, Mar 14, 2019 at 9:04 AM, Amit Langote wrote:
> > 0002:
> > * I don't really know that delaying adding resjunk output columns to
> the plan doesn't affect any process in the planner. From the comments
> of PlanRowMark, those columns are used in only the executor so I think
> delaying adding junk Vars wouldn't be harm, is that right?
> 
> I think so.  The only visible change in behavior is that "rowmark" junk
> columns are now placed later in the final plan's targetlist.

ok, thanks.

> > 0003:
> > * Is there need to do CreatePartitionDirectory() if rte->inh becomes
> false?
> >
> > +        else if (rte->rtekind == RTE_RELATION && rte->inh)
> > +        {
> > +            rte->inh = has_subclass(rte->relid);
> > +
> > +            /*
> > +             * While at it, initialize the PartitionDesc
> infrastructure for
> > +             * this query, if not done yet.
> > +             */
> > +            if (root->glob->partition_directory == NULL)
> > +                root->glob->partition_directory =
> > +
>     CreatePartitionDirectory(CurrentMemoryContext);
> > +        }
> 
> We need to have create "partition directory" before we access a
> partitioned table's PartitionDesc for planning.  These days, we rely
> solely on PartitionDesc to determine a partitioned table's children.  So,
> we need to see the PartitionDesc before we can tell there are zero children
> and set inh to false.  In other words, we need the "partition directory"
> to have been set up in advance.

Ah, I see.

> > 0004:
> > * In addition to commit message, this patch also changes the method
> of adjusting AppendRelInfos which reference to the subquery RTEs, and
> new one seems logically correct.
> 
> Do you mean I should mention it in the patch's commit message?

Actually I firstly thought that it's better to mention it in the patch's commit message, but I didn't mean that here. I
onlywanted to state that I confirmed new method of adjusting AppendRelInfos seems logically correct :)
 

> > * In inheritance_planner(), we do ChangeVarNodes() only for orig_rtable,
> so the codes concatenating lists of append_rel_list may be able to be
> moved before doing ChangeVarNodes() and then the codes concatenating
> lists of rowmarks, rtable and append_rel_list can be in one block (or
> one bunch).
> 
> Yeah, perhaps.  I'll check.
> 
> On 2019/03/14 17:35, Imai, Yoshikazu wrote:> Amit-san,
> > I have done code review of v31 patches from 0004 to 0008.
> >
> > 0004:
> > * s/childern/children
> 
> Will fix.
> 
> > 0005:
> > * This seems reasonable for not using a lot of memory in specific
> > case, although it needs special looking of planner experts.
> 
> Certainly.

Thanks for these.

> > 0006:
> > * The codes initializing/setting RelOptInfo's part_rels looks like a
> > bit complicated, but I didn't come up with any good design so far.
> 
> I guess you mean in add_appendrel_other_rels().  The other option is not
> have that code there and expand partitioned tables freshly for every
> target child, which we got rid of in patch 0004.  But we don't want to
> do that.

Yeah, that's right.

> > 0007:
> > * This changes some processes using "for loop" to using
> > "while(bms_next_member())" which speeds up processing when we scan few
> > partitions in one statement, but when we scan a lot of partitions in
> > one statement, its performance will likely degraded. I measured the
> > performance of both cases.
> > I executed select statement to the table which has 4096 partitions.
> >
> > [scanning 1 partition]
> > Without 0007 : 3,450 TPS
> > With 0007    : 3,723 TPS
> >
> > [scanning 4096 partitions]
> > Without 0007 : 10.8 TPS
> > With 0007    : 10.5 TPS
> >
> > In the above result, performance degrades 3% in case of scanning 4096
> > partitions compared before and after applying 0007 patch. I think when
> > scanning a lot of tables, executor time would be also longer, so the
> > increasement of planner time would be relatively smaller than it. So
> > we might not have to care this performance degradation.
> 
> Well, live_parts bitmapset is optimized for queries scanning only few
> partitions.  It's true that it's slightly more expensive than a simple
> for loop over part_rels, but not too much as you're also saying.

Thanks for the comments.

--
Yoshikazu Imai

pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: RE: Timeout parameters
Next
From: Michael Paquier
Date:
Subject: Re: Offline enabling/disabling of data checksums