RE: Small performance tweak to run-time partition pruning - Mailing list pgsql-hackers

From Imai, Yoshikazu
Subject RE: Small performance tweak to run-time partition pruning
Date
Msg-id 0F97FA9ABBDBE54F91744A9B37151A511ECD81@g01jpexmbkw24
Whole thread Raw
In response to Re: Small performance tweak to run-time partition pruning  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Tue, Oct 9, 2018 at 2:02 AM, David Rowley wrote:
> Yeah, so subplan_map is just an array that stores the List index of
> the Append or MergeAppend's subplans. When we perform run-time pruning
> during executor initialisation, if we prune away some of these
> subplans then we don't initialise those pruned subplans at all which
> results in missing executor nodes for those plans. Instead of
> maintaining an array to store these with a bunch of NULLs in them to
> represent the pruned subnodes, for performance reasons, we make a
> gapless array and store them in there. This means that for the
> run-time pruning that we could do running actual execution
> (ExecFindMatchingSubPlans), the old subplan_map would be out of date,
> as it would contain the indexes of the subplans as if we'd not pruned
> any.  We can simply not bother adjusting the subplan_map if no further
> pruning is required. This could leave the map pointing to subplans
> that don't exist, but we only need to care about that when the map is
> actually going to be used for something. The good news is, we know in
> advance if the map will be used again.

Thanks for the detail explanation! I got it fully.

> > I saw the patch and it seems good to me about the codes.
> > I still couldn't check additional test cases in the patch.
> >
> >
> > BTW, when I looking the codes, I thought there is further improvements in
> > ExecInitAppend which is described below.
> >
> >         j = i = 0;
> >         firstvalid = nplans;
> >         foreach(lc, node->appendplans)
> >         {
> >                 if (bms_is_member(i, validsubplans))
> >                 {
> >                         Plan       *initNode = (Plan *) lfirst(lc);
> >
> >                         /*
> >                          * Record the lowest appendplans index which is a valid partial
> >                          * plan.
> >                          */
> >                         if (i >= node->first_partial_plan && j < firstvalid)
> >                                 firstvalid = j;
> >
> >                         appendplanstates[j++] = ExecInitNode(initNode, estate, eflags);
> >                 }
> >                 i++;
> >         }
> >
> > If number of valid subplans is few compared to node->appendplans, it is a waste to check
> > bms_is_member(i, validsubplans) for all node->appendplans.
> > Of course, node->appendplans is List struct and we have to loop it to find valid plan,
> > that we can't simply use while(bms_next_member(i, validsubplans)) loop.
> > I don't have any good idea for it now, but can we improve it?
> 
> 
> 
> I do have other ideas for that but not ready to share code yet, but it
> basically requires reimplementing List to use arrays as their
> underlying implementation. This will allow the bms_next_member() type
> loop and list_nth() will be O(1) instead of O(N).

Great! 
So there might be little gain in using memory, but we get large performance improvement.

> It might be possible to squeeze a bit more performance out of this
> code with the current List implementation, but I it might actually
> slow performance in some cases, for example, if the only surviving
> partition was one of the last ones in the List.  Getting the final
> element with list_nth() is optimized, but if you consider a
> time-based, say monthly, RANGE partition, a DBA might maintain a
> partition ready for the next month of data, so it might be very common
> to access the 2nd last element in the list for all queries looking at
> "this months" data. In that case, list_nth(), in its current form, is
> as slow as can be.  

I understand accessing 2nd last element is slow in current List implementation,
but I don't know your new implementation is also slow in that case.
I don't know I might misunderstood "squeeze ~ out of ~ with ~" sentence.


> You'd also need to do a bms_num_members() or
> bms_get_singleton_member(), in order to decide if the alternative
> method is going to be any faster. That call is not going to be free.

Yeah, I need to use bms_num_members() or bms_get_singleton_member() to
check number of valid subplans is enough smaller than number of all append plans
to check alternative method is faster.

> It might also be possible to form the loop so that it calls
> bms_next_member() then store the result and loop until we reach that
> number. That would only save the bms_is_member call per loop, but I'd
> much rather do the array idea as it should also speed up plenty of
> other things.

Actually, it is possible refactor that loop with the method you described,
but it would be complex and hacky codes for only saving the wasting loop.

So, I also like your array idea.

--
Yoshikazu Imai


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pgsql: Fix event triggers for partitioned tables
Next
From: Michael Paquier
Date:
Subject: Re: Refactor textToQualifiedNameList()